Add a calculated column to a dataset

Availability

Users, Editors, and Admins

All accounts

Feature exclusive to select subscription plans

Calculated columns let you create new values in your dataset by applying custom logic or transformations to your existing data. This is especially useful when you need to:

Normalize or reformat values for comparison or visualization
Combine multiple columns into a single result
Apply conditional logic (e.g., flagging results based on thresholds)
Perform mathematical operations that aren't available by default

Instead of modifying your source data, calculated columns give you a flexible way to enrich it — right inside the dataset editor.

Add a calculated column to a dataset

Navigate to the Datasets page and find the dataset you want to update.
Click the down arrow () next to the dataset name and select Edit data.
In the dataset editing mode, click the + bar at the far right of the table to open the formula input field above the dataset.
Enter your formula using supported functions, operators, column references, and values.
Click Run to preview how the new column will look across all rows.
If everything looks good, click Save in the top right corner to apply your changes.

Helpful shortcuts

To speed up formula building and reduce the chance of errors, use the following keyboard shortcuts:

Ctrl + Space – Opens a list of all supported functions you can use in formulas.
$ – Displays a dropdown of available columns you can insert as parameters.

These shortcuts make it easier to build accurate formulas without needing to memorize every function or column name.

Formula validation and supported inputs

The formula field only accepts specific input types. If your formula contains unsupported elements or syntax errors, an error message will be displayed. You’ll need to correct the formula before you can run or save it.

Supported inputs include:

Functions
Operators
Numeric values* (e.g., 10, 3.14)
Text values, enclosed in single quotes (e.g., 'Completed')
Column references, using the $ symbol (e.g., $Status), and double quotes ($"Project-Name") if the column name includes spaces or special characters

* In dataset columns, logical values are automatically cast as numbers — true becomes 1, and false becomes 0

Any input outside of these types will be flagged as invalid.

Caution: Certain characters — such as parentheses, commas, or symbols — are reserved and may cause issues when parsing or validating your formula. To avoid errors, make sure your column names don’t include special characters, or rename columns before referencing them in a formula.

To help you get started, all supported operators and functions are organized by category below.

Operators

Operators perform basic operations on values or expressions. You can think of them as the "glue" that connects, compares, or transforms values within formulas.

Below is a quick overview of the different types of operators. Click a category below to jump to its full explanation.

Arithmetic Operators

Comparison Operators

Logical and Conditional Operators

Arithmetic Operators

These are basic math symbols used to calculate values.

If formulas are recipes, numeric operators are the math utensils you use to prepare your ingredients.

Operator	Description	Example	Result
`+`	Addition	`2 + 3`	`5`
`-`	Subtraction	`2 - 3`	`-1`
`*`	Multiplication	`2 * 3`	`6`
`/`	Division	`5 / 2`	`2.5`
`//`	Integer Division (quotient)	`5 // 2`	`2`
`%`	Modulo (remainder)	`5 % 4`	`1`
`^`	Exponent	`3 ^ 4`	`81`
`!`	Factorial	`4!`	`24`

Comparison Operators

They evaluate the relationship between two values and return true or false. They are commonly used in formulas to compare numbers, text, or column values.

You can think of comparison operators like yes/no questions you ask about your data to evaluate relationships between values.

Operator	Description	Example	Result
<	Less than	`5 < 10`	`true`
>	Greater than	`5 > 10`	`false`
<=	Less than or equal to	`5 <= 5`	`true`
`>=`	Greater than or equal to	`5 >= 6`	`false`
=	Equal	`5 = 5`	`true`
!=	Not equal	`5 != 3`	`true`

Logical and Conditional Operators

These operators evaluate conditions and return true/false or determine outcomes based on logic.

Logical operators like AND, OR, and NOT help you build complex conditions by combining simpler ones.
Conditional operators like CASE or IS NULL allow you to return specific values depending on which conditions are met.

You can think of logical operators as the building blocks of a test, and conditional expressions as the final decision maker.

Operator	Description
`AND`	Returns `true` if both conditions are true. Example: `(5 > 3) AND (2 < 4)` → `true`
`CASE WHEN ... THEN ... [ELSE ...] END`	Returns the result corresponding to the first true condition. If no conditions are true, returns the `ELSE` result (if provided). Example: `CASE WHEN 5 > 10 THEN 'High' WHEN 5 > 3 THEN 'Medium' ELSE 'Low' END` → `'Medium'`
`IS NOT NULL`	Returns `true` if the value is not null. Example: `'abc' IS NOT NULL` → `true`
`IS NULL`	Returns `true` if the value is null. Example: `NULL IS NULL` → `true`
`NOT`	Returns `true` if the condition is false, and `false` if the condition is true. Example: `NOT(5 > 3)` → `false`
`OR`	Returns `true` if either condition is true. Example: `(5 > 3) OR (2 > 4)` → `tru`

Functions

Functions perform more advanced, often reusable operations. They take one or more inputs (called arguments) and return a calculated result.

Below is a categorized overview of available functions. Click a category below to jump to its detailed section.

Aggregate Functions

Date & Time Functions

Numeric Functions

Text Functions

Utility Functions

Aggregate Functions

These perform calculations across multiple rows in a dataset, such as summing values or finding an average.

You can think of aggregate functions as "summary tools" — they look at all your data and give you a big-picture number.

Syntax	Description
`AVG(expression)`	Calculates the average (arithmetic mean) of all non-null values in `expression` across the dataset. Example: `AVG([5, 10, 15])` → `10`
`COUNT(expression)`	Counts the number of non-null values in `expression` across all rows. If `expression` is omitted, counts all rows. Example: `COUNT(['A', 'B', null])` → `2`
`COUNTDISTINCT(expression)`	Counts the number of unique (distinct) non-null values in `expression` across all rows. Example: `COUNTDISTINCT(['A', 'B', 'A'])` → `2`
`MAX(expression)`	Returns the maximum value from `expression` across all rows. Example: `MAX([10, 20, 5])` → `20`
`MEDIAN(expression)`	Returns the median value from `expression` across all rows. Example: `MEDIAN([1, 3, 5])` → `3`
`MIN(expression)`	Returns the minimum value from `expression` across all rows. Example: `MIN([10, 20, 5])` → `5`
`STDEV(expression)`	Returns the sample standard deviation of values in `expression` across all rows. Example: `STDEV([4, 6, 8])` → `2`
`STDEVP(expression)`	Returns the population standard deviation of values in `expression` across all rows. Example: `STDEVP([4, 6, 8])` → `1.63` (approx.)
`SUM(expression)`	Returns the sum of all values in `expression` across the dataset. Example: `SUM([5, 10, 15])` → `30`
`VARIANCE(expression)`	Returns the sample variance of values in `expression` across all rows. Example: `VARIANCE([4, 6, 8])` → `4`
`VARIANCEP(expression)`	Returns the population variance of values in `expression` across all rows. Example: `VARIANCEP([4, 6, 8])` → `2.67` (approx.)

Date and Time Functions

Used to extract parts of a date, perform date math, or return the current time.

These are your "calendar helpers" — they help you understand when things happen or calculate time-based differences.

Syntax	Description
`DATEADD(date, number, interval)`	Returns a new date by adding the specified `number` of `interval` units (e.g., 'day', 'month') to the given `date`. Example: `DATEADD('2025-04-01', 7, 'day')` → `'2025-04-08'`
`DATEDIFF(start_date, end_date, unit)`	Returns the number of `unit` intervals (e.g., 'day', 'month') between `start_date` and `end_date`. Example: `DATEDIFF('2025-01-01', '2025-01-10', 'day')` → `9`
`DATEFORMAT(date, format)`	Returns a text string representing the specified `date` in the given format. Example: `DATEFORMAT('2025-04-15', '%b %-d, %Y')` → `Apr 15, 2025`
`DAY(date)`	Returns the day of the month (1–31) from the given `date`. Example: `DAY('2025-04-15')` → `15`
`HOUR(date_time)`	Returns the hour (0–23) from the given `date_time`. Example: `HOUR('2025-04-15T14:30:00Z')` → `14`
`MINUTE(date_time)`	Returns the minute (0–59) from the given `date_time`. Example: `MINUTE('2025-04-15T14:30:00Z')` → `30`
`MONTH(date)`	Returns the month (1–12) from the given `date`. Example: `MONTH('2025-04-15')` → `4`
`NOW()`	Returns the current date and time in Coordinated Universal Time (UTC). Example: `NOW()` → `'2025-04-23T10:42:00Z'`
`QUARTER(date)`	Returns the quarter (1–4) of the year for the given `date`. Example: `QUARTER('2025-04-15')` → `2`
`SECOND(date_time)`	Returns the seconds (0–59) from the given `date_time`. Example: `SECOND('2025-04-15T14:30:45Z')` → `45`
`TODAY()`	Returns the current date with time set to `00:00:00` in Coordinated Universal Time (UTC). Example: `TODAY()` → `'2025-04-23'`
`WEEK(date)`	Returns the ISO week number (1–53) for the given `date`. Example: `WEEK('2025-01-01')` → `1`
`WEEKDAY(date)`	Returns the day of the week (0 = Sunday, 6 = Saturday) for the given `date`. Example: `WEEKDAY('2025-04-21')` → `1`
`YEAR(date)`	Returns the year from the given `date`. Example: `YEAR('2025-04-15')` → `2025`

Numeric Functions

These apply mathematical formulas to one or more values.

Numeric functions are like a calculator inside your dataset — they help you transform and analyze numerical values.

Syntax	Description
`ABS(number)`	Returns the absolute (non-negative) value of `number`. Example: `ABS(-5)` → `5`
`CEILING(number)`	Rounds `number` up to the nearest integer. Example: `CEILING(4.2)` → `5`
`FLOOR(number)`	Rounds `number` down to the nearest integer. Example: `FLOOR(4.8)` → `4`
`MOD(dividend, divisor)`	Returns the remainder after dividing `dividend` by `divisor`. Example: `MOD(10, 3)` → `1`
`POWER(number, exponent)`	Returns `number` raised to the power of `exponent`. Example: `POWER(2, 3)` → `8`
`SQRT(number)`	Returns the square root of `number`. Example: `SQRT(9)` → `3`

Text Functions

Used to manipulate, transform, or extract information from text strings.

These are your "text editors" — perfect for cleaning up or formatting names, tags, and other string data.

Syntax	Description
`CONCAT(value1, value2, ...)`	Joins multiple values into a single string. Example: `CONCAT('Data', 'box')` → `Databox`
`CONTAINS(string, substring)`	Returns `1` if the `string` contains the specified `substring`; otherwise, returns `0`. Example: `CONTAINS('dashboard', 'board')` → `1`
`ENDSWITH(string, substring)`	Returns `1` if the `string` ends with `substring`; otherwise, returns `0`. Example: `ENDSWITH('metrics.csv', '.csv')` → `1`
`FIND(string, substring)`	Returns the index of the first occurrence of `substring` in `string`. Returns `0` if not found. Example: `FIND('dashboard', 'board')` → `5`
`JSONEXTRACT(json_string, path)`	Returns the fragment located at the specified JSONPath in the `json_string`. Example: `JSONEXTRACT('{"user":{"id":123,"name":"Alice"}}', '$.user.id')` → `123`
`LEFT(string, num_chars)`	Returns the first `num_chars` characters from the `string`. Example: `LEFT('Databox', 4)` → `Data`
`LENGTH(string)`	Returns the number of characters in the `string`. Example: `LENGTH('Databox')` → `7`
`LOWER(string)`	Converts all characters in the `string` to lowercase. Example: `LOWER('Databox')` → `databox`
`REGEXEXTRACT(string, pattern, [group])`	Returns the specified match `group` (0-indexed) from the `string`, based on the regular expression `pattern`. If `group` is omitted, it defaults to `0`. Example: `REGEXEXTRACT('databox.com', '([a-z]+)\.com')` → `databox`
`REGEXREPLACE(string, pattern, new_substring)`	Returns `string` with all text matching the regular expression `pattern` replaced by `new_substring`. Example: `REGEXREPLACE('abc123def', '([0-9]+)', 'X')` → `abcXdef`
`REPLACE(string, old_substring, new_substring)`	Replaces all occurrences of `old_substring` in the `string` with `new_substring`. Example: `REPLACE('2025-05-01', '-', '/')` → `2025/05/01`
`RIGHT(string, num_chars)`	Returns the last `num_chars` characters from the `string`. Example: `RIGHT('Databox', 3)` → `box`
`SPLITROWS(string, separator)`	Returns one row for each value in the `string`, split by the specified `separator`. Example: `SPLITROWS('a;b;c', ';')` → `['a', 'b', 'c']`
`STARTSWITH(string, substring)`	Returns `1` if the `string` starts with `substring`; otherwise, returns `0`. Example: `STARTSWITH('dashboard', 'dash')` → `1`
`SUBSTRING(string, start, [length])`	Extracts a substring from the `string` starting at the `start` index, with an optional `length`. Example: `SUBSTRING('Databox', 5, 3)` → `box`
`TRIM(string)`	Removes whitespace from the beginning and end of the `string`. Example: `TRIM(' Databox ')` → `Databox`
`UPPER(string)`	Converts all characters in the `string` to uppercase. Example: `UPPER('Databox')` → `DATABOX`

Utility Functions

Simple but powerful functions for handling nulls and basic conditionals.

Utility functions are the "if-this-then-that" tools — they help you make quick decisions in your formulas or set fallbacks.

Syntax Description

IF(condition, true_value, false_value)

Returns true_value if condition is true; otherwise, returns false_value.

Example: IF(5 > 3, 'Yes', 'No') → 'Yes'

IFNULL(expression, value_if_null)

Returns expression if it is not null; otherwise, returns value_if_null.

Example: IFNULL(null, 'N/A') → 'N/A'

Frequently Asked Questions

Can I reference multiple columns in the same formula?

Yes. You can include as many column references as needed by using the $ symbol before each column name (e.g., $Revenue - $Cost).

Can I use the calculated column to create custom metrics?

Absolutely. Once saved, calculated columns behave like any other column in your dataset and can be used in metric creation.

Will the calculated column update automatically if the source data changes?

Yes. The calculated column is dynamic — it will recalculate automatically whenever the underlying dataset is refreshed.

Still need help?

Visit our community, send us an email, or start a chat in Databox.

Add a calculated column to a dataset

Learn how to create new columns in your dataset using custom formulas. Includes helpful tips, examples, and guidance for effective use.

Add a calculated column to a dataset

Helpful shortcuts

Formula validation and supported inputs

Operators

Arithmetic Operators

Comparison Operators

Logical and Conditional Operators

Functions

Aggregate Functions

Date and Time Functions

Numeric Functions

Text Functions

Utility Functions

Frequently Asked Questions

Can I reference multiple columns in the same formula?

Can I use the calculated column to create custom metrics?

Will the calculated column update automatically if the source data changes?

Still need help?