Create a dataset
A step-by-step guide to creating datasets in Databox, covering data selection, update frequencies, and setup for accurate and efficient reporting.
Availability
Users, Editors, and Admins
All accounts
Feature exclusive to select subscription plans
Datasets provide a structured way to store and manage data, ensuring that only the most relevant and valuable information is included for reporting. By curating datasets to focus on key metrics and dimensions, you can gain deeper insights, track performance more effectively, and customize visualizations to align with your specific business needs. Whether you’re working with manually uploaded data or integrating with external sources, datasets give you greater flexibility and control over how your data is used in Databox.
Create a dataset
You can create a dataset from the following locations:
- Data Manager > Datasets: Click + New Dataset
- Account Management > Data Manager > Datasets: Click + Add Dataset
After selecting an existing data source or adding a new one, the dataset setup window will open. In this window, you’ll first need to select the data to include in the dataset, and then specify its data update frequency.
Select data
In this stage, you will choose the view and associated columns to be included in your dataset.
- Views: These are categories of data that typically correspond to specific objects, modules, or components within your data source. Examples include:
  - Deals (HubSpot)
  - Orders (Shopify)
  - Subscriptions (Stripe)
  - Invoices (QuickBooks)
- Columns: These refer to the individual fields or attributes within each view. For instance:
  - Deal Amount (HubSpot)
  - Order Date (Shopify)
  - Customer Email (Stripe)
  - Invoice Balance (QuickBooks)
If you are working with a compatible SQL integration, you can also enter a custom query, and its output will be used to create the dataset. This approach is advisable when more complex transformations are necessary to generate the desired dataset.
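As a rough illustration of the kind of transformation a custom query can perform, the sketch below runs an aggregate query against an in-memory SQLite table from Python. The table name `orders`, its columns, and the sample rows are hypothetical, and SQLite only stands in for whichever SQL integration you use; the exact SQL dialect depends on your database.

```python
import sqlite3

# Hypothetical source table; SQLite stands in for your SQL integration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("2025-01-05", "EU", 120.0),
        ("2025-01-05", "EU", 80.0),
        ("2025-01-06", "US", 50.0),
    ],
)

# Custom query: one row per day and region. It keeps a date column
# so the result can still be analyzed over a date range.
CUSTOM_QUERY = """
    SELECT order_date, region, SUM(amount) AS revenue, COUNT(*) AS order_count
    FROM orders
    GROUP BY order_date, region
    ORDER BY order_date, region
"""

rows = conn.execute(CUSTOM_QUERY).fetchall()
for row in rows:
    print(row)
```

Only the SELECT statement corresponds to what you would enter as the custom query; the surrounding Python exists to make the example runnable.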
Update frequency

Next, you’ll select how often Databox should refresh the data in the dataset. The available synchronization frequencies depend on the technical capabilities of the chosen integration and your account subscription.
Once you’ve chosen the desired frequency, click Save to finalize the setup and initiate a full synchronization of the selected view and columns in the backend.
Note: Databox will generate a preview as soon as the data becomes available. If syncing takes longer than 30 seconds, a message will appear, and you’ll receive an email notification when your data is ready.
Curate the dataset
To ensure your dataset includes all relevant data for reporting:
- Define key metrics and insights the dataset should support.
- Identify critical data points needed to calculate those metrics, including:
  - Relevant measures (e.g., sales, revenue, conversions).
  - Dates for tracking trends over time.
  - Dimensions for segmenting data (e.g., region, product category).
  - Additional data necessary for drill-down or deeper analysis.
Caution: To create metrics from a dataset, at least one datetime column must be present, as it serves as the basis for the date range selector within the application.
It's equally important to ensure the dataset captures granular data where needed while excluding unnecessary fields that could add complexity without contributing value. By focusing on the right data, you can ensure the dataset remains both comprehensive and efficient for reporting purposes.
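Before uploading or selecting data, it can help to confirm that at least one column actually holds dates. The following is a minimal Python sketch of such a check on sample rows. The function names, the sample data, and the two accepted date formats are assumptions made for illustration; they do not describe how Databox itself detects datetime columns.

```python
from datetime import datetime

# Formats treated as datetimes in this sketch (an assumption, not Databox's rules).
FORMATS = ("%Y-%m-%d", "%Y-%m-%d %H:%M:%S")


def parses_as_datetime(value):
    """Return True if the string matches one of the accepted formats."""
    for fmt in FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def datetime_columns(rows):
    """Return the column names whose non-empty values all parse as datetimes."""
    if not rows:
        return []
    result = []
    for name in rows[0]:
        values = [r[name] for r in rows if r[name] not in ("", None)]
        if values and all(isinstance(v, str) and parses_as_datetime(v) for v in values):
            result.append(name)
    return result


sample = [
    {"order_date": "2025-01-05", "region": "EU", "amount": "120.0"},
    {"order_date": "2025-01-06", "region": "US", "amount": "50.0"},
]
print(datetime_columns(sample))
```

If the returned list is empty, the data as it stands would not give the date range selector anything to work with.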
Edit the dataset
To modify your dataset, click Edit Data while in view mode. Then, click the down arrow next to a column name to open the actions menu. The following options are available:
- Set data type: Choose a new type for the column if there is a mismatch or conversion is needed.
- Filter: Add one or more filters to include or exclude specific rows.
You can also rename columns by clicking the column name box and entering a new name.
Once you're satisfied with your changes, click Save changes in the top-right corner of the screen.
Frequently Asked Questions
Can datasets be merged or combined into a single dataset?
This feature is currently being developed and is anticipated to be launched in May 2025.
Can I combine AND and OR filters on a dataset column?
No, only one filter type (either AND or OR) can be applied to a column at once.
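To make the restriction concrete, the sketch below models a column filter as a list of conditions joined by a single combinator. This is a hypothetical Python model of the behavior described above, not Databox's implementation; the function name `apply_column_filter` and the sample rows are invented for illustration.

```python
def apply_column_filter(rows, column, conditions, combinator):
    """Keep rows whose column value passes the conditions, joined by one combinator."""
    if combinator not in ("AND", "OR"):
        raise ValueError("combinator must be 'AND' or 'OR'")
    # A single combinator applies to every condition on the column.
    join = all if combinator == "AND" else any
    return [r for r in rows if join(cond(r[column]) for cond in conditions)]


rows = [{"amount": 20}, {"amount": 150}, {"amount": 900}]

# AND: amount greater than 100 and less than 500.
between = apply_column_filter(
    rows, "amount", [lambda v: v > 100, lambda v: v < 500], "AND"
)

# OR: amount less than 50 or greater than 500.
extremes = apply_column_filter(
    rows, "amount", [lambda v: v < 50, lambda v: v > 500], "OR"
)

print(between)
print(extremes)
```

A rule such as "(A AND B) OR C" on one column has no equivalent in this model, which mirrors the limitation.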
How much data can a dataset contain?
Each dataset can hold a maximum of 100 columns, and its total size cannot exceed 100 MB.
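If you prepare data outside Databox, a quick local check against these limits can catch problems early. The Python sketch below is only an approximation: it measures the size of the data when encoded as UTF-8 CSV and assumes 1 MB equals 1024 * 1024 bytes, whereas how Databox measures dataset size is not stated here. The function name `check_limits` is hypothetical.

```python
import csv
import io

MAX_COLUMNS = 100
MAX_BYTES = 100 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def check_limits(header, rows):
    """Return a list of problems; an empty list means both limits are met."""
    # Approximate the dataset size as the UTF-8 size of its CSV encoding.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    size = len(buffer.getvalue().encode("utf-8"))

    problems = []
    if len(header) > MAX_COLUMNS:
        problems.append(f"{len(header)} columns exceeds the limit of {MAX_COLUMNS}")
    if size > MAX_BYTES:
        problems.append(f"{size} bytes exceeds the limit of {MAX_BYTES}")
    return problems


print(check_limits(["order_date", "amount"], [["2025-01-05", 120.0]]))
```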
Still need help?
Visit our community, send us an email, or start a chat in Databox.