
Migrate from legacy SQL metrics to datasets

Transition SQL-based custom metrics to datasets to reduce database load, consolidate queries, and unlock more flexible reporting.

Why migrate from legacy SQL metrics

Legacy SQL metric builders are being deprecated, along with all custom metrics created using them for the following integrations:

  • Amazon Redshift
  • Google BigQuery
  • Microsoft Azure SQL
  • Microsoft SQL Server
  • MySQL
  • PostgreSQL
  • Snowflake 

Going forward, SQL metrics will be created using datasets, which provide a more scalable and flexible way to work with your SQL data.

With legacy SQL metrics, each metric runs its own query on every sync. If you have 200 metrics, that means 200 separate queries hitting your database every time data refreshes. This increases database load, slows down syncs, and makes ongoing maintenance more complex.

With datasets, syncs are tied to datasets instead of individual metrics. One dataset query can power dozens or even hundreds of metrics. This significantly reduces database load and simplifies long-term management.

By migrating to datasets, you can:

  • Store up to 20x more data
  • Reuse the same dataset across multiple metrics
  • Analyze data across multiple dimensions
  • Apply filters and drill-downs directly in visualizations
  • Create custom calculated columns
  • Merge datasets from multiple data sources
  • Export raw data when needed
  • Use AI-powered insights with Genie and Databox MCP
  • Improve refresh performance and maintainability

Note: In-app migration tools will be available soon to help identify legacy data sources and replace legacy SQL metrics. You may still choose to migrate manually based on your timeline or requirements, but the tooling will provide a more streamlined and reliable migration experience once available.

Key considerations for migration

What's changing

Previously, each metric had its own SQL query that ran independently on every sync. Now, SQL queries are defined at the Dataset level, creating a data preparation step between your database and your metrics.

One dataset query powers multiple metrics, and a single sync refreshes all metrics built on that dataset. Dimension filtering now happens at the metric level instead of in SQL WHERE clauses.

What stays the same

Your existing database connection, account content, and historical data remain unchanged. The key difference is consolidating many per-metric queries into fewer, broader dataset queries.

Manual migration steps

The biggest benefit of datasets comes from consolidating queries.

Instead of migrating each metric 1:1, look for patterns where multiple metrics query the same table with different filters.

Step 1: Identify legacy SQL metrics

  1. Go to Metrics > Custom Metrics.
  2. Filter by your SQL data source (PostgreSQL, MySQL, Google BigQuery, etc.).
  3. Look for metrics marked with a ⚠️ warning icon.
  4. Open each metric and copy the SQL query for reference. 

Step 2: Identify consolidation patterns

Look for metrics that:

  • Query the same table or set of tables
  • Use the same or very similar JOIN structure
  • Share the same base dataset but differ in filters, date granularity, aggregations, or calculated logic

Common patterns include:

  • Same query, different filter values
    Example: WHERE location_id = 101 vs. WHERE location_id = 102
    Approach: Remove the hardcoded filter, include location_id in the SELECT, and apply filtering at the metric level.
  • Separate queries for different aggregations
    Example: One metric uses SUM(revenue), another uses COUNT(order_id) from the same table.
    Approach: Include all required measures in the same dataset query, then create separate metrics using different value columns or aggregations.
  • Separate queries for calculated variations
    Example: One metric calculates SUM(revenue), another calculates SUM(revenue) - SUM(cost).
    Approach: Return the base columns (such as revenue and cost) in the dataset, then create calculated metrics or calculated columns instead of separate SQL queries.
  • Separate queries for different date grains
    Example: One query groups by DATE(order_date), another by DATE_TRUNC('month', order_date).
    Approach: Return data at the lowest useful grain (for example, daily) and handle aggregation at the metric or visualization level.
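As an illustration, the "different aggregations" and "calculated variations" patterns above can often share a single dataset query. The sketch below uses PostgreSQL syntax and the orders table from the example later in this article; the cost column is assumed for illustration.

```sql
-- One dataset query returning the base measures several metrics need.
-- "Revenue", "Cost", and "Orders" become separate metrics in Databox,
-- and a margin metric can be derived from Revenue and Cost as a
-- calculated metric, without a separate SQL query.
SELECT
    order_date::DATE   AS "Date",
    SUM(total_amount)  AS "Revenue",
    SUM(cost)          AS "Cost",    -- assumed column, for illustration
    COUNT(order_id)    AS "Orders"
FROM orders
GROUP BY order_date::DATE
ORDER BY "Date" DESC;
```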

Step 3: Create consolidated datasets

  1. Go to Data Manager > Datasets.
  2. Click + New Dataset.
  3. Select your SQL data source and click Continue.
  4. Select Write SQL.
  5. Enter your consolidated SQL query.
  6. Click Preview data to confirm the structure and formatting look correct and the data appears as expected.
  7. Click Continue.
  8. Select update frequency and click Finish.

Step 4: Recreate metrics from the datasets

For each legacy metric:

  1. Open the dataset view.
  2. Click + Create new metric on the right-hand panel.
  3. Select the appropriate measure, date, and dimension columns.
  4. Reapply any logic that previously lived in the SQL query, such as:
    1. Adding filters that were previously in the WHERE clause
    2. Choosing the correct aggregation
  5. Click Preview data and compare the new metric with the legacy metric to ensure the numbers align.
  6. Click Save.
  7. Replace the old metric wherever it’s used across your content to ensure everything continues working with the new dataset-based metric.

Consolidation example

Before: Separate queries per location

SELECT order_date::DATE AS "Date", SUM(total_amount) AS "Revenue"
FROM orders
WHERE location_id = 101
GROUP BY order_date::DATE
ORDER BY "Date" DESC;

Same structure repeated for each location.

After: One dataset query

SELECT
    o.order_date::DATE AS "Date",
    l.location_id,
    l.location_name,
    SUM(o.total_amount) AS "Revenue"
FROM orders o
JOIN locations l ON o.location_id = l.location_id
GROUP BY o.order_date::DATE, l.location_id, l.location_name
ORDER BY "Date" DESC;

Now you can filter by location_id or location_name at the metric level.

Result: One query replaces many. New locations appear automatically without modifying SQL queries.

Best practices

When creating consolidated datasets, focus on flexibility, clarity, and long-term maintainability.

Structure your query for flexibility

Include all columns that may be used for filtering, grouping, comparison, or future reporting in the SELECT statement.

Instead of hardcoding values in a WHERE clause or excluding supporting fields, return those columns in your dataset so they become reusable dimensions. This makes the dataset more flexible and reduces the need to modify the SQL query later.
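For example, rather than hardcoding WHERE status = 'completed', return the column itself so each metric can filter on it. This sketch uses PostgreSQL syntax; the status and sales_channel columns are illustrative.

```sql
-- Return likely filter/grouping columns as dimensions instead of
-- hardcoding them in a WHERE clause.
SELECT
    order_date::DATE  AS "Date",
    status,                          -- assumed column; becomes a filterable dimension
    sales_channel,                   -- assumed column
    SUM(total_amount) AS "Revenue"
FROM orders
GROUP BY order_date::DATE, status, sales_channel
ORDER BY "Date" DESC;
```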

Keep datasets focused

Avoid combining many unrelated tables into one large query. Overly broad datasets are harder to maintain and can slow down performance.

Instead, group related metrics into focused datasets, such as:

  • Sales and transactions
  • Budget and goals
  • Customer activity
  • Employee performance

Avoid unnecessary TOP or LIMIT clauses

Legacy metrics often included TOP or LIMIT clauses as safeguards.

Datasets support significantly larger volumes of data, so artificial limits are usually unnecessary. Remove them unless they are required for performance or business logic.
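For instance, a legacy query that ended with a safeguard such as LIMIT 100 can usually return its full history once moved into a dataset (sketch in PostgreSQL syntax, reusing the orders table from the example below):

```sql
SELECT order_date::DATE AS "Date", SUM(total_amount) AS "Revenue"
FROM orders
GROUP BY order_date::DATE
ORDER BY "Date" DESC;
-- The legacy version ended with "LIMIT 100" as a safeguard; datasets
-- can typically hold the full result, so the cap has been dropped.
```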

Frequently Asked Questions

Do I have to migrate everything at once?

No. You can migrate metrics at your own pace before the deprecation takes effect.

Group related metrics and migrate them together as a consolidated dataset. This makes it easier to validate results and ensure everything works correctly before moving on to the next group.

Should I consolidate metrics that use different databases?

No. Each dataset connects to a single data source. Metrics that pull from different databases must be created as separate datasets.

If you need to analyze data across multiple sources, you can merge datasets after they are created.

What if my consolidated dataset returns too much data?

Datasets support a 20x larger data size limit compared to legacy SQL metrics. In most cases, removing a WHERE clause filter and adding the column as a dimension will still fit within the limit. 

If you reach the size limit, you can:

  • Add a date range filter in your query (for example, limit data to the last two years)
  • Split the query into multiple focused datasets instead of one broad dataset
  • Reduce granularity where appropriate, such as aggregating older data weekly instead of daily
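The first option can be as simple as a date-range guard in the dataset query. A sketch in PostgreSQL syntax, reusing the orders table from the consolidation example above; the two-year cutoff is illustrative:

```sql
-- Keep the dataset within the size limit by returning only recent history.
SELECT
    order_date::DATE  AS "Date",
    location_id,
    SUM(total_amount) AS "Revenue"
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '2 years'  -- illustrative cutoff
GROUP BY order_date::DATE, location_id
ORDER BY "Date" DESC;
```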

What if my queries are too different to consolidate?

Not all metrics should be combined into a single dataset. If metrics rely on different tables or fundamentally different JOIN structures, keep them as separate datasets.

Even partial consolidation delivers significant benefits. For example, reducing 100 individual queries to 10 dataset queries dramatically lowers database load and simplifies long-term maintenance.

Still need help?

Visit our community, send us an email, or start a chat in Databox.