sqlmesh-migrations
Authors and runs SQLMesh - data-transformation framework with version control, virtual data environments, automatic breaking-vs-non-breaking change classification, and downstream impact analysis; supports `sqlmesh init` / `plan` / `apply` / `run` / `audit` / `test` lifecycle; covers DuckDB, Postgres, Snowflake, BigQuery, Redshift, Databricks. Use when the user works with SQL data pipelines (warehouse + dbt-adjacent ELT), needs safer model evolution than dbt's deploy-and-pray, or wants the strongest impact-analysis story in the OSS data tooling space.
sqlmesh-migrations
Overview
Per sqlmesh.readthedocs.io/en/stable/quickstart/cli/:
SQLMesh is a data transformation tool that enables version control and testing for SQL pipelines. Its distinguishing features vs schema-only tools (Flyway / Liquibase / Atlas):
When to use
Step 1 - Install
Per sm-cli:
pip install sqlmeshOptionally with extras for specific engines (e.g., pip install sqlmesh[bigquery]).
Step 2 - Initialize a project
Per sm-cli:
sqlmesh init <dialect>
# example:
sqlmesh init duckdbGenerates project skeleton: config.yaml, models/, macros/, tests/, audits/, seeds/.
Step 3 - Author a model
A SQLMesh model is a SQL (or Python) file in models/ with a MODEL directive header:
-- models/sales/orders_summary.sql
MODEL (
name sales.orders_summary,
kind FULL,
cron '@daily',
owner 'data-team@example.com',
description 'Daily summary of orders by customer'
);
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(amount) AS total_amount
FROM sales.orders
GROUP BY customer_id;Model kinds (per SQLMesh docs): FULL (rebuild every run), INCREMENTAL_BY_TIME_RANGE (process new time partitions), INCREMENTAL_BY_UNIQUE_KEY (upsert by key), VIEW (no persisted table), SEED (static data).
Step 4 - Plan + apply (the core workflow)
Per sm-cli:
sqlmesh plan devplan shows:
The user reviews the plan, then confirms - plan is integrated with apply: confirming the plan applies it.
sqlmesh plan prod # promote dev to prodThe promotion is virtual until apply: prod continues serving existing data until the new env is built.
Step 5 - Run scheduled execution
sqlmesh runRuns models per their cron schedule. Typically scheduled in CI/CD (daily / hourly), sqlmesh run checks each model and executes if its cron is due.
Step 6 - Audits
Audits are SQL-based data-quality checks attached to models:
-- audits/no_null_amounts.sql
AUDIT (
name no_null_amounts,
);
SELECT * FROM @this_model WHERE amount IS NULL;sqlmesh auditReturns failures if the audit query returns any rows.
Compare with great-expectations and soda-checks: SQLMesh audits are tightly coupled to SQLMesh models; GE/Soda are standalone data-quality frameworks. Choose audits when you're already in SQLMesh; GE/Soda for cross-framework data quality.
Step 7 - Tests (unit tests on models)
Unlike audits (which run on real data), tests run on synthetic input → synthetic output:
# tests/test_orders_summary.yaml
test_orders_summary:
model: sales.orders_summary
inputs:
sales.orders:
- { customer_id: 1, amount: 100.00 }
- { customer_id: 1, amount: 200.00 }
- { customer_id: 2, amount: 50.00 }
outputs:
query:
- { customer_id: 1, order_count: 2, total_amount: 300.00 }
- { customer_id: 2, order_count: 1, total_amount: 50.00 }sqlmesh testStep 8 - CI integration
- run: pip install sqlmesh
- run: sqlmesh test # unit tests
- run: sqlmesh plan ci-${{ github.run_id }} --no-prompts # build a per-PR env
- run: sqlmesh audit # data-quality checks
# Promotion to prod is a separate workflow with manual approval gateThe per-PR env approach gives full data-pipeline isolation: each PR materializes its own copy of the models, audits run against real PR data, and merge-then-promote is the production path.
Step 9 - Composition with sister tools
Pair with migration-blast-radius-reviewer for adversarial review of breaking changes - sqlmesh plan already classifies, but the reviewer adds estimation of downstream consumer impact (BI dashboards, downstream services) that SQLMesh's model graph alone doesn't capture.
For underlying schema migrations (DDL on the warehouse, separate from SQLMesh model changes), use flyway-migrations or atlas-migrations.
Anti-patterns
| Anti-pattern | Why it fails | Fix |
|---|---|---|
sqlmesh apply directly to prod without dev plan | No review of breaking-vs-non-breaking; broken pipelines | Always plan dev first (Step 4) |
Use kind FULL for everything | Full rebuilds expensive on large tables | Pick INCREMENTAL_BY_TIME_RANGE / BY_UNIQUE_KEY per model semantics |
| Skip audits on critical models | Data drift goes undetected | Audit every transformation (Step 6) |
| Write tests against production data instead of synthetic | Tests pass-by-accident; brittle | Use the inputs/outputs test format (Step 7) |
| Treat SQLMesh as schema-migration tool | SQLMesh manages data models, not raw DDL | Pair with Flyway/Atlas for raw schema (Step 9) |