Squash Migrations Using Alembic and Postgres

Published

Over time, a Python project accumulates many migrations using alembic. This can slow down tests (you need to run all migrations every time you create a test DB) and it’s very unlikely you will ever rollback past a certain point in time.

There’s no straightforward way of squashing migrations into one that I could find in alembic. I found that I couldn’t use the trick of running alembic revision --autogenerate because we don’t use ORM models to represent the entire schema. Instead, the combination of pg_dump and alembic stamp gives a similar result that allows us to squash everything into one migration.

Get the schema of DB as SQL

First we need to get the schema as a set of SQL statements using pg_dump. This needs to be done on the database that matches the actual schema in production (or maybe run it on production). Unless you are meticulous about using DB migrations, there might be some drift between the schema in your local database and what’s actually in production.

pg_dump -s --user {your app user} {name of db} > squash.sql

Verify and clean up

For my purposes, I removed the auto-generated SET statements, removed the auto-generated ownership statements (otherwise test DBs run using a different user would fail), and tried loading it in a clean DB to see if the tables match.

Create a migration and wrap the generated SQL statements

Generate a new migration that runs the generated pg_dump SQL.

The migration should look something like this:

# revision identifiers, used by Alembic.
revision = "123456789"
down_revision = None
branch_labels = None
depends_on = None


def upgrade():
    with open(f"{PROJECT_ROOT_DIR}/alembic/versions/squash.sql", "r") as f:
        for stmt in f.read().split(";\n"):
            if not stmt:
                continue
            op.execute(stmt)

Delete all previous migrations

You can now delete all of the old migrations and try out the new squashed migration to see that it takes a clean DB and produces the expected schema using alembic upgrade head.

Note: check very closely that the squashed migration actually gets your DB into the shape you want. For me, I needed to add some extra calls to pre-populate some data that were snuck in from previous migrations using handwritten INSERT statements.

Skip the migration in production

[WIP I haven’t verified this yet!] You don’t want to run the squashed migration in production. To skip it, run alembic stamp head which will mark the latest migration as the lastest version. New migrations can be applied on top safely using the usual alembic revision commands.