January 9, 2020

Rebuilding Django Migrations

I have a Django project with a long, long history.  It has accumulated far more migrations than I care to admit (or maintain).  Applying migrations to a new database takes 30+ seconds on good hardware. With a dozen or more apps, all the dependency spaghetti you would expect exists, along with custom code migrations, model renames, everything.

There is no practical way to "squash" these, so some surgery is necessary.

Complicated dependencies, wipe 'em all out!

  1. Delete all existing migrations & create new ones
  2. Apply migrations to a new database to generate the django_migrations table
  3. Create an update script (SQL) from that migrations table
  4. Apply that to the existing database(s)
  5. Deploy the new code!

Steps 3 & 4 are a bit unfortunate, but I couldn't find a way around them.  The migrations generated by step 1 are not simple, with multiple dependencies between apps.  Django also does a lot of consistency checking when applying migrations, so we can't just delete the rows for our custom apps from the django_migrations table.  In the end, deleting everything and re-inserting from a 'clean' copy was the clearest and most repeatable approach.

1. Create a new database

Edit your DATABASE_URL for the local dev environment to use a new database name.

Create the new database, but don't apply migrations.  It's necessary to do this first, since the makemigrations step requires a database to exist.  (The reset_db command comes from the django-extensions package.)

$> ./manage.py reset_db

2. Delete your existing migrations & create new ones

Delete all the migrations/* files from the apps (except __init__.py).  The pattern below only matches filenames starting with 0, so a different find command will be needed if you have more than 999 migrations in a folder.  If you do, I bet you've already done something like this yourself.

$> cd project/apps/
$> find . -type f -path '*migrations/0*.py' -exec rm {} \;
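If you would rather not depend on the 0* filename pattern, the same cleanup can be sketched in Python. This is a minimal sketch rather than the command used above: delete_migration_files is a hypothetical helper, and it assumes every directory named migrations under the given root belongs to your own apps, so don't point it at a folder that also contains a virtualenv.

```python
from pathlib import Path


def delete_migration_files(apps_root):
    """Delete every migration module under apps_root, keeping __init__.py.

    Hypothetical helper: assumes each directory named "migrations"
    below apps_root belongs to one of your own apps.
    """
    removed = []
    for path in sorted(Path(apps_root).rglob("*.py")):
        # Only touch files that sit directly inside a migrations/ folder.
        if path.parent.name != "migrations":
            continue
        # The package marker must survive or Django ignores the folder.
        if path.name == "__init__.py":
            continue
        path.unlink()
        removed.append(path)
    return removed
```

Calling delete_migration_files("project/apps") returns the list of removed paths, which is handy to eyeball before running makemigrations.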

Create the new migrations. There may be more than 1 migration file created for each app, not just 0001_initial.py, due to dependencies.  Thanks to Django for working all this out!

$> ./manage.py makemigrations
$> ./manage.py migrate

3. Create an insert script

The following function generates a SQL script to run later from another Django shell.  In ./manage.py shell, while connected to the new database, paste this function, call it, and copy the output.

Depending on your database you will need to generate slightly different SQL.  This example uses PostgreSQL and the clock_timestamp() function to populate the applied column of the django_migrations table.  For MySQL you would replace that with now().

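def get_migrations_insert():
    """Build one INSERT statement that recreates django_migrations."""
    from django.db import connection
    with connection.cursor() as cursor:
        # Read the rows in the order they were applied.
        cursor.execute(
            "SELECT app,name FROM django_migrations ORDER BY id"
        )
        # clock_timestamp() is PostgreSQL-specific; use now() on MySQL.
        values = [
            "('%s', '%s', clock_timestamp())" % row
            for row in cursor.fetchall()
        ]
        return "INSERT INTO django_migrations (app, name, applied) " \
               "VALUES %s" % ", ".join(values)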
               
>>> get_migrations_insert() 
"INSERT INTO django_migrations (app, name, applied) VALUES 
    ('app1', '0001_initial', clock_timestamp()), 
    ('contenttypes', '0001_initial', clock_timestamp()) ..."

Copy that output, and we are done with the new database.  Delete it or keep it; it's up to you.
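If the same script has to target more than one database engine, the timestamp expression can be turned into a parameter. The following is only a sketch of the string-building part: build_migrations_insert is a hypothetical helper, it takes the (app, name) pairs you would get from cursor.fetchall(), and the timestamp_sql default is an assumption matching the PostgreSQL example above.

```python
def build_migrations_insert(rows, timestamp_sql="clock_timestamp()"):
    """Build one INSERT statement for django_migrations from (app, name) pairs.

    timestamp_sql is pasted into the statement as-is, e.g.
    "clock_timestamp()" for PostgreSQL or "now()" for MySQL.
    """
    def quote(value):
        # Double any single quote so the value stays a valid SQL literal.
        return "'" + value.replace("'", "''") + "'"

    values = [
        "(%s, %s, %s)" % (quote(app), quote(name), timestamp_sql)
        for app, name in rows
    ]
    return (
        "INSERT INTO django_migrations (app, name, applied) VALUES "
        + ", ".join(values)
    )
```

For MySQL you would call build_migrations_insert(rows, timestamp_sql="now()") and leave everything else unchanged.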

4. Rebuild the migrations in the old database

Now we need to remove all existing migration records and replace them.  This can be done either in a Django shell, or by executing SQL directly against the database.

  1. Delete all existing migrations
  2. Insert the copy of the migrations table

The shell method works even if you don't have direct access to the database, or have no psql or mysql CLI available on the box you will run the update from.

# Start a shell and paste this function.
def rebuild_migrations(script):
    from django.db import connection, transaction
    # Run both statements in one transaction so a failed insert
    # does not leave django_migrations empty.
    with transaction.atomic():
        with connection.cursor() as c:
            c.execute("DELETE FROM django_migrations")
            c.execute(script)

>>> rebuild_migrations("{copy the big insert statement}")

5. Deploy the new code!

This is best done at a quiet point, when no other schema changes are in flight, of course.  It is a good idea to back up the old django_migrations table before touching it, and to try the whole process locally first :)
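One way to save the old migrations table is to dump its rows to JSON before deleting anything. This is a rough sketch, not part of the process described above: dump_migrations is a hypothetical helper that accepts any DB-API style cursor, such as the one from connection.cursor() in a Django shell.

```python
import json


def dump_migrations(cursor):
    """Return the django_migrations rows as a JSON string for safekeeping."""
    cursor.execute(
        "SELECT app, name, applied FROM django_migrations ORDER BY id"
    )
    rows = [
        # str() keeps the timestamp serializable whatever type the driver returns.
        {"app": app, "name": name, "applied": str(applied)}
        for app, name, applied in cursor.fetchall()
    ]
    return json.dumps(rows, indent=2)
```

In a shell connected to the old database, write the returned string to a file before running the rebuild, so the original history can be restored by hand if something goes wrong.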

This process worked for me, after years of struggling with trying to squash migrations the legitimate way.