Writing database migrations
This document explains how to structure and write database migrations for different scenarios you might encounter. For introductory material on migrations, see the topic guide.
Data migrations and multiple databases
When using multiple databases, you may need to figure out whether or not to run a migration against a particular database. For example, you may want to only run a migration on a particular database.
In order to do that you can check the database connection’s alias inside a RunPython
operation by looking at the schema_editor.connection.alias
attribute:
from django.db import migrations
def forwards(apps, schema_editor):
if schema_editor.connection.alias != 'default':
return
# Your migration code goes here
class Migration(migrations.Migration):
dependencies = [
# Dependencies to other migrations
]
operations = [
migrations.RunPython(forwards),
]
You can also provide hints that will be passed to the allow_migrate()
method of database routers as **hints
:
class MyRouter:
def allow_migrate(self, db, app_label, model_name=None, **hints):
if 'target_db' in hints:
return db == hints['target_db']
return True
Then, to leverage this in your migrations, do the following:
from django.db import migrations
def forwards(apps, schema_editor):
# Your migration code goes here
...
class Migration(migrations.Migration):
dependencies = [
# Dependencies to other migrations
]
operations = [
migrations.RunPython(forwards, hints={'target_db': 'default'}),
]
If your RunPython
or RunSQL
operation only affects one model, it’s good practice to pass model_name
as a hint to make it as transparent as possible to the router. This is especially important for reusable and third-party apps.
Migrations that add unique fields
Applying a “plain” migration that adds a unique non-nullable field to a table with existing rows will raise an error because the value used to populate existing rows is generated only once, thus breaking the unique constraint.
Therefore, the following steps should be taken. In this example, we’ll add a non-nullable UUIDField
with a default value. Modify the respective field according to your needs.
Add the field on your model with
default=uuid.uuid4
andunique=True
arguments (choose an appropriate default for the type of the field you’re adding).Run the
makemigrations
command. This should generate a migration with anAddField
operation.Generate two empty migration files for the same app by running
makemigrations myapp --empty
twice. We’ve renamed the migration files to give them meaningful names in the examples below.Copy the
AddField
operation from the auto-generated migration (the first of the three new files) to the last migration, changeAddField
toAlterField
, and add imports ofuuid
andmodels
. For example:# Generated by Django A.B on YYYY-MM-DD HH:MM
from django.db import migrations, models
import uuid
class Migration(migrations.Migration):
dependencies = [
('myapp', '0005_populate_uuid_values'),
]
operations = [
migrations.AlterField(
model_name='mymodel',
name='uuid',
field=models.UUIDField(default=uuid.uuid4, unique=True),
),
]
Edit the first migration file. The generated migration class should look similar to this:
class Migration(migrations.Migration):
dependencies = [
('myapp', '0003_auto_20150129_1705'),
]
operations = [
migrations.AddField(
model_name='mymodel',
name='uuid',
field=models.UUIDField(default=uuid.uuid4, unique=True),
),
]
Change
unique=True
tonull=True
– this will create the intermediary null field and defer creating the unique constraint until we’ve populated unique values on all the rows.In the first empty migration file, add a
RunPython
orRunSQL
operation to generate a unique value (UUID in the example) for each existing row. Also add an import ofuuid
. For example:# Generated by Django A.B on YYYY-MM-DD HH:MM
from django.db import migrations
import uuid
def gen_uuid(apps, schema_editor):
MyModel = apps.get_model('myapp', 'MyModel')
for row in MyModel.objects.all():
row.uuid = uuid.uuid4()
row.save(update_fields=['uuid'])
class Migration(migrations.Migration):
dependencies = [
('myapp', '0004_add_uuid_field'),
]
operations = [
# omit reverse_code=... if you don't want the migration to be reversible.
migrations.RunPython(gen_uuid, reverse_code=migrations.RunPython.noop),
]
Now you can apply the migrations as usual with the
migrate
command.Note there is a race condition if you allow objects to be created while this migration is running. Objects created after the
AddField
and beforeRunPython
will have their originaluuid
’s overwritten.
Non-atomic migrations
On databases that support DDL transactions (SQLite and PostgreSQL), migrations will run inside a transaction by default. For use cases such as performing data migrations on large tables, you may want to prevent a migration from running in a transaction by setting the atomic
attribute to False
:
from django.db import migrations
class Migration(migrations.Migration):
atomic = False
Within such a migration, all operations are run without a transaction. It’s possible to execute parts of the migration inside a transaction using atomic()
or by passing atomic=True
to RunPython
.
Here’s an example of a non-atomic data migration that updates a large table in smaller batches:
import uuid
from django.db import migrations, transaction
def gen_uuid(apps, schema_editor):
MyModel = apps.get_model('myapp', 'MyModel')
while MyModel.objects.filter(uuid__isnull=True).exists():
with transaction.atomic():
for row in MyModel.objects.filter(uuid__isnull=True)[:1000]:
row.uuid = uuid.uuid4()
row.save()
class Migration(migrations.Migration):
atomic = False
operations = [
migrations.RunPython(gen_uuid),
]
The atomic
attribute doesn’t have an effect on databases that don’t support DDL transactions (e.g. MySQL, Oracle). (MySQL’s atomic DDL statement support refers to individual statements rather than multiple statements wrapped in a transaction that can be rolled back.)
Controlling the order of migrations
Django determines the order in which migrations should be applied not by the filename of each migration, but by building a graph using two properties on the Migration
class: dependencies
and run_before
.
If you’ve used the makemigrations
command you’ve probably already seen dependencies
in action because auto-created migrations have this defined as part of their creation process.
The dependencies
property is declared like this:
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
('myapp', '0123_the_previous_migration'),
]
Usually this will be enough, but from time to time you may need to ensure that your migration runs before other migrations. This is useful, for example, to make third-party apps’ migrations run after your AUTH_USER_MODEL
replacement.
To achieve this, place all migrations that should depend on yours in the run_before
attribute on your Migration
class:
class Migration(migrations.Migration):
...
run_before = [
('third_party_app', '0001_do_awesome'),
]
Prefer using dependencies
over run_before
when possible. You should only use run_before
if it is undesirable or impractical to specify dependencies
in the migration which you want to run after the one you are writing.
Migrating data between third-party apps
You can use a data migration to move data from one third-party application to another.
If you plan to remove the old app later, you’ll need to set the dependencies
property based on whether or not the old app is installed. Otherwise, you’ll have missing dependencies once you uninstall the old app. Similarly, you’ll need to catch LookupError
in the apps.get_model()
call that retrieves models from the old app. This approach allows you to deploy your project anywhere without first installing and then uninstalling the old app.
Here’s a sample migration:
myapp/migrations/0124_move_old_app_to_new_app.py
from django.apps import apps as global_apps
from django.db import migrations
def forwards(apps, schema_editor):
try:
OldModel = apps.get_model('old_app', 'OldModel')
except LookupError:
# The old app isn't installed.
return
NewModel = apps.get_model('new_app', 'NewModel')
NewModel.objects.bulk_create(
NewModel(new_attribute=old_object.old_attribute)
for old_object in OldModel.objects.all()
)
class Migration(migrations.Migration):
operations = [
migrations.RunPython(forwards, migrations.RunPython.noop),
]
dependencies = [
('myapp', '0123_the_previous_migration'),
('new_app', '0001_initial'),
]
if global_apps.is_installed('old_app'):
dependencies.append(('old_app', '0001_initial'))
Also consider what you want to happen when the migration is unapplied. You could either do nothing (as in the example above) or remove some or all of the data from the new application. Adjust the second argument of the RunPython
operation accordingly.
Changing a ManyToManyField
to use a through
model
If you change a ManyToManyField
to use a through
model, the default migration will delete the existing table and create a new one, losing the existing relations. To avoid this, you can use SeparateDatabaseAndState
to rename the existing table to the new table name whilst telling the migration autodetector that the new model has been created. You can check the existing table name through sqlmigrate
or dbshell
. You can check the new table name with the through model’s _meta.db_table
property. Your new through
model should use the same names for the ForeignKey
s as Django did. Also if it needs any extra fields, they should be added in operations after SeparateDatabaseAndState
.
For example, if we had a Book
model with a ManyToManyField
linking to Author
, we could add a through model AuthorBook
with a new field is_primary
, like so:
from django.db import migrations, models
import django.db.models.deletion
class Migration(migrations.Migration):
dependencies = [
('core', '0001_initial'),
]
operations = [
migrations.SeparateDatabaseAndState(
database_operations=[
# Old table name from checking with sqlmigrate, new table
# name from AuthorBook._meta.db_table.
migrations.RunSQL(
sql='ALTER TABLE core_book_authors RENAME TO core_authorbook',
reverse_sql='ALTER TABLE core_authorbook RENAME TO core_book_authors',
),
],
state_operations=[
migrations.CreateModel(
name='AuthorBook',
fields=[
(
'id',
models.AutoField(
auto_created=True,
primary_key=True,
serialize=False,
verbose_name='ID',
),
),
(
'author',
models.ForeignKey(
on_delete=django.db.models.deletion.DO_NOTHING,
to='core.Author',
),
),
(
'book',
models.ForeignKey(
on_delete=django.db.models.deletion.DO_NOTHING,
to='core.Book',
),
),
],
),
migrations.AlterField(
model_name='book',
name='authors',
field=models.ManyToManyField(
to='core.Author',
through='core.AuthorBook',
),
),
],
),
migrations.AddField(
model_name='authorbook',
name='is_primary',
field=models.BooleanField(default=False),
),
]
Changing an unmanaged model to managed
If you want to change an unmanaged model (managed=False
) to managed, you must remove managed=False
and generate a migration before making other schema-related changes to the model, since schema changes that appear in the migration that contains the operation to change Meta.managed
may not be applied.