I have a Django project with a long, long, history. Its accumulated far more migrations
A word of caution:
As I write this I think... "I really hope there isn't some easier way that I just couldn't find".
Django has a nifty auto-managed
models.ManyToManyField table that does everything automatically for you. This includes ids, uniqueness, and probably other things. You never have to think about it. This is quite handy but also makes things a little trickier when you want to add some additional data to that table at a later date.
Recently I had exactly that issue, and not re-using this join table would have meant making another one exactly like it and maintaining the data twice. Not ideal.
Read about custom join tables in Django's docs. There are some drawbacks to be aware of, but I think its straightforward enough.
An overview of the 4 migrations needed to make this happen:
- Create the new
Model(to use as the new join 'through')
- Migrate the data from the old table to the new one
- Drop the old many to many from your model
- Add the new ManyToMany field back, with the same name
Simplified Base Models
We are modeling an app called
university, which has the concept of a
Building which can have many
Departments. Of course, one
Department can be in multiple
Buildings as well.
class Building(models.Model): name = CharField(max_length=50) departments = ManyToManyField(Product) class Department(models.Model): name = CharField(max_length=50) budget = IntegerField()
This results in a database table joining the two Models that looks like this (not real sql):
CREATE TABLE university_building_department ( id int not null primary key, building_id int not null fk('university_building_id'), department_id int not null fk('university_department_id') ) WITH CONSTRAINT ( unique over ['building_id', 'department_id'] )
1. Creating the new model
For the new model we are going to add some data to the
university_building_department so we can track who added it. You would have more interesting data!
class BuildingDepartment(models.Model): building = ForeignKey(Building, on_delete=models.CASCADE) department = ForeignKey(Department, on_delete=models.CASCADE) added_by = CharField(max_length=50, default='unknown') class Meta: # i like to specify my m-2-m join tables this way, with _x_ # note, I kept the app name. not required, but consistent table_name = "university_building_x_department" unique_together = [ ['building', 'department'] ]
Now that we have a new model, we need to add a migration for it. Do that by running:
(venv) [~/myapp] ./manage.py makemigrations Migrations for 'university': 0008_add_table_building_department.py: - Add model BuildingDepartment - Alter unique_together for buildingdepartment (1 constraint(s))
2. Migrating the data from the old table into the new one
I find it easiest to migrate the data using a
migrations.RunSQL rather than trying to futz with the
schema_editor. If you prefer to go that way, implement this in
RunPython. Of course, this is specific to your database, but I think this is pure ANSI SQL from probably before my time and it shouldn't need modification.
(venv) [~/myapp] ./manage.py makemigrations university --empty Migrations for 'university': 0009_auto_20160226_0919.py: (venv) [~/myapp] mv \ university/migrations/0009_auto_20160226_0919.py \ university/migrations/0009_migrate_join_table_data.py
Now that we have the (sensibly renamed) join table migration, open it up and change the contents to look somewhat like this migration. The point is to manually move the data from one table to the other, so we need the default values.
migrate_join_table_data = """ INSERT INTO university_building_x_department (building_id,department_id,added_by) SELECT building_id, department_id, 'unknown' FROM university_building_department """ class Migration(migrations.Migration): dependencies = [ ('university', '0008_add_table_building_department'), ] operations = [ migrations.RunSQL(migrate_join_table_data) ]
3. Remove the old ManyToManyField
Now we need to drop the old many-to-many field.
- If we did this first then Django would automatically delete the table we need for our source data in #2.
- We can't modify an automatic
ManyToManyfield. If you try you will get an error during
Simply update the Building class by commenting out the M2M field:
class Building(models.Model): name = CharField(max_length=50) # departments = ManyToManyField(Product)
(venv) [~/myapp] ./manage.py makemigrations Migrations for 'university': 0010_remove_building_departments.py: - Remove field departments from building
4. Add the many-to-may field back again, with 'to='
class Building(models.Model): name = CharField(max_length=50) departments = ManyToManyField(through='BuildingDepartment', to=Department)
And make the migrations!
(venv) [~/myapp] ./manage.py makemigrations Migrations for 'university': 0011_add_building_departments.py: - Add field departments to building
(venv)[~/myapp]: ./manage.py migrate Operations to perform: Synchronize unmigrated apps: django_fsm, corsheaders, ... Synchronizing apps without migrations: ... Running migrations: Rendering model states... DONE Applying university.0008_add_table_building_department... OK Applying university.0009_migrate_join_table_data... OK Applying university.0010_remove_building_departments... OK Applying university.0011_add_building_departments... OK
5. Using the new join
Because we are using a custom
through table, we can't use the default way of adding new records to the set. We need to do it manually. This means changes when using django rest framework, etc. This makes sense: we now have extra data, and we need to give values for that data.
with transaction.atomic(): stadium = Building.objects.create(name="Steve Holt Stadium") sports = Department.objects.create(name="Sports") bd = BuildingDepartment.objects.create( building = stadium, department = sports, added_by = 'george m.' )
Accessing the joined data
Now that we've created it, the standard methods like
.departments work as you would expect. However, there is another
set added to both sides:
b = Building.objects.get(name="Steve Holt Hall") print(b.departments.all()) > [Department("Sports"), ] print(b.buildingdepartment_set.all()) > [BuildingDepartment("Steve Holt Stadium", "Sports", "george m.")]
_set lets us get the actual joined records, where the normal
ManyToMany gets the actual related departments. This is quite important, since we definitely want to see the data that we added.