2

My original problem was, given a db table with 60M rows I need to convert a field type from boolean to integer field. I thought of creating a custom django migration for this (please do let me know if you have a better approach than this) which looks like this-

def make_changes(apps, schema_editor):
    vcs_model = apps.get_model('iot_app', 'AbstractVCSCompartmentData')
    vcs_model.objects.select_related('vcsdata').all().update(charging_status_v2=F('charging_status'))
    vcs_model.objects.select_related('vcsdata').all().update(charging_status_backup=F('charging_status'))  # backup




class Migration(migrations.Migration):
    dependencies = [
        ('iot_app', '0030_auto_20220225_1027.py'),
    ]

    operations = [

        migrations.AddField(  # backup field
            model_name='AbstractVCSCompartmentData',
            name='charging_status_backup',
            field=models.PositiveIntegerField(blank=True, null=True),
        ),


        migrations.AddField(
            model_name='AbstractVCSCompartmentData',
            name='charging_status_v2',
            field=models.PositiveIntegerField(blank=True, null=True),
        ),

        migrations.RunPython(make_changes),

        migrations.RemoveField(
            model_name='AbstractVCSCompartmentData',
            name='charging_status',
        ),
        migrations.RenameField(
            model_name='AbstractVCSCompartmentData',
            old_name='charging_status_v2',
            new_name='charging_status',
        ),
    ]


I want to unroll all the changes i.e., making my custom migration reversible. I have gone through RunPython doc. But I am confused as in how can i perform addition of a new field in my reverse_code() function.

The idea of creating a backup field is to reinstate the db to its previous state.

Is this is the correct approach, given I have 60M rows in that table? wouldn't it lock the table for the time being? I want to do this as efficient as possible

raj-kapil
  • 177
  • 1
  • 15
  • What behaviour are you expecting when you reverse this migration? Converting a boolean to an integer is trivial, but the other way around not so much. What should happen with values greater than 1? – Nico Griffioen Apr 13 '22 at 09:49
  • ```field1``` is the original field and the idea is - Step 1: Create two new fields, ```field_v2``` and ```field_backup``` which would be an integer field and a boolean field Step 2: fill the values of ```field1``` into ```field_v2``` and ```field_backup``` Step 3: Now delete the ```field1``` Step 4: rename the ```field_v2``` to ```field``` So now If I need to revert this migrations, i got to Delete the ```field_v2``` rename ```field_backup``` to ```field``` – raj-kapil Apr 13 '22 at 10:26
  • in this case, you just copy the data from `field_backup` to `field`. Because `field` is already there. You just need to provide a backwards function to your `RunPython` statement – FabianClemenz Apr 13 '22 at 10:56

1 Answers1

2

i edited my answer after your comments - at the moment when RunPython is executed in reversion step, charging_status has been renamed to charging_status_v2 and the old charging_status field is present. So you can copy directly from backup to charging_status

def make_changes(apps, schema_editor):
    vcs_model = apps.get_model('iot_app', 'AbstractVCSCompartmentData')
    vcs_model.objects.select_related('vcsdata').all().update(charging_status_v2=F('charging_status'))
    vcs_model.objects.select_related('vcsdata').all().update(charging_status_backup=F('charging_status'))  # backup


def backwards(apps, schema_editor):
    vcs_model = apps.get_model('iot_app', 'AbstractVCSCompartmentData')
    vcs_model.objects.select_related('vcsdata').all().update(charging_status=F('charging_status_backup'))



class Migration(migrations.Migration):
    dependencies = [
        ('iot_app', '0030_auto_20220225_1027.py'),
    ]

    operations = [

        migrations.AddField(  # backup field
            model_name='AbstractVCSCompartmentData',
            name='charging_status_backup',
            field=models.PositiveIntegerField(blank=True, null=True),
        ),


        migrations.AddField(
            model_name='AbstractVCSCompartmentData',
            name='charging_status_v2',
            field=models.PositiveIntegerField(blank=True, null=True),
        ),

        migrations.RunPython(make_changes, backwards),

        migrations.RemoveField(
            model_name='AbstractVCSCompartmentData',
            name='charging_status',
        ),
        migrations.RenameField(
            model_name='AbstractVCSCompartmentData',
            old_name='charging_status_v2',
            new_name='charging_status',
        ),
    ]
FabianClemenz
  • 303
  • 3
  • 8
  • I am concerned about how this will affect the db, will this create any downtime. If I do this will the table be locked during these operation? Could you please explain me a little bit on the db level, how this will play out? – raj-kapil Apr 13 '22 at 11:04
  • 1
    afaik - it will not create any downtime - we run huge migrations - but always have maintenance mode enabled (would always be best i think). I don't have experience with migrating databases during heavy workloads. What you could do is a table lock in every function - something like this: https://stackoverflow.com/questions/19686204/django-orm-and-locking-table – FabianClemenz Apr 13 '22 at 11:16
  • ```vcs_model.objects.select_related('vcsdata').all().update(charging_status_v2=F('charging_status'))```I want to do this without locking the table and the update here will lock the table i think. – raj-kapil Apr 13 '22 at 12:18
  • Then you need to iterate over the objects and set it directly. – FabianClemenz Apr 14 '22 at 13:08