I want to write a new migration that converts old, badly structured data into a cleaner form. For example, the old model looked like this:

from django.db import models

class Person(models.Model):
    fullname = models.CharField(max_length=250, null=True)
    information = models.CharField(max_length=350, null=True)

The value of fullname looks something like this:

first_name:George;last_name:Adam Pince Green

where first_name and last_name always appear in the same order. The value of information looks like this:

id_code:0021678913;born_in:Canada;birth_year:1975

or

birth_year:1990;born_in:Portugal;id_code:0219206431

so its keys are not in a fixed order. Now I need to write a migration that splits the fullname and information values into the fields of the new model:

from django.db import models


class Person(models.Model):
    first_name = models.CharField(max_length=30, null=True)
    last_name = models.CharField(max_length=50, null=True)
    id_code = models.CharField(max_length=10, null=True)
    born_in = models.CharField(max_length=30, null=True)
    birth_year = models.PositiveSmallIntegerField(null=True)

2 Answers

You can add the new fields, then use django.db.migrations.operations.RunPython to extract the values from the old fields and save them to the new ones. At the end, delete the old fields. See https://docs.djangoproject.com/en/3.1/ref/migration-operations/#django.db.migrations.operations.RunPython

PolishCoder
  • Thank you for your answer, but my main problem is how to split the values of the old fields and assign them to the new model's fields. – Sajad Saedi Mar 09 '21 at 20:34

For your case, you can write something like this:

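from django.db import migrations, models


def amend_the_data(apps, schema_editor):
    # Use the historical model from the app registry, not a direct import.
    Person = apps.get_model('people', 'Person')

    # iterator() streams rows instead of caching the whole queryset in memory.
    for person in Person.objects.all().iterator():
        # fullname has a fixed order: first_name first, then last_name.
        # Both old fields are nullable, so fall back to an empty string.
        fullname_parts = (person.fullname or '').split(';')
        if len(fullname_parts) == 2:
            person.first_name = fullname_parts[0].partition(':')[2]
            person.last_name = fullname_parts[1].partition(':')[2]

        # information has no fixed order, so collect its pairs in a dictionary.
        info = {}
        for pair in (person.information or '').split(';'):
            key, sep, value = pair.partition(':')
            if sep:
                info[key] = value

        person.id_code = info.get('id_code')
        person.born_in = info.get('born_in')
        # int(None) would raise, so only convert when a value is present.
        birth_year = info.get('birth_year')
        person.birth_year = int(birth_year) if birth_year else None

        person.save()


class Migration(migrations.Migration):
    # This example creates the table from scratch. For a table that already
    # exists, add the new fields with migrations.AddField in a follow-up
    # migration instead of using CreateModel.
    initial = True

    dependencies = [
    ]

    operations = [
        migrations.CreateModel(
            name='Person',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('fullname', models.CharField(max_length=250, null=True)),
                ('information', models.CharField(max_length=350, null=True)),

                ('first_name', models.CharField(max_length=30, null=True)),
                ('last_name', models.CharField(max_length=50, null=True)),
                ('id_code', models.CharField(max_length=10, null=True)),
                ('born_in', models.CharField(max_length=30, null=True)),
                ('birth_year', models.PositiveSmallIntegerField(null=True)),
            ],
        ),

        # The second argument makes the data step reversible as a no-op.
        migrations.RunPython(amend_the_data, migrations.RunPython.noop),
    ]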

As you can see, for the first name and last name we can use indexing to pick out the right strings, because their order is fixed. For information we must build a dictionary, since we don't know the order in which the keys were entered. You could of course use a dictionary for fullname as well.
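To try the splitting logic outside of a migration, here is a small standalone sketch that uses the dictionary approach for both fields. The helper names parse_pairs and split_person are made up for this illustration, and it assumes that keys never repeat and that values contain no semicolons.

```python
def parse_pairs(raw):
    """Turn 'key:value;key:value' into a dict; None or empty input gives {}."""
    pairs = {}
    for item in (raw or "").split(";"):
        key, sep, value = item.partition(":")
        if sep:  # skip fragments that contain no colon
            pairs[key.strip()] = value.strip()
    return pairs


def split_person(fullname, information):
    """Return the new-model field values parsed from the two old strings."""
    data = {**parse_pairs(fullname), **parse_pairs(information)}
    year = data.get("birth_year")
    return {
        "first_name": data.get("first_name"),
        "last_name": data.get("last_name"),
        "id_code": data.get("id_code"),
        "born_in": data.get("born_in"),
        # birth_year is the only non-string field in the new model.
        "birth_year": int(year) if year else None,
    }


print(split_person(
    "first_name:George;last_name:Adam Pince Green",
    "birth_year:1990;born_in:Portugal;id_code:0219206431",
))
```

Because the lookup is by key, the order of the pairs inside information does not matter, and missing keys simply come back as None.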

The important part is the person.save() line. You don't have to treat Django CharField values any differently from ordinary Python strings; just don't forget to call save() on the model instance after you have made your changes.

Also, as you can see, it is better to iterate with QuerySet.iterator(): a plain queryset caches every row it loads, which becomes a memory problem on large tables. For more on when to use iterator(), see the iterator() entry in the Django QuerySet API reference.

Mohammad Alavi