2

I have a MySQL DB with multiple tables that each have almost 2M records. When I run a migration that updates those tables, it's very slow. I tried beefing up the server to 64 CPU and 240GB RAM (probably really unnecessary) but it's still taking too long.

Is there any way to speed up the migrations?

EDIT: As pointed out by @iklinac, I saw this post about creating a new temporary table and then migrating the data instead from the old table to this new table. Is there a "Django way" to do this?

Jude Maranga
  • 865
  • 2
  • 9
  • 27
  • The question is, *how* do you update these tables. Without any knowledge what migrations you do, there is not much one can do. – Willem Van Onsem May 08 '20 at 16:25
  • @WillemVanOnsem What do you mean by "without any knowledge what migrations you do?" Do you mean like I don't know what kind of updates the migrations will do? – Jude Maranga May 08 '20 at 16:28
  • no, what I mean is that you do *not* share the migration. If you for example perform a data migration through looping, then it will indeed take years, but one can boost that with `.update(..)`. But you simply say "I have a migration", so we can not offer much help, with this specific migration. – Willem Van Onsem May 08 '20 at 16:30
  • @WillemVanOnsem ooh I see. The migrations I have are all just autogenerated migrations from Django from `manage.py makemigrations` which adds and alter fields of the tables. I didn't do any manual updates to the migration files. Does that help? – Jude Maranga May 08 '20 at 16:32
  • alter field migrations just do altering query on database, if you are adding or editing fields it is common for database to take long time as it needs to rewrite all rows. These operations are I/O bound mostly – iklinac May 08 '20 at 16:35
  • @iklinac I was thinking that if you could convert the Django migrations to native SQL commands and run it directly, it'll be quicker? Is that possible and would it indeed be quicker if there was a way to do that? – Jude Maranga May 08 '20 at 16:37
  • 1
    @JudeMaranga: the migrations are transformed to SQL queries. You inspect these with `manage.py sqlmigrate appname migrationname`. It will print the SQL queries it will do on the database. – Willem Van Onsem May 08 '20 at 16:47
  • As my comment above implies slowness is not on Django as it just generates SQL statement for altering table ( in your words native SQL). Database does a lot of I/O operations to accommodate change when there is a lot data involved. – iklinac May 08 '20 at 16:48
  • Does this answer your question? [ALTER TABLE ADD COLUMN takes a long time](https://stackoverflow.com/questions/7599519/alter-table-add-column-takes-a-long-time) – iklinac May 08 '20 at 16:49
  • @iklinac But why don't I see any huge spikes in the Disk I/O of my server? – Jude Maranga May 08 '20 at 16:50
  • We still don't know exact operation you are doing so exactly what database does is not what we can determine, also please don't edit questions anymore to ask different question in them, rather ask new question – iklinac May 08 '20 at 17:07
  • Additional information request. Any SSD or NVME devices on MySQL Host server? Post on pastebin.com and share the links. From your SSH login root, Text results of: B) SHOW GLOBAL STATUS; after minimum 24 hours UPTIME C) SHOW GLOBAL VARIABLES; D) SHOW FULL PROCESSLIST; E) complete MySQLTuner report AND Optional very helpful information, if available includes - htop OR top for most active apps, ulimit -a for a Linux/Unix list of limits, iostat -xm 5 3 for IOPS by device and core/cpu count, for server workload tuning analysis to provide suggestions. – Wilson Hauck May 09 '20 at 22:28

1 Answers1

3

You could run SQL queries in Django migrations by writing migrations or editing migration file yourself using RunSQL operations

class RunSQL(sql, reverse_sql=None, state_operations=None, hints=None, elidable=False)

Allows running of arbitrary SQL on the database - useful for more advanced features of database backends that Django doesn’t support directly.

With state_operations argument

state_operations argument allows you to supply operations that are equivalent to the SQL in terms of project state. For example, if you are manually creating a column, you should pass in a list containing an AddField operation here so that the autodetector still has an up-to-date state of the model. If you don’t, when you next run makemigrations, it won’t see any operation that adds that field and so will try to run it again. For example:

migrations.RunSQL(
    "ALTER TABLE musician ADD COLUMN name varchar(255) NOT NULL;",
    state_operations=[
        migrations.AddField(
            'musician',
            'name',
            models.CharField(max_length=255),
        ),
    ],
)
iklinac
  • 14,944
  • 4
  • 28
  • 30