
We have a very large (10 million+) row table stored in MySQL using the InnoDB engine. Column 'x' is defined as 'smallint(5) unsigned not null'.

Requirements have changed since the original design a few years ago, and column 'x' now needs a datatype of at least 'int unsigned not null'.

We are allowed a "short" downtime to deploy the application (less than 5 mins) so any database change would need to fit within this time window if it requires the database table to be made temporarily unavailable in any way (e.g. full table lock). If the change can be done "online" then we can probably accept degraded performance for a longer period.

Has anyone here had experience of altering a column type in MySQL/InnoDB on a very large table? If so, did it go OK, and roughly how long did it take? (I realise this is hardware dependent; I am just trying to understand whether what we are asking to do in the time window is even vaguely possible.)

Thanks in advance,

Scruffers
  • How long did ALTER TABLE take when you applied it to a similarly sized table on production-grade hardware in your test lab? – MarkR May 31 '11 at 19:18
  • @MarkR - in progress, but it may take a while for the environment to be set up as it's out of my team's control. Hence trying to get an idea of feasibility in the meantime... – Scruffers May 31 '11 at 19:48

4 Answers


Here's a recipe I've used (a SQL sketch of the key steps follows the list).

Where you have an old column, a:

  1. Create a new column, b
  2. Create triggers to update b on insert/update of a
  3. Backfill: update b = a
  4. Drop all foreign key relations referencing a
  5. Create foreign key relations for b
  6. Update the code to use column b instead of a (deploy)
  7. Drop the triggers on column a
  8. Drop column a

Repeat all of the steps if you must change the column name back to a.
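A minimal SQL sketch of steps 1-3 and 7-8, with all names as placeholders (big_table for the table, the a/b columns from the list; the int unsigned definition comes from the question):

ALTER TABLE big_table ADD COLUMN b INT UNSIGNED NOT NULL DEFAULT 0;

-- keep b in step with a for rows written while the backfill runs
CREATE TRIGGER big_table_bi BEFORE INSERT ON big_table
  FOR EACH ROW SET NEW.b = NEW.a;
CREATE TRIGGER big_table_bu BEFORE UPDATE ON big_table
  FOR EACH ROW SET NEW.b = NEW.a;

-- backfill existing rows (consider batching by primary key range)
UPDATE big_table SET b = a;

-- later, after the application has switched to b (steps 4-6):
DROP TRIGGER big_table_bi;
DROP TRIGGER big_table_bu;
ALTER TABLE big_table DROP COLUMN a;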

David Chan

You could do it online by creating a new table as a clone of the original's layout.

CREATE TABLE newtable LIKE oldtable;

Next, alter your new table's columns to meet the new spec

ALTER TABLE newtable ... ;

Once that's done, copy the data.

INSERT INTO newtable SELECT * FROM oldtable;

Finally, use your few minutes of downtime to rename your old table to something else, then give your new table the old table's name.
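The swap itself can be done atomically with a single statement; for example (the _retired name is just a placeholder):

RENAME TABLE oldtable TO oldtable_retired, newtable TO oldtable;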

Of course, you need to have enough storage available for two copies of your table. Also, you might want to disable the indexes on the new table while copying to reduce stress on the server.

Once you're confident that all the data in the old table has been copied to the new one without any loss, you can finally delete the old table entirely.
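For example, a quick sanity check using the placeholder names above before anything is dropped (the counts will only match exactly once nothing is writing to the retired table any more):

SELECT (SELECT COUNT(*) FROM oldtable_retired) AS old_rows,
       (SELECT COUNT(*) FROM oldtable) AS new_rows;
DROP TABLE oldtable_retired;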

EDIT:

Actually, a better approach might be to just add a new column to your existing table that meets the new requirements, copy the values from the old column to the new column, drop the old column, and rename the new column to the old column's name. It would certainly put lower demands on your storage.
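A rough sketch of that variant, reusing the question's column x with a hypothetical table name big_table and temporary column x_new:

ALTER TABLE big_table ADD COLUMN x_new INT UNSIGNED NOT NULL DEFAULT 0;
UPDATE big_table SET x_new = x;
ALTER TABLE big_table DROP COLUMN x;
ALTER TABLE big_table CHANGE x_new x INT UNSIGNED NOT NULL;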

Also, if you have any FK constraints that depend on the column in question, you will need to remove them before starting and reinstate them on the new column once you're done.

GordonM
  • And what about the data that gets added to the table / modified during the copy operation? – fvu May 31 '11 at 19:08
  • Thanks, but I share the concern of @fvu, as this table is updated by every transaction hitting the app servers (i.e. many, many times a minute) – Scruffers May 31 '11 at 19:44
  • Okay, that's a good point, didn't think of that. I suppose you could implement a trigger to keep the two columns / tables synched until you're ready to switch over. – GordonM Jun 01 '11 at 05:34

What kind of high-availability solution do you currently have in place?

I ask because 5 minutes is not enough downtime to allow a reboot for normal tasks such as OS updates.

You must therefore have some kind of high-availability solution in place to allow these regular updates (I assume your operations team continues to apply patches from time to time).

It should be possible to use such a system to do this.

However, ALTER TABLE allows the table to remain available for read operations throughout its run, and only blocks reads for a short time at the end (much less than five minutes on most systems).

So the question to ask is: how long can you reasonably block writes for?

Also, 10M rows is not a large table by any stretch of the imagination. It probably fits in RAM, and hence will be very quick to alter.
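For reference, the straight ALTER under discussion would look something like this (the table name is a placeholder); as suggested in the comments, timing it against a similarly sized copy is the only reliable way to gauge the cost on your hardware:

ALTER TABLE big_table MODIFY x INT UNSIGNED NOT NULL;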

MarkR
  • For HA: two stateless app servers behind a load balancer mean that they can be taken down independently for OS patching etc. However, both app servers share the same DB for state, which is what I am concerned about: we can't block writes to this table at all when the app servers are up, and if they are down, that is where the 5 mins (ish) window comes in. – Scruffers May 31 '11 at 19:40
  • On size: good to know 10M is not "big"; we're conducting some performance tests on this, so it will be interesting to see if it turns out to be a non-issue... – Scruffers May 31 '11 at 19:42

This approach is pretty quick:

Query OK, 10000000 rows affected (6 min 0.14 sec)

See: mysql slow query

Jon Black