6

I've got a huge InnoDB table(>500 millions rows) which I'd like to partition by hash in order to decrease the index size. I'd like to achieve this with a minimal downtime(e.g 10 minutes is acceptable), what are the possible options?

I was thinking about something as follows:

  1. create the new partitioned table
  2. insert into this new table all the data from the old one using "insert ... select ..."
  3. make server unavailable for clients
  4. somehow sync changes which happened to the old table during step 2 with the new table
  5. replace the old table with the new one
  6. make server available for clients

The main question is what tool can be used in the step 4. The problem is that during step 2 there can be lots of changes to the original table: new inserts, updates, deletes - the sync tool should take all of this into account...

Another possible way, I believe, is:

  1. setup a replicating slave server
  2. sync this slave server with master
  3. switch master/slave roles and re-configure all clients to connect to the new master
  4. alter table on the previous master
  5. wait for master/slave synchronization
  6. switch master/slave roles again, re-configure all clients

Which one would you recommend?

abatishchev
  • 98,240
  • 88
  • 296
  • 433
pachanga
  • 3,003
  • 4
  • 30
  • 45
  • both are good for this purpose. I, myself, follow first one, but we don't make site unavailable. The purpose of copying of a table is to stop querying it. – Nemoden Jul 26 '11 at 05:26
  • sometimes we even copy table with limits, e.g. `insert into table2 select * from table1 limit 0,100000` etc. – Nemoden Jul 26 '11 at 05:27
  • So you basically make the original table read-only, right? – pachanga Jul 26 '11 at 05:28
  • Both ways can work. If you have trouble finding out the changed data which have to be synced before you switch, then you might have less headache when using master/slave replica. That way you let MySQL take care of this. – Bjoern Jul 26 '11 at 05:29
  • right. And then I perform `sync - rename_first - rename_copy` in one transaction – Nemoden Jul 26 '11 at 05:30
  • @Nemoden, making the original table read-only won't work for me.... – pachanga Jul 26 '11 at 05:33
  • I'm sorry for a hurry answer to your question. We don't make the original read-only. We cope with `INSERTS` into the original table by copying it to temporary by parts (basically, we lock just a part of records at a time since the table is InnoDB, we can afford it). Even if the engine is `MyISAM`, and copying of one part takes several seconds, we can go with it, but we do it when loads are minimal. Once we've copied it, we perform `alterings` we need on copied table, so we are sure the original is safe and users still can access it. – Nemoden Jul 26 '11 at 05:39
  • After all, we make changes made to original table to copied. If you are sure that you don't delete from this table, you can select max(PK) from copied table (let's call it `MAX_ID`) and copy from original table to copied table `WHERE id > MAX_ID`, thus we assured that added records are inserted into our temporary table. – Nemoden Jul 26 '11 at 05:41
  • @Nemoden, thanks but it all sounds quite complicated to me :) – pachanga Jul 26 '11 at 06:14
  • @Nemoden, вот уж правда ;) still I think I'd go the replication route ;) – pachanga Jul 26 '11 at 06:28
  • Would work quite well too, I think. I didn't try it, but I think you even gave me an approach that could be more simple and reliable. – Nemoden Jul 26 '11 at 06:31
  • Check out [facebook's osc](https://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932) – Maxim Krizhanovsky Jul 26 '11 at 08:44
  • @Darhazer, thanks, I'll have a look – pachanga Jul 26 '11 at 11:09

1 Answers1

1

I would go with master/slave replication. If the master and slave can be on the same subnet, I would also add a new IP to the master, change the clients to point to the new IP. Then when you are about to switch to the slave, just:

  1. stop mysql on the master

  2. ifconfig down the extra IP on the master

  3. ifconfig up the extra IP on the new master

Clients will just connect to the new master without any client reconfig. Then you do the same thing when you switch back to the original master (if you switch back).

I recommend that a slave always be equivalent hardware to its master so that when it takes over for the master you don't find out it's so much slower that it can't keep up and your whole system fails. If you do that, then you need only switch once (from the current master to the new master).

Wayne Walker
  • 2,316
  • 3
  • 23
  • 25