98

I was just trying to add a column called "location" to a table (main_table) in a database. The command I ran was

ALTER TABLE main_table ADD COLUMN location varchar(256);

The main_table contains more than 2,000,000 rows. The command has been running for more than 2 hours and still has not completed.

I tried using mytop to monitor this database's activity and make sure the query was not locked by some other process, but it does not seem to be. Is it supposed to take this long? Actually, I rebooted the machine just before running this command, and it is still running. I am not sure what to do.

Brian Tompsett - 汤莱恩
fanchyna

6 Answers

221

Your ALTER TABLE statement means MySQL has to rewrite every single row of the table to include the new column. Since you have more than 2 million rows, I would definitely expect it to take a significant amount of time, during which your server will likely be mostly IO-bound. You would usually find it more performant to do the following:

CREATE TABLE main_table_new LIKE main_table;                 -- empty copy with the same schema
ALTER TABLE main_table_new ADD COLUMN location VARCHAR(256); -- add the column while the table is empty
INSERT INTO main_table_new SELECT *, NULL FROM main_table;   -- bulk-copy the data, NULL for the new column
RENAME TABLE main_table TO main_table_old, main_table_new TO main_table; -- atomic swap
DROP TABLE main_table_old;

This way you add the column on an empty table, and write the data into a new table that you are sure no one else is reading, without locking as many resources.
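If you know main_table's columns, it is safer to list them explicitly in the copy step instead of relying on `SELECT *, NULL` matching the column order (the names `col1` and `col2` below are placeholders for your actual columns):

```sql
INSERT INTO main_table_new (col1, col2, location)
SELECT col1, col2, NULL
FROM main_table;
```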

Kannika
Romain
  • @Malvolio worth noting: I think it is a MySQL-proprietary SQL extension... Not 100% sure about that though. – Romain Sep 29 '11 at 15:41
  • even if it is, when was the last time anyone used a non-MySQL SQL RDBMS? Friends don't let friends buy Oracle. – Michael Lorton Sep 29 '11 at 17:32
  • Thank you for your reply. I followed your suggestion. But after I executed the INSERT INTO ... SELECT * FROM ...; command for over 4 hours, it was still not finished. I then opened a new terminal and ran SELECT COUNT(*) FROM main_table_new; You know what? It returns 0! It is not writing into the table! Why? – fanchyna Sep 30 '11 at 00:29
  • Could it be because the temp space of this DB is not big enough? – fanchyna Sep 30 '11 at 00:43
  • @fanchyna What's the engine for the table? If it's a transaction-aware one (e.g. InnoDB), `SELECT COUNT(*) FROM main_table_new;` will return 0 until the transaction of the `INSERT` is completed - depending on your transaction isolation (which I believe defaults to `READ COMMITTED`) – Romain Sep 30 '11 at 08:23
  • Now filling in "fields_in_main_table" dynamically would be a great thing for use in a script. – Marki Oct 08 '12 at 15:24
  • How to add a lock (InnoDB and/or MyISAM)? What about adding this: http://stackoverflow.com/a/5616843/318765 – mgutt Feb 22 '13 at 17:11
  • I followed the above steps but no luck... my table has 10+ billion records – ni3.net Oct 17 '15 at 11:20
  • Even this method has to rewrite the entire table though. Am I correct in thinking that the only advantage of this approach is that it allows the current table to be used during the addition of the new column? – Michael Aug 08 '16 at 00:21
  • I recommend not using this solution in real production databases with many write operations. It is clear that the new table could be missing data that arrives in the middle of the swap, and long locks could happen too. – Raul R. Oct 25 '17 at 16:01
  • This won't help you if you have foreign keys referencing the table. – Jonny Oct 31 '17 at 08:17
  • You should drop indexes before the insert command and recreate them just after – np42 Jun 09 '18 at 07:33
  • Will it take constraints, triggers and indexes to the new table as well? – Muhammad Waheed Aug 29 '19 at 10:34
  • Why exactly does adding the column cause the whole server to be IO-bound? – Shardj Jan 29 '20 at 12:53
  • Worked like a charm. I wonder if this could just be the default implementation for this function – shikida Jan 13 '22 at 16:11
40

I think the appropriate answer for this is to use a tool like pt-online-schema-change or gh-ost.

We have done a migration of over 4 billion rows with this; though it can take up to 10 days, there is less than a minute of downtime.

Percona's tool works in a very similar fashion to the answer above:

  • Create a temp table
  • Create triggers on the original table (for inserts, updates, deletes) so that changes are replicated to the temp table
  • Migrate the data in small batches
  • When done, rename the temp table to the original table, and drop the other table
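As a sketch of the steps above (the `D=mydb,t=main_table` DSN is a placeholder for your own database and table; check your Percona Toolkit version's documentation for the full set of options), a pt-online-schema-change run for the question's ALTER might look like:

```shell
# Validate the change first without touching any data
pt-online-schema-change --alter "ADD COLUMN location VARCHAR(256)" \
  D=mydb,t=main_table --dry-run

# Then perform the online migration in batches
pt-online-schema-change --alter "ADD COLUMN location VARCHAR(256)" \
  D=mydb,t=main_table --execute
```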
Pratik Bothra
  • This is the kind of question that shows that the less-voted or non-accepted solution is sometimes the best or only correct one. How many people voted for the other solution, which is incorrect, OMG. – Raul R. Nov 02 '17 at 18:00
  • @RaulR. It depends on the use case. The other solution was plenty "correct" in the use case I had; not everyone is working on a production environment 24/7. – Austin Schmidt Sep 24 '19 at 13:35
  • Beware that pt-online-schema-change might actually lead to blocking the table. If your big table has FKs pointing to it, then you can have blocking time or lose the FKs. You need to choose an `alter-foreign-keys-method`; `rebuild_constraints` will require building each FK anew, which will block the referencing tables. – Gabriel Furstenheim Jan 18 '21 at 13:05
7

You can speed up the process by temporarily turning off unique checks and foreign key checks. You can also choose which algorithm MySQL uses for the ALTER.

If you want the new column to be at the end of the table, use algorithm=instant (available in MySQL 8.0.12 and later):

SET unique_checks = 0;
SET foreign_key_checks = 0;
ALTER TABLE main_table ADD location varchar(256), algorithm=instant;
SET unique_checks = 1;
SET foreign_key_checks = 1;

Otherwise, if you need the column to be in a specific position, use algorithm=inplace:

SET unique_checks = 0;
SET foreign_key_checks = 0;
ALTER TABLE main_table ADD location varchar(256) AFTER othercolumn, algorithm=inplace;
SET unique_checks = 1;
SET foreign_key_checks = 1;

For reference, it took my PC about 2 minutes to alter a table with 20 million rows using the inplace algorithm. If you're using a program like Workbench, you may want to increase the default timeout period in your settings before starting the operation.

If you find that the operation is hanging indefinitely, you may need to look through the process list and kill whatever process has a lock on the table. You can do that using these commands:

SHOW FULL PROCESSLIST;
KILL PROCESS_NUMBER_GOES_HERE;
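On a busy server it can be easier to filter the process list through `information_schema.processlist` instead of scanning the full SHOW output, for example:

```sql
-- Non-idle statements, longest-running first
SELECT id, user, time, state, info
FROM information_schema.processlist
WHERE command <> 'Sleep'
ORDER BY time DESC;
```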
Pikamander2
  • Even with these, and on an empty table, it took forever. I was running it on a mysql-8.0 Docker container. What helped was to restart the database Docker container. – MiroJanosik Mar 26 '23 at 15:44
1

ALTER TABLE takes a long time with big data like in your case, so avoid using it in such situations, and use some code like this instead:

create table new_table as
select main_table.*,
  cast(null as char(256)) as null_location,   -- any column you want that accepts null
  cast('' as char(256)) as not_null_location, -- any column that doesn't accept null
  cast(0 as signed) as not_null_int           -- an int column that doesn't accept null
from main_table;

drop table main_table;
rename table new_table to main_table;
ZORRO_BLANCO
0

Try changing the table's engine from InnoDB to MyISAM, then revert it back to InnoDB:

ALTER TABLE `tablename` ENGINE = MyISAM ; 
ALTER TABLE `tablename` ENGINE = InnoDB;

Then run this (columnname and datatype are placeholders for your own column name and type):

ALTER TABLE `tablename` ADD COLUMN `columnname` datatype NULL;
Abid Ali
-4

DB2 for z/OS does a virtual add of the column instantly and puts the table into advisory-reorg status. Anything that runs before the reorg gets the default value, or NULL if there is no default. When rows are updated, they are expanded; inserts are done expanded. The next reorg expands every remaining unexpanded row and assigns the default value to anything it expands.

Only a real database handles this well: DB2 for z/OS.

  • You may want to consider that switching to a different database will have its own bumps in the road, and that this answer does not help the asker of the original question almost ten years down the line. – BertD Jun 16 '21 at 22:30