0

What is the best practice to update 2 million rows data in MySQL? As update by primary id, I have to update 1 by 1. It is so slow. Like below

UPDATE table SET col1=value1 WHERE id=1;
UPDATE table SET col1=value2 WHERE id=2;
UPDATE table SET col1=value3 WHERE id=3;

UPDATE table SET col1=valueN WHERE id=N;

2 Answers2

0

Assuming table is innodb (look at SHOW CREATE TABLE table output) one of the reasons its slow is an insufficient innodb_buffer_pool_size.

I also assume its in the default auto_commit=1 mode meaning each SQL line is a single transaction.

Its generally best to avoid processes that need to change every row in a table.

You could split this into a number of threads doing updates and that should get though the list quicker. If the table is MyISAM it won't scale this way.

danblack
  • 12,130
  • 2
  • 22
  • 41
0

Good way to update many rows in one query with INSERT statement with condition ON DUPLICATE KEY UPDATE. The statement will update the old row if there is a duplicate value in a UNIQUE index or PRIMARY KEY. See documentation.

INSERT INTO table (id, col1, col2, col3) 
VALUES (%s, %s, %s, %s)
ON DUPLICATE KEY UPDATE 
    col1 = VALUES(col1), 
    col2 = VALUES(col2), 
    col3 = VALUES(col3); 

To update really big amount of date like 2 millions rows try to separate you data to several queries by several thousand rows.

Also notice optimization tips: Optimizing INSERT Statements and Bulk Data Loading for InnoDB Tables

Peter
  • 1,223
  • 3
  • 11
  • 22
  • This seems like a good solution, but if you have an auto_increment column (id) you might quickly reach its upper limit, because the storage engine increases the auto_increment even on "ON DUPLICATE" value found - "The storage engine must increment the AUTO_INCREMENT value before inserting a row. It does not know if the insert will fail yet at that point. " - more here https://stackoverflow.com/questions/7087869/mysql-insert-on-duplicate-update-adds-one-to-the-autoincrement – MartinG Mar 16 '22 at 00:33