35

Is there any way of performing in bulk a query like INSERT OR UPDATE on the MySQL server?

INSERT IGNORE ...

won't work, because if the field already exists, it will simply ignore it and not insert anything.

REPLACE ...

won't work, because if the field already exists, it will first DELETE it and then INSERT it again, rather than updating it.

INSERT ... ON DUPLICATE KEY UPDATE

will work, but it can't be used in bulk.

So I'd like to know if there's any command like INSERT ... ON DUPLICATE KEY UPDATE that can be issued in bulk (more than one row at the same time).

rid
  • 61,078
  • 31
  • 152
  • 193
  • 1
    Why do you say that INSERT ... ON DUPLICATE KEY UPDATE can’t be issued in bulk? – danorton Feb 28 '12 at 05:28
  • The question could be much clearer. Perhaps what the OP was looking for was a Bulk Update via SQL. That's what the accepted answer is. – Akrikos Apr 23 '12 at 20:47

5 Answers5

80

You can insert/update multiple rows using INSERT ... ON DUPLICATE KEY UPDATE. The documentation has the following example:

INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);

Or am I misunderstanding your question?

user359996
  • 5,533
  • 4
  • 33
  • 24
  • 20
    For anyone reading this, a clearer example is `INSERT INTO table (a,b,c) VALUES (1,2,3),(4,5,6) ON DUPLICATE KEY UPDATE a=VALUES(a), b=VALUES(b), c=VALUES(c); ` so in this way you can update columns like updated_at with `VALUES (1,2,NOW())` using updated_at in place of c for example. – Zack Morris Nov 02 '16 at 19:17
  • hey guys, if the table has a auto_increment id, `INSERT ... ON DUPLICATE KEY UPDATE` will make id be a un-continuously incrementing serial number, how did you fix it? – Zander Wong May 08 '19 at 18:06
  • @ZanderWong I don't think this is a problem with MySQL 5.6. See [this example](http://sqlfiddle.com/#!9/665a516/1) – yyFred Mar 10 '21 at 08:33
  • 1
    @ZanderWong This is still a problem, even in MySQL 8.x -- the "ON DUPLICATE KEY" statement increases the auto-increment value in InnoDB tables, even if a new row is not inserted. See the [MySQL manual](https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html) Unfortunately, there's no real solution for this issue. You need to be aware of it and make sure your auto-increment column is large enough. You can burn through a lot of values very quickly, if you are doing bulk updates using the "ON DUPLICATE KEY" statement. – Jeff Kilbride Jul 09 '21 at 16:07
  • @ZanderWong set innodb_autoinc_lock_mode to 2 instead of the default value 1. MySQL docs: https://dev.mysql.com/doc/refman/5.6/en/optimizing-innodb-bulk-data-loading.html – Tiago B Apr 13 '22 at 20:07
20

One possible way to do this is to create a temporary table, insert the data into that, and then do 1 query with a join to insert the records that don't exist followed by and update to the fields that do exist. The basics would be something like this.

CREATE TABLE MyTable_Temp LIKE MyTable

LOAD DATA INFILE..... INTO MyTable_Temp

UPDATE MyTable INNER JOIN 
MyTable_Temp
ON MyTable.ID=MyTable_Temp.ID
SET MyTable.Col1=MyTable_Temp.Col1, MyTable.Col2=MyTable_Temp.Col2.....

INSERT INTO MyTable(ID,Col1,Col2,...)
SELECT ID,Col1,Col2,... 
FROM MyTable_Temp
LEFT JOIN MyTable 
ON MyTable_Temp.ID = MyTable.ID
WHERE myTable.ID IS NULL

DROP TABLE MyTable_Temp

The syntax may not be exact, but this should give you the basics. Also, I know it's not pretty, but it gets the job done.

Update

I swapped the order of the insert and update, because doing insert first causes all the inserted rows to be updated when the update is called. If you do update first, only the existing records are updated. This should mean a little less work for the server, although the results should be the same.

Kibbee
  • 65,369
  • 27
  • 142
  • 182
  • What if two threads try to update the table simultaneously by incrementing a key's value? It could happen that a key's value won't be updated because it wasn't existent in the UPDATE step and was inserted by another thread before performing the INSERT step. – Ben Aug 02 '12 at 14:16
  • @Ben You're right. This really does depend on the use case, including how the table gets updated, how many update threads are running, and whether or not those update threads will update the same data within the table. If you have multiple threads running, all updating the same data, it might be worth it to lock the target table prior to starting the update, and then unlock once the insert is completed. A lot depends on precisely what you are trying to accomplish, and the method described above works in a lot of instances, where you never update the table at all except from a bulk CSV file. – Kibbee Aug 02 '12 at 14:31
  • 1
    A small enhancement: if you do this inside a transaction, you can change it to `CREATE TEMPORARY TABLE MyTable_Temp like MyTable`, which will be destroyed automatically once you commit the transaction (and is only visible inside this transaction). Otherwise, if you're doing multiple bulk updates on the same table (even ones known for other reasons to be disjoint), you could run into race conditions. – Lucas Wiman Oct 17 '12 at 22:55
  • Be careful with this method since if you are on a shared server chances are good the Query will time out if the size of the tables is moderately large. The left join was not intended to be used like this. – captainspi Jul 07 '13 at 09:02
4

Although this question has been answered correctly already (that MySQL does support this via ON DUPLICATE UPDATE with the expected multiple value set syntax), I'd like to expand on this by providing a demonstration that anyone with MySQL can run:

CREATE SCHEMA IF NOT EXISTS `test`;
DROP TABLE IF EXISTS test.new_table;
CREATE TABLE test.new_table (`Key` int(11) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`Key`)) ENGINE=InnoDB AUTO_INCREMENT=106 DEFAULT CHARSET=utf8;

SELECT * FROM test.new_table;
INSERT INTO test.new_table VALUES (1),(2),(3),(4),(5) ON DUPLICATE KEY UPDATE `Key`=`Key`+100;
SELECT * FROM test.new_table;
INSERT INTO test.new_table VALUES (1),(2),(3),(4),(5) ON DUPLICATE KEY UPDATE `Key`=`Key`+100;
SELECT * FROM test.new_table;

The output is as follows:

Empty set (0.00 sec)

Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

+-----+
| Key |
+-----+
|   1 |
|   2 |
|   3 |
|   4 |
|   5 |
+-----+
5 rows in set (0.00 sec)

Query OK, 10 rows affected (0.00 sec)
Records: 5  Duplicates: 5  Warnings: 0

+-----+
| Key |
+-----+
| 101 |
| 102 |
| 103 |
| 104 |
| 105 |
+-----+
5 rows in set (0.00 sec)
Christopher McGowan
  • 1,351
  • 10
  • 10
0

Try adding an insert trigger that does a pre-flight check and cancels the insert on duplicate key (after updating the existing row).

Not sure it'll scale well for bulk inserts, let alone work for load data infile, but it's the best I can think of. :-)

Denis de Bernardy
  • 75,850
  • 13
  • 131
  • 154
0

If you were using Oracle or Microsoft SQL, you could use the MERGE. However, MySQL does not have a direct correlation to that statement. There is the single-row solution that you mentioned but, as you pointed out, it doesn't do bulk very well. Here is a blog post I found on the difference between Oracle and MySQL and how to do what Oracle does with MERGE in MySQL:

http://blog.mclaughlinsoftware.com/2009/05/25/mysql-merge-gone-awry/

It isn't a pretty solution and it probably isn't as full a solution as you would like, but I believe that is the best there is for a solution.

IAmTimCorey
  • 16,412
  • 5
  • 39
  • 75