21

I am little confused with insert on duplicate update query. I have MySQL table with structure like this:

  • record_id (PRIMARY, UNIQUE)
  • person_id (UNIQUE)
  • some_text
  • some_other_text

I want to update some_text and some_other_text values for person if it's id exists in my table.person or insert new record in this table otherwise. How it can be done if person_id is not PRIMARY?

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
moonvader
  • 19,761
  • 18
  • 67
  • 116

7 Answers7

28

You need a query that check if exists any row with you record_id (or person_id). If exists update it, else insert new row

IF EXISTS (SELECT * FROM table.person WHERE record_id='SomeValue')
    UPDATE table.person 
    SET some_text='new_some_text', some_other_text='some_other_text' 
    WHERE record_id='old_record_id'
ELSE
    INSERT INTO table.person (record_id, person_id, some_text, some_other_text) 
    VALUES ('new_record_id', 'new_person_id', 'new_some_text', 'new_some_other_text')

Another better approach is

UPDATE table.person SET (...) WHERE person_id='SomeValue'
IF ROW_COUNT()=0
    INSERT INTO table.person (...) VALUES (...)
Lorenzo Vincenzi
  • 1,153
  • 1
  • 9
  • 26
  • [@@ROWCOUNT](https://msdn.microsoft.com/en-us/library/ms187316.aspx) or [ROW_COUNT()](https://dev.mysql.com/doc/refman/5.7/en/information-functions.html#function_row-count)? – wchiquito Nov 03 '15 at 21:10
  • @@ROW_COUNT() sorry. @@ROWCOUNT is for SQL Server @wchiquito – Lorenzo Vincenzi Nov 03 '15 at 21:26
  • @fattidare checking if exist is another working solution - it is ok, thank you – moonvader Nov 06 '15 at 07:51
  • Your second approach is an excellent idea. `INSERT ... ON DUPLICATE` **`PRIMARY`** `KEY UPDATE` is truly a missing feature in my experience. – Timo Oct 24 '17 at 10:27
  • 8
    As it turns out, **the conditional query will only work in a stored procedure**. I have found an alternative that works in regular queries and will post a separate answer here. – Timo Oct 26 '17 at 09:34
  • 11
    **Note** that `ROW_COUNT()` will return `0` if you try to update the post with the same values, causing duplicate posts. – Gjermund Dahl Nov 18 '17 at 13:47
  • is it just me or is this missing a `THEN` before the `UPDATE` statement? – blizz Jun 26 '22 at 21:07
12

Your question is very valid. This is a very common requirement. And most people get it wrong, due to what MySQL offers.

  • The requirement: Insert unless the PRIMARY key exists, otherwise update.
  • The common approach: ON DUPLICATE KEY UPDATE
  • The result of that approach, disturbingly: Insert unless the PRIMARY or any UNIQUE key exists, otherwise update!

What can go horribly wrong with ON DUPLICATE KEY UPDATE? You insert a supposedly new record, with a new PRIMARY key value (say a UUID), but you happen to have a duplicate value for its UNIQUE key.

What you want is a proper exception, indicating that you are trying to insert a duplicate into a UNIQUE column.

But what you get is an unwanted UPDATE! MySQL will take the conflicting record and start overwriting its values. If this happens unintentionally, you have mutilated an old record, and any incoming references to the old record are now referencing the new record. And since you probably won't tell the query to update the PRIMARY column, your new UUID is nowhere to be found. If you ever encounter this data, it will probably make no sense and you will have no idea where it came from.

We need a solution to actually insert unless the PRIMARY key exists, otherwise update.

We will use a query that consists of two statements:

  1. Update where the PRIMARY key value matches (affects 0 or 1 rows).
  2. Insert if the PRIMARY key value does not exist (inserts 1 or 0 rows).

This is the query:

UPDATE my_table SET
unique_name = 'one', update_datetime = NOW()
WHERE id = 1;

INSERT INTO my_table
SELECT 1, 'one', NOW()
FROM my_table
WHERE id = 1
HAVING COUNT(*) = 0;

Only one of these queries will have an effect. The UPDATE is easy. As for the INSERT: WHERE id = 1 results in a row if the id exists, or no row if it does not. HAVING COUNT(*) = 0 inverts that, resulting in a row if the id is new, or no row if it already exists.

I have explored other variants of the same idea, such as with a LEFT JOIN and WHERE, but they all looked more convoluted. Improvements are welcome.

Timo
  • 7,992
  • 4
  • 49
  • 67
  • I'm getting deadlocks on insert with this approach using MariaDB, two transactions racing for locks. – jarnoan Apr 16 '20 at 13:29
  • @jarnoan Two questions: (1) Are you submitting this as a single query or multiple? I recommend passing it as one. (2) Are you using a database transaction? If so, with which isolation level? – Timo Apr 16 '20 at 14:01
  • (1) Multiple, one for update and another for insert. (2) Yes, using transaction with default isolation level (that would be repeatable read for innodb, I suppose). I do have another where clause in update statement, and this is a Galera cluster (but writes using the same server instance), so that might cause something. – jarnoan Apr 20 '20 at 05:36
  • @jarnoan Galera cluster: bingo. Sending it as a single query should solve your problem. Sends it to the same node. If you **really** can't: This is a huge subject for a comment but here goes. **Always** set your cluster to have 1 write node. This prevents fail-on-commit (which is hell). Preferably, also enable causal reads (`wsrep_sync_wait`) for _all_ query types, including selects. Doing so enables "causal reads": if you write-then-read, it prevents you from seeing stale data, and ensures you get what you expect. (Because if your database is hard to reason about, you're in trouble.) – Timo Apr 20 '20 at 14:00
11

13.2.5.3 INSERT ... ON DUPLICATE KEY UPDATE Syntax

If you specify ON DUPLICATE KEY UPDATE, and a row is inserted that would cause a duplicate value in a UNIQUE index or PRIMARY KEY, MySQL performs an UPDATE of the old row.

Example:

DELIMITER //

DROP PROCEDURE IF EXISTS `sp_upsert`//
DROP TABLE IF EXISTS `table_test`//

CREATE TABLE `table_test` (
  `record_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `person_id` INT UNSIGNED NOT NULL,
  `some_text` VARCHAR(50),
  `some_other_text` VARCHAR(50),
  UNIQUE KEY `record_id_index` (`record_id`),
  UNIQUE KEY `person_id_index` (`person_id`)
)//

INSERT INTO `table_test`
  (`person_id`, `some_text`, `some_other_text`)
VALUES
  (1, 'AAA', 'XXX'),
  (2, 'BBB', 'YYY'),
  (3, 'CCC', 'ZZZ')//

CREATE PROCEDURE `sp_upsert`(
  `p_person_id` INT UNSIGNED,
  `p_some_text` VARCHAR(50),
  `p_some_other_text` VARCHAR(50)
)
BEGIN
  INSERT INTO `table_test`
    (`person_id`, `some_text`, `some_other_text`)
  VALUES
    (`p_person_id`, `p_some_text`, `p_some_other_text`)
  ON DUPLICATE KEY UPDATE `some_text` = `p_some_text`,
                          `some_other_text` = `p_some_other_text`;
END//

DELIMITER ;

mysql> CALL `sp_upsert`(1, 'update_text_0', 'update_text_1');
Query OK, 2 rows affected (0.00 sec)

mysql> SELECT
    ->   `record_id`,
    ->   `person_id`,
    ->   `some_text`,
    ->   `some_other_text`
    -> FROM
    ->   `table_test`;
+-----------+-----------+---------------+-----------------+
| record_id | person_id | some_text     | some_other_text |
+-----------+-----------+---------------+-----------------+
|         1 |         1 | update_text_0 | update_text_1   |
|         2 |         2 | BBB           | YYY             |
|         3 |         3 | CCC           | ZZZ             |
+-----------+-----------+---------------+-----------------+
3 rows in set (0.00 sec)

mysql> CALL `sp_upsert`(4, 'new_text_0', 'new_text_1');
Query OK, 1 row affected (0.00 sec)

mysql> SELECT
    ->   `record_id`,
    ->   `person_id`,
    ->   `some_text`,
    ->   `some_other_text`
    -> FROM
    ->   `table_test`;
+-----------+-----------+---------------+-----------------+
| record_id | person_id | some_text     | some_other_text |
+-----------+-----------+---------------+-----------------+
|         1 |         1 | update_text_0 | update_text_1   |
|         2 |         2 | BBB           | YYY             |
|         3 |         3 | CCC           | ZZZ             |
|         5 |         4 | new_text_0    | new_text_1      |
+-----------+-----------+---------------+-----------------+
4 rows in set (0.00 sec)

SQL Fiddle demo

wchiquito
  • 16,177
  • 2
  • 34
  • 45
  • I think this should be the correct answer. It is more efficient and it does fit into MySQL approved way to do it. – Reydel Leon Sep 21 '16 at 04:54
  • It's a very good answer, but only applies if there's a unique index present, which isn't always desirable. – mahemoff Apr 24 '18 at 02:47
  • @mahemoff you can also use `CREATE UNIQUE INDEX unique_idx ON table1(column1, column2, column3,...);` to specify multiple columns to be unique for the row, which means all values of `column1, column2, column3,...` must be same to be duplicate in this way. (Thus these columns can have same values to some extent.) And, if the row is duplicate, that triggers the update. – lechat Jan 08 '23 at 10:36
  • If same value as the ones specified by the `values` should be saved on the duplicates, this can also be related: https://stackoverflow.com/questions/14383503/on-duplicate-key-update-same-as-insert – lechat Jan 08 '23 at 10:43
2

How about my approach?

Let's say you have one table with a autoincrement id and three text-columns. You want to insert/update the value of column3 with the values in column1 and column2 being a (non unique) key.

I use this query (without explicitly locking the table):

insert into myTable (id, col1, col2, col3)
select tmp.id, 'col1data', 'col2data', 'col3data' from
(select id from myTable where col1 = 'col1data' and col2 = 'col2data' union select null as id limit 1) tmp
on duplicate key update col3 = values(col3)

Anything wrong with that? For me it works the way I want.

Chrisman
  • 109
  • 1
  • 8
1

A flexible solution should retain the atomicity offered by INSERT ... ON DUPLICATE KEY UPDATE and work regardless of if it's autocommit=true and not depend on a transaction with an isolation level of REPEATABLE READ or greater.

Any solution performing check-then-act across multiple statements would not satisfy this.

Here are the options:

If there tends to be more inserts than updates:

INSERT INTO table (record_id, ..., some_text, some_other_text) VALUES (...);

IF <duplicate entry for primary key error>
  UPDATE table SET some_text = ..., some_other_text = ... WHERE record_id = ...;

  IF affected-rows = 0
    -- retry from INSERT OR ignore this conflict and defer to the other session

If there tends to be more updates than inserts:

UPDATE table SET some_text = ..., some_other_text = ... WHERE record_id = ...;

IF affected-rows = 0
  INSERT INTO table (record_id, ..., some_text, some_other_text) VALUES (...);

  IF <duplicate entry for primary key error>
    -- retry from UPDATE OR ignore this conflict and defer to the other session

If you don't mind a bit of ugliness, you can actually use INSERT ... ON DUPLICATE KEY UPDATE and do this in a single statement:

INSERT INTO table (record_id, ..., some_text, some_other_text) VALUES (...)
    ON DUPLICATE KEY UPDATE
      some_text = if(record_id = VALUES(record_id), VALUES(some_text), some_text),
      some_other_text = if(record_id = VALUES(record_id), VALUES(some_other_text), some_other_text)

IF affected-rows = 0
  -- handle this as a unique check constraint violation

Note: affected-rows in these examples mean affected rows and not found rows. The two can be confused because a single parameter switches which of these values the client is returned.

Also note, if some_text and some_other_text are not actually modified (and the record is not otherwise changed) when you perform the update, those checks on affected-rows = 0 will misfire.

antak
  • 19,481
  • 9
  • 72
  • 80
1

I came across this post because I needed what's written in the title, and I found a pretty handy solution, but no one mentioned it here, so I thought of pasting it here. Note that this solution is very handy if you're initiating your database tables. In this case, when you create your corresponding table, define your primary key etc. as usual, and for the combination of columns you want to be unique, simply add

UNIQUE(column_name1,column_name2,...)

at the end of your CREATE TABLE statement, for any combination of the specified columns you want to be unique. Like this, according to this page here, "MySQL uses the combination of values in both column column_name1 and column_name2 to evaluate the uniqueness", and reports an error if you try to make an insert which already has the combination of values for column_name1 and column_name2 you provide in your insert. Combining this way of creating a database table with the corresponding INSERT ON DUPLICATE KEY syntax appeared to be the most suitable solution for me. Just need to think of it carefully before you actually start using your table; when setting up your database tables.

DevelJoe
  • 856
  • 1
  • 10
  • 24
0

For anyone else, like me, who is a DB noob....the above things didn't work for me. I have a primary key and a unique key... And I wanted to insert if unique key didn't exist. After a LOT of Stack Overflow and Google searching, I found not many results for this... but I did find a site that gave me a working answer: https://thispointer.com/insert-record-if-not-exists-in-mysql/

And for ease of reading here is my answer from that site:

INSERT INTO table (unique_key_column_name)
SELECT * FROM (SELECT 'unique_value' AS unique_key_column_name) AS temp
WHERE NOT EXISTS ( 
     SELECT unique_key_column_name FROM table
     WHERE unique_key_column_name = 'unique_value'
) LIMIT 1;

Please also note the ' marks are wrapped around for me because I use string in this case.

Dharman
  • 30,962
  • 25
  • 85
  • 135
G-Force
  • 345
  • 3
  • 10