0

I have a medium-traffic website with a MySQL database, and I'm seeing occasional Duplicate entry errors when 2+ concurrent requests try to update the same row.

We use Perl/DBI to access the database.

Perl'ish pseudo-code:

$dbh->begin_work;

my $row = $dbh->selectrow_hashref( "SELECT * FROM mytable WHERE id=$some_id" );

if ( defined($row) ) {

   # ... do stuff; uses $row ...

   $dbh->do( "UPDATE mytable SET ... WHERE id=$some_id" );

}
else {

  # ... do other stuff, different from above ...

  $dbh->do( "INSERT INTO mytable SET id=$some_id, ... " );

  sleep 30; # added for emphasis
}

$dbh->commit;

The id column is unique, obviously.

To repeat/rephrase the issue, suppose request #1 comes along. Row is inserted. While SLEEPing, request #2 comes along; $row is undef because we still haven't committed request #1, so we try to INSERT again, and get the Duplicate entry error.

I understand why this is happening -- because we're not locking. This is the background. The question is how to implement a lock given this background.

Unfortunately, INSERT INTO ... ON DUPLICATE KEY UPDATE... doesn't work since we're doing slightly different things depending on existence of $row.

I looked into SELECT ... FOR UPDATE and SELECT ... LOCK IN SHARE MODE as described here:

MySQL InnoDB: Difference Between `FOR UPDATE` and `LOCK IN SHARE MODE`

but because we're inserting a new row during request #1, there is no row to lock prior to the insert, that would lock out request #2.

After reading the above link and other resources on the web, I don't really know what to try next that would work reliably, without deadlocks and other scary things like that.

Ideas? Help? Thanks!

yahermann
  • 1,539
  • 1
  • 12
  • 33
  • You don’t lock the database so what do you expect? – Boris the Spider Mar 11 '20 at 19:48
  • That’s the reason I’m asking the question :)... how to best go about locking that row. – yahermann Mar 11 '20 at 19:53
  • "...so we try to INSERT again, and get the Duplicate entry error." -- That's right, that's what it's supposed to do. What were you expecting? – The Impaler Mar 11 '20 at 21:46
  • Shouldn't the "id" be produced by the `INSERT` operation itself? Why are you producing an id outside the database? Seems artificial. – The Impaler Mar 11 '20 at 21:47
  • `id` is not produced by `INSERT`. it is a uuid created by the application. – yahermann Mar 11 '20 at 21:50
  • I've edited the question to make it clear that I do understand why this is happening, and that I'm seeking a solution. – yahermann Mar 11 '20 at 21:52
  • 2
    it seems that `select for update` solves this problem as it claims to prevent inserting as well: https://dev.mysql.com/doc/refman/8.0/en/innodb-locking.html#innodb-record-locks. Does `select for update` block `select for update` with the same id in your test? – Khanh TO Mar 14 '20 at 04:14
  • `SELECT ... FOR UPDATE` should work in this case and block the second connection. Did you try it, or just assumed, that it wouldn't work? – Paul Spiegel Mar 14 '20 at 11:56
  • If whatever you are doing is idempotent, is it possible to have some UNIQUE Key index on the other key column and then keep an INSERT IGNORE clause? So this way you will not insert extra. – techno Mar 14 '20 at 21:45
  • Or how about this, create a stored procedure which will execute the insert or update in SERIALIZABLE transaction isolation? That way the code would look like if { ... } else { ... } $dbh -> do("Call UPSERT_MYTABLE(@id := $some_id, @x ... )") – techno Mar 14 '20 at 22:06
  • @KhanhTO you are absolutely correct. It works, I just tested it. I guess I didn't give MySQL credit for being that smart, so I just didn't believe it would work. Write it up as a proper answer and collect your bounty! :) – yahermann Mar 15 '20 at 00:31

1 Answers1

1

As specified here: https://dev.mysql.com/doc/refman/8.0/en/innodb-locking.html#innodb-record-locks

select for update should solve this problem as it claims to prevent inserting as well, the first request with select for update should block the second request select for update with the same id.

Another requirement is that the isolation level has to be at least read committed

Khanh TO
  • 48,509
  • 13
  • 99
  • 115
  • Thanks again. I ran an actual test, using code similar to the question, and can confirm `SELECT ... FOR UPDATE` does lock read after an uncommitted `INSERT` having same id. – yahermann Mar 15 '20 at 03:31
  • @yahermann: What i mean with `select for update` blocks `select for update` with the same id: Request #1 `select for update`, Request#1 sleeps, Request#2 `select for update`, Request#2 blocks. It does not need to execute insert, can you confirm? – Khanh TO Mar 15 '20 at 03:50
  • @yahermann: another requirement is that the isolation level has to be at least read committed – Khanh TO Mar 15 '20 at 04:00
  • 1
    That is exactly what happens. The `select for update` in the 2nd process waits/blocks for those 30 seconds, until the transaction in 1st process commits, then the 2nd process reads the row inserted by the 1st process. – yahermann Mar 15 '20 at 04:10
  • @yahermann: Thanks, I'm now confident with the answer. It would be very appreciated if you mark the answer as accepted. – Khanh TO Mar 15 '20 at 04:13