15

I've found a very confusing deadlock situation that I need help to understand.

There are two transactions going on:
(2) holds a lock for the query delete from myTable where id = NAME_CONST('p_id',10000). This is a lock by PRIMARY KEY although not the full key but a range. It looks like this is a full write lock to me when it says lock_mode X locks rec but not gap.
(1) is waiting for this same lock, also for the query delete from myTable where id = NAME_CONST('p_id',10000).
(2) is also trying go get this lock and MySQL detects a deadlock.

What I can't understand is why (2) has to acquire the lock again as it already holds it and it's a write lock (lock_mode X) in all cases.

It also looks like it's for the exact same query.

Here is the table definition

create myTable (
  id int unsigned not null,
  value1 char(8) not null,
  value2 int unsigned,
  primary key (id, value1)
);

and here is the information from SHOW ENGINE INNODB STATUS\G

------------------------
LATEST DETECTED DEADLOCK
------------------------
130313 14:46:28
*** (1) TRANSACTION:
TRANSACTION 75ACB8A3, ACTIVE 0 sec, process no 6110, OS thread id 139973945382656 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 5154970, query id 5201313618 192.168.0.2 user updating
delete from myTable where id = NAME_CONST('p_id',10000)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 22371 page no 1598 n bits 104 index `PRIMARY` of table `db`.`myTable` trx id 75ACB8A3 lock_mode X waiting
Record lock, heap no 32 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 0005af3a; asc :;;
1: len 8; hex 2020202020202020; asc ;;
2: len 6; hex 000075acb890; asc u ;;
3: len 7; hex ea0000020d011e; asc ;;
4: len 4; hex 00000065; asc e;;

*** (2) TRANSACTION:
TRANSACTION 75ACB890, ACTIVE 0 sec, process no 6110, OS thread id 139973957895936 starting index read
mysql tables in use 1, locked 1
7 lock struct(s), hea
p size 1248, 6 row lock(s), undo log entries 4
MySQL thread id 5155967, query id 5201313625 192.168.0.1 user updating
delete from myTable where id = NAME_CONST('p_id',10000)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 22371 page no 1598 n bits 104 index `PRIMARY` of table `db`.`myTable` trx id 75ACB890 lock_mode X locks rec but not gap
Record lock, heap no 32 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 0005af3a; asc :;;
1: len 8; hex 2020202020202020; asc ;;
2: len 6; hex 000075acb890; asc u ;;
3: len 7; hex ea0000020d011e; asc ;;
4: len 4; hex 00000065; asc e;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 22371 page no 1598 n bits 104 index `PRIMARY` of table `db`.`myTable` trx id 75ACB890 lock_mode X waiting
Record lock, heap no 32 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
0: len 4; hex 0005af3a; asc :;;
1: len 8; hex 2020202020202020; asc ;;
2: len 6; hex 000075acb890; asc u ;;
3: len 7; hex ea0000020d011e; asc ;;
4: len 4; hex 00000065; asc e;;

*** WE ROLL BACK TRANSACTION (1)
Morteza Adi
  • 2,413
  • 2
  • 22
  • 37
Andreas Wederbrand
  • 38,065
  • 11
  • 68
  • 78
  • Did you tried to reproduce error? If "yes" could you show us scenario? – ravnur Mar 14 '13 at 15:43
  • I've tried to do this delete as fast as I can with `start transaction; delete...; rollback;` from two simulations threads as fast as bash can feed it to mysql but not once did I get a deadlock. I'm quite clueless as to how this can happen. – Andreas Wederbrand Mar 14 '13 at 20:26
  • See please this question (it describes how you can catch deadlock): http://stackoverflow.com/questions/2143873/how-to-explain-the-deadlock-better. Also there is quite nice article about detecting and recovery from deadlock: http://dwachira.hubpages.com/hub/Process-Deadlock-Definition-Prevention-Detection-Recovery-and-Avoidance. In general, deadlock is the issue of architecture and design. You should review your processes that updates data. – ravnur Mar 15 '13 at 06:52
  • Perhaps I wasn't clear enough. I already know a lot about deadlocks. What they, how they appear and how to prevent them. My problem this time is that I can't understand why transaction (2) needs to take the lock it already holds. It's perhaps more a question about innoDB than about deadlocks in general. – Andreas Wederbrand Mar 15 '13 at 09:13
  • What is the result of `show variables like '%autocommit%'`on production database? – ravnur Mar 15 '13 at 09:28
  • And you've say ` I can't understand why transaction (2) needs to take the lock it already holds`. Transaction (1) and (2) are different. Take a look on `transaction id` and `mysql thread id`. They are different, not the same! – ravnur Mar 15 '13 at 09:32
  • Of course they are different. But why does (2) need to take the lock it already holds? I understand why (1) needs it and why it have to wait for (2) to release it. – Andreas Wederbrand Mar 15 '13 at 12:00
  • I'm very sorry but I completely can't understand why you are thinking that transaction (2) tries to acquire lock which it already holds. – ravnur Mar 15 '13 at 12:32
  • I might be missing something but it looks like it says `*** (2) WAITING FOR THIS LOCK TO BE GRANTED:` and that the lock it's waiting for is exactly the same as the one it's holding and the one (1) is also waiting for. – Andreas Wederbrand Mar 15 '13 at 12:40

1 Answers1

29

It's not the same lock - the lock transaction 1 has is on the (index) record only and not the gap lock.

Here's what's happening:

  1. Transaction 2 gets a lock for the (index) record but not the gap before the record ('rec but not gap'), i.e. it has a record lock only.
  2. Transaction 1 tries to get a lock on the record and the gap before (i.e. a next-key lock), but can't because transaction 2 has a record lock (and so transaction 1 waits).
  3. Transaction 2 tries to get a lock on the record and the gap before (i.e. a next-key lock) and can't because Transaction 1 is already waiting for the same lock and is ahead of it in the queue.
  4. Deadlock.

I'm not entirely sure why Transaction 2 doesn't acquire a next-key lock immediately - perhaps the process of obtaining the record lock and then the gap lock isn't atomic (in the general sense of the word).

I think the issue is that you have a composite primary key (id, value1) but you are deleting from a range (specifying id only) - this requires gap locks. See http://dev.mysql.com/doc/refman/5.0/en/innodb-record-level-locks.html, in particular:

Gap locking is not needed for statements that lock rows using a unique index to search for a unique row. (This does not include the case that the search condition includes only some columns of a multiple-column unique index; in that case, gap locking does occur.)

Can you change your code so you specify the full primary key when deleting, i.e. id and value1?

Other options:

  • Retry the delete when there's a deadlock, e.g. catch the error in your code and then retry if it was caused by a deadlock. This approach is often easier said than done, especially in legacy applications, but it is recommended by the MySQL page on how to cope with deadlocks:

Always be prepared to re-issue a transaction if it fails due to deadlock. Deadlocks are not dangerous. Just try again.

  • Lock the whole table with a table-level lock before issuing the delete statement. This may affect performance though and is a 'sledgehammer' approach.
Martin Wilson
  • 3,386
  • 1
  • 24
  • 29
  • It might be possible if I add an inner select of all possible keys inside an IN-statement or a join. It feels weird that the single delete-statement should a) not be atomic and b) not fast enough that any other delete actually gets locked by it. – Andreas Wederbrand Mar 19 '13 at 13:26
  • How often do you get these deadlocks? Is anything else happening in the transaction before the delete? How many concurrent threads are running the deletes? Could you post the code? – Martin Wilson Mar 19 '13 at 18:14
  • It happens a couple of times each day, always from two different machines. Nothing happens before the delete that relates to that table. Other things happen to other tables but that shouldn't matter. I'm not able to post the unscrambled code. – Andreas Wederbrand Mar 19 '13 at 19:22
  • No pointers on why the delete statement doesn't take the complete lock immediately? That part still confuses me. – Andreas Wederbrand Mar 23 '13 at 20:47
  • They are two separate locks (lock on the row and then a gap lock) so I guess the second translation can nip in and request both in between the first transaction gaining the row lock and being granted the gap lock. This is a guess, although this page http://dev.mysql.com/doc/refman/5.0/en/innodb-deadlocks.html implies this may be possible: "You can get deadlocks even in the case of transactions that just insert or delete a single row. That is because these operations are not really “atomic”; they automatically set locks on the (possibly several) index records of the row inserted or deleted." – Martin Wilson Mar 23 '13 at 21:05
  • 2
    I'll reward the bounty for pointing out where the manual says that it can happen but unfortunately I still don't understand why this happens and it goes against everything I've ever learned about transactions and deadlocks. – Andreas Wederbrand Mar 25 '13 at 20:33