
I have a query with the following structure:

SELECT ..... WHERE status = 'QUEUED' ORDER BY position ASC LIMIT 1 FOR UPDATE;

It's a single-table SELECT statement on an InnoDB table. The position field (INT NOT NULL) has an index on it; status is an ENUM and is also indexed.

The SELECT ... FOR UPDATE manual page says that it locks all rows it reads. Do I understand correctly that in this case only one row will be locked? Or will it rather lock the whole table?

Is it possible to determine which rows will be locked using EXPLAIN? If yes, how? EXPLAIN for the query on an empty table shows the following:

id  select_type  table  type   possible_keys  key             key_len   ref     rows  Extra
1   SIMPLE       job    index  <null>         index_position  [34,...]  <null>  1     Using where
Vladislav Rastrusny

4 Answers


This is a great question. InnoDB is a row-level locking engine, but it has to set additional locks to ensure safety with the binary log (used for replication and point-in-time recovery). To start explaining it, consider the following (naive) example:

session1> START TRANSACTION;
session1> DELETE FROM users WHERE is_deleted = 1; # 1 row matches (user_id 10), deleted.
session2> START TRANSACTION;
session2> UPDATE users SET is_deleted = 1 WHERE user_id = 5; # 1 row matches.
session2> COMMIT;
session1> COMMIT;

Because statements are only written to the binary log once committed, on the slave session2's statement would apply first and would produce a different result, leading to data corruption.

So what InnoDB does is set additional locks. If is_deleted is indexed, then before session1 commits nobody else will be able to modify or insert into the range of records where is_deleted=1. If there is no index on is_deleted, then InnoDB needs to lock every row in the entire table to make sure the replay happens in the same order. You can think of this as locking the gap, which is a different concept to grasp than plain row-level locking.

In your case, with ORDER BY position ASC, InnoDB needs to make sure that no rows can be inserted or modified in the gap between the lowest matching key value and a "special" lowest possible value. If you did something like ORDER BY position DESC, then the equivalent range at the top of the index would be locked instead.
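
For illustration, applied to the question's job table (a hypothetical sketch; the column list and the inserted values are made up):

session1> SELECT id FROM job WHERE status = 'QUEUED' ORDER BY position ASC LIMIT 1 FOR UPDATE;
# locks the matching index record plus the gap below its position value
session2> INSERT INTO job (status, position) VALUES ('QUEUED', 0); # blocks if 0 falls into the locked gap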

So here comes the solution:

  • Statement-based binary logging sucks. I really look forward to a future where we all switch to row-based binary logging (available from MySQL 5.1, but not on by default).

  • With row-based replication, if you change the isolation level to READ COMMITTED, then only the one row that matches needs to be locked (see the sketch after this list).

  • If you want to be a masochist, you can also turn on innodb_locks_unsafe_for_binlog with statement-based replication.
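
A minimal sketch of the row-based + READ COMMITTED option above, assuming MySQL 5.1+ (table and column names are from the question; exact config placement and privileges may differ in your setup):

# server-wide, in my.cnf:
#   binlog_format          = ROW
#   transaction-isolation  = READ-COMMITTED

# or per session (SET SESSION binlog_format requires the SUPER privilege):
SET SESSION binlog_format = 'ROW';
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

START TRANSACTION;
SELECT * FROM job
 WHERE status = 'QUEUED'
 ORDER BY position ASC
 LIMIT 1
 FOR UPDATE;   # with ROW + READ COMMITTED only the matching row is locked, no gap locks
# ... process the job ...
COMMIT;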


Update 22 April: Here is my improved version of your test case to copy and paste (yours was not searching 'in the gap'):

session1> CREATE TABLE test (id int not null primary key auto_increment, data1 int, data2 int, INDEX(data1)) engine=innodb;
Query OK, 0 rows affected (0.00 sec)

session1> INSERT INTO test VALUES (NULL, 1, 2), (NULL, 2, 1), (5, 2, 2), (6, 3, 3), (3, 3, 4), (4, 4, 3);
Query OK, 6 rows affected (0.00 sec)
Records: 6  Duplicates: 0  Warnings: 0

session1> start transaction;
Query OK, 0 rows affected (0.00 sec)

session1> SELECT id FROM test ORDER BY data1 LIMIT 1 FOR UPDATE;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

session2> INSERT INTO test values (NULL, 0, 99); # blocks - 0 is in the gap between the lowest value found (1) and the "special" lowest value.

# At the same time, from information_schema:

localhost information_schema> select * from innodb_locks\G
*************************** 1. row ***************************
    lock_id: 151A1C:1735:4:2
lock_trx_id: 151A1C
  lock_mode: X,GAP
  lock_type: RECORD
 lock_table: `so5694658`.`test`
 lock_index: `data1`
 lock_space: 1735
  lock_page: 4
   lock_rec: 2
  lock_data: 1, 1
*************************** 2. row ***************************
    lock_id: 151A1A:1735:4:2
lock_trx_id: 151A1A
  lock_mode: X
  lock_type: RECORD
 lock_table: `so5694658`.`test`
 lock_index: `data1`
 lock_space: 1735
  lock_page: 4
   lock_rec: 2
  lock_data: 1, 1
2 rows in set (0.00 sec)

# Another example:
select * from test where id < 1 for update; # blocks
Morgan Tocker
  • What if I don't use any replication at all? And did I get you right that if ORDER BY is used, the whole index range gets locked and therefore the whole table is locked as well? – Vladislav Rastrusny Apr 18 '11 at 15:09
  • If you don't use replication, or the binary log for point-in-time recovery, then set innodb_locks_unsafe_for_binlog (point-in-time recovery is so good though - I don't know why you wouldn't). It depends on the ORDER BY as to how much of the table is locked (from an internal perspective it's never a table lock, but individual locks could be set on each row if you don't have an index). – Morgan Tocker Apr 18 '11 at 15:13
  • I should note that you need the new InnoDB lock information_schema tables (http://dev.mysql.com/doc/refman/5.5/en/innodb-locks-table.html) to debug what locking is happening. – Morgan Tocker Apr 18 '11 at 15:14
  • @Morgan Tocker, well, looks like you were wrong. Please see my answer. – Vladislav Rastrusny Apr 22 '11 at 17:49
  • Or maybe it's version specific? – Vladislav Rastrusny Apr 22 '11 at 19:55
  • Why is it wrong? Am I missing something? The question is about how many rows are locked by the ORDER BY clause. I don't see why we need to allow INSERT in our use case. MySQL can lock the gap, but still it's only one row + the gap before/after it. – Vladislav Rastrusny Apr 23 '11 at 09:25
  • You asked how many rows, and I said it's not a question of row locks, but gap locks. Yes - it is one row + the gap. It doesn't have to be an insert - I show a SELECT just below the insert. Your second SELECT doesn't search 'in the gap'. – Morgan Tocker Apr 23 '11 at 13:01

I've made tests. Created the following table:

id  data1  data2
1   1      2
2   2      1
5   2      2
6   3      3
3   3      4
4   4      3

Then I created first connection with transaction:

SELECT id FROM test ORDER BY data1 LIMIT 1 FOR UPDATE;

The result was the row with id=1.

Then I started a second transaction from another connection, without committing the first one:

SELECT id FROM test WHERE data1=2 FOR UPDATE;

It didn't block; it blocked only when I tried to select the very row selected by the first transaction. I also tried changing ORDER BY to DESC, and it works the same way.
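
The same test as a copy-paste session sketch (the table is the one from @Morgan's answer above; the exact query used to hit the locked row is an assumption):

session1> START TRANSACTION;
session1> SELECT id FROM test ORDER BY data1 LIMIT 1 FOR UPDATE;   # returns id=1
session2> START TRANSACTION;
session2> SELECT id FROM test WHERE data1 = 2 FOR UPDATE;          # did not block
session2> SELECT id FROM test WHERE id = 1 FOR UPDATE;             # blocks: this is the row session1 locked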

Conclusion: MySQL blocks only the rows it actually selected when using ORDER BY and LIMIT clauses. See @Morgan's answer for the gap-locking explanation.

My MySQL version is 5.0.45

Vladislav Rastrusny
  • Did you nest your SELECT statements in a transaction? I repeated your testcase, and I can reproduce the result I was explaining. – Morgan Tocker Apr 23 '11 at 00:59

There is a bug in some versions of MySQL: #67745 "Too much row locks when using SELECT for UPDATE, LIMIT and ORDER BY".

Version: 5.5.28, 5.5.30, 5.7.1

The same bug occurs on my local MySQL 5.5.25 win64.


Unlike other databases, in MySQL the query will lock the index positions. This effectively means that all rows that currently have status equal to 'QUEUED', or that another transaction would like to change to 'QUEUED', are locked. The only solution I've found to this is selecting the rows without FOR UPDATE, then selecting them again with an ID-based filter and re-checking the condition once they are locked. Not nice, but it does the job.
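
A minimal sketch of that two-step approach, using the table and column names from the question (the 'RUNNING' status and the user variables are placeholders):

# step 1: find a candidate without locking (no FOR UPDATE, so no gap locks are taken)
SELECT id INTO @candidate_id
  FROM job
 WHERE status = 'QUEUED'
 ORDER BY position ASC
 LIMIT 1;

# step 2: lock just that row by primary key and re-check the condition
START TRANSACTION;
SELECT status INTO @current_status
  FROM job
 WHERE id = @candidate_id
 FOR UPDATE;
UPDATE job
   SET status = 'RUNNING'
 WHERE id = @candidate_id
   AND status = 'QUEUED';   # another session may have taken the job in the meantime
COMMIT;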

Lukáš Lalinský
  • See my comment. A workaround is possible (row-based replication). – Morgan Tocker Apr 23 '11 at 13:07
  • Don't bother with the `SELECT`. Instead, just do `UPDATE table SET filter_column=my_process_id WHERE status='QUEUED' LIMIT X`, then `SELECT` everything that got `filter_column` set to `my_process_id` in a transaction. That way, you get the minimum possible lock-time on any part of the table/index/whatever. – Christopher Schultz Apr 28 '17 at 20:36
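
A sketch of that claim-by-UPDATE pattern (claimed_by stands in for filter_column and is a hypothetical column; CONNECTION_ID() is one possible process id; the ORDER BY and the claimed_by IS NULL guard are additions to match the question's queue):

SET @my_process_id = CONNECTION_ID();

# claim a batch of queued rows in a single statement
UPDATE job
   SET claimed_by = @my_process_id
 WHERE status = 'QUEUED'
   AND claimed_by IS NULL
 ORDER BY position ASC
 LIMIT 10;

# then work only on the rows this process claimed
SELECT * FROM job WHERE claimed_by = @my_process_id;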