13

Does the SQL standard specify the locking order for a multi-table query?

For example, given:

SELECT department.id FROM permissions, terminals, departments WHERE 
       department.id = ? AND 
       terminal.id = ? AND 
       permissions.parent = department.id AND 
       permissions.child = terminals.id;
  1. Does the SQL standard guarantee a locking order or is it determined by the (implementation-specific) execution plan?
  2. Is there a way to guarantee a locking order?
  3. If there is no way to guarantee locking order, how are we supposed to prevent deadlocks?

UPDATE: Please do not vote to close this issue without explaining your reasoning. As far as I'm concerned, this is a programming question, which makes it very much on-topic for Stackoverflow. If you believe the question needs to be further refined, please explain and I will be more than happy to answer you.

Masoud
  • 8,020
  • 12
  • 62
  • 123
Gili
  • 86,244
  • 97
  • 390
  • 689
  • 1
    SELECT queries do not generate locks that result in deadlocks. Can you rephrase your question so it is relevant to a real situation? – Gordon Linoff May 14 '13 at 20:30
  • 1
    @GordonLinoff, SELECT queries in READ_COMMITTED do generate locks (but for the duration of the statement). For other isolation levels (such as REPEATABLE_READ or SERIALIZABLE) they retain a lock until the end of the transaction. Caveat: some databases use MVCC which does not use any locks, but these are out of scope for this question. – Gili May 14 '13 at 20:34
  • 1
    Locks are an implementation detail. The isolation levels just specify phenomena that can/not occur. In SQL Server select queries at read committed mostly take `S` locks that are released as soon as the data is read (before the end the statement). Sometimes these locks can be kept until the statement ends however [example](http://blogs.msdn.com/b/craigfr/archive/2007/05/31/read-committed-and-large-objects.aspx) and other times it doesn't take row level `S` locks at all. And given point 1 points 2 and 3 are unanswerable unless you specify a particular RDBMS. – Martin Smith May 14 '13 at 20:43
  • Queries in READ_COMMITTED isolation level use a shared read lock. I expect that databases can acquire all the required locks in a single atomic operation. Commit may or may not prevent reads - I would expect most enterprise databases to be smart enough to avoid this – Sami Korhonen May 14 '13 at 20:53
  • @MartinSmith, please post an answer so we can elaborate on your points in more details. – Gili May 14 '13 at 21:03
  • So your question is about rdbms not using mvcc. But shouldn't the mere fact that some db use mvcc and other don't be a hint that it's not sql standard defined, and thus purely something to be aware and take care of with the relevant engines? – didierc May 14 '13 at 22:06
  • Also, rdbms implementors must be very aware of the locking issue. Certainly engines are implemented to avoid the problem in trivial situations (like simple statements), whereas in transactions, the problem is a bit more complicated. – didierc May 14 '13 at 22:12
  • @didierc, not necessarily. The standard could mandate that *if* locks are used, they must be established in a certain order. MVCC would avoid establishing locks altogether. – Gili May 14 '13 at 22:42
  • Yeah, I thought about that afterward. Regarding the title of your question, didn't you mean _transaction_ rather than _query_? – didierc May 14 '13 at 22:55
  • @didierc, no. We're talking about multiple transactions here. I'm talking about multiple threads querying the same tables, in a different order, at the same time, thereby causing a deadlock. – Gili May 14 '13 at 23:35
  • If you consider 2 queries both accessing table A and B, but such that both are composed of a main query on one table, and a subquery on the other table, and that the first uses A as its main table, and the second uses B. These two queries cannot be run together because of lock, and may not necessarily be rewritten to access tables in the same order. How can we ensure that both queries lock the tables in the same order? SQL does not provide the means to do that. It's up to the DB to take care of that. – didierc May 14 '13 at 23:43

2 Answers2

5

According to https://stackoverflow.com/a/112256/14731 lock order is determined by the implementation-specific execution order. The answer further goes on to say that there isn't a deterministic way to prevent deadlocks. Whereas in imperative programming we can prevent deadlocks by acquiring locks in the same order, it seems that in declarative systems we have to work around them by retrying the operation when a deadlock is detected.

Furthermore, I argue that since database execution plans change over their lifetime it is technically impossible to prevent deadlocks.

Community
  • 1
  • 1
Gili
  • 86,244
  • 97
  • 390
  • 689
  • Note that the phrase "when you access the tables in the same order" in the answer that you mentioned refers to the order of update statements. The order of tables in a join query, won't have much effect on the probability of a deadlock, because the query only creates S locks. Two transactions running the same query won't lock one another. – nakosspy May 14 '13 at 23:58
  • @nakosspy, two transaction querying the same set of tables but in a different order could cause a deadlock (at least, for a naive DB implementation). In other words, I agree with you that running the same query won't cause a deadlock, but the point is that even `SELECT` without an `UPDATE` can cause a deadlock. – Gili May 15 '13 at 00:19
  • 2
    What do you mean a naive database implementation? A DB that would acquire exclusive locks for a query? I don't know any database that does this. However in the case that you have to deal with such a "naive" database, then the major problem would be data integrity and performance, not deadlocks. – nakosspy May 15 '13 at 00:28
  • @nakosspy, my mistake. You are right. I didn't realize that "S locks" meant read locks. Yes, if you only ever `SELECT` then locking order does not matter (but realistically speaking, everyone needs to `INSERT` and `UPDATE`). The minute you introduce even a single statement that uses write-locks it has the potential of triggering a deadlock (meaning, a `SELECT` might be waiting on locks held by the `UPDATE` and vice-versa). – Gili May 15 '13 at 15:53
1

I can give you an answer for DB2, but I think that this should be similar for other databases as well. First of all, everything depends on the locksize parameter of your tables. This parameter defines what is being locked. You can have locksize = table, page or row. So, depending on locksize of each table, the database will lock the object (table, page or row) that is used to fetch data for the cursor. So the order of locks being created will be specified by the access path, which depends on the optimizer.

nakosspy
  • 3,904
  • 1
  • 26
  • 31
  • So you're saying it depends on the implementation-specific execution plan? If that's the case, what about question 3? – Gili May 14 '13 at 21:01
  • First of all these locks are S locks. This means that no other transaction is allowed to update the pages (or rows) used by the cursor. If you open a cursor and your isolation level is read stability, then nobody else is allowed to modify the data that your cursor reads. Your transaction is allowed to modify these data. If you want to minimize deadlocks then prefer row lock over page or table and set the isolation level to cursor stability. – nakosspy May 14 '13 at 23:51