9

I am issuing a single SQL query consisting of multiple SELECTs grouped using UNION:

SELECT *
FROM   employee 
       LEFT JOIN department 
          ON employee.DepartmentID = department.DepartmentID
UNION
SELECT *
FROM   employee
       RIGHT JOIN department
          ON employee.DepartmentID = department.DepartmentID;

Assuming I execute this query under READ_COMMITTED transaction isolation, are the two SELECT statements guaranteed to execute atomically? Or do I run the risk of data changing between individual SELECT statements? Does the SQL specification discuss this sort of thing?

CLARIFICATION: When I say "Atomic" I don't mean the "A" in ACID. I mean that I expect both department and employee tables to be read-locked until the query completes.

Gili
  • 86,244
  • 97
  • 390
  • 689

3 Answers3

4

Yes the statement is atomic but yes the data can change between the 2 reads.

Read Committed only guarantees that you don't read dirty data it promises nothing else about consistency of reads for that you would need a higher isolation level.

As you said that you would accept a SQL Server Example...

Connection 1

(Assumes under pessimistic read committed isolation level)

CREATE TABLE employee
(
name VARCHAR(50),
DepartmentID INT
)

CREATE TABLE department
(
DepartmentID INT
)

INSERT INTO department VALUES (1)
INSERT INTO employee VALUES ('bob',1)

declare @employee TABLE
(
name VARCHAR(50),
DepartmentID INT
)


WHILE ((SELECT COUNT(*) FROM @employee) < 2)
BEGIN
DELETE FROM  @employee

INSERT INTO @employee
SELECT employee.*
FROM   employee 
       LEFT JOIN department 
          ON employee.DepartmentID = department.DepartmentID
UNION
SELECT employee.*
FROM   employee
       RIGHT JOIN department
          ON employee.DepartmentID = department.DepartmentID

END;          

SELECT * FROM @employee

Connection 2

while (1=1)
UPDATE employee SET name = CASE WHEN name = 'bob' THEN 'bill' else 'bob' END

Now go back to connection 1

name                                               DepartmentID
-------------------------------------------------- ------------
bill                                               1
bob                                                1

(Remember to switch back to Connection 2 to kill it!)

The specific documentation covering this READ COMMITED behaviour is here

The shared lock type determines when it will be released. Row locks are released before the next row is processed. Page locks are released when the next page is read, and table locks are released when the statement finishes.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • @Martin in that case how can a sproc be considered atomic? – Matthew Apr 08 '11 at 16:53
  • @Martin: if I understand you correctly, each SELECT is guaranteed to be executed atomically, but the overall query is not. How then can you say the statement is atomic if its sub-parts are not guaranteed to be consistent with respect to one another? – Gili Apr 08 '11 at 17:05
  • @Gili - I meant that they will meet the `A` in `ACID`. `Isolation` level [is a different thing](http://en.wikipedia.org/wiki/ACID) – Martin Smith Apr 08 '11 at 17:10
  • @Martin: I clarified the question. When I wrote "atomic" I didn't mean the A in Acid. – Gili Apr 08 '11 at 17:17
  • @Gili - Well I can confirm in SQL Server there is definitely no such guarantee under pessimistic `read committed` isolation level you would need a different isolation level to achieve this. Will probably delete this answer shortly though as I know nothing about either H2 or what the SQL Standard says on the matter! – Martin Smith Apr 08 '11 at 17:20
  • @Martin: I wouldn't delete this answer. I'll mark it as accepted as soon as you provide me with some sort of official documentation to back up your claim for SQL Server or SQL in general. – Gili Apr 08 '11 at 17:42
  • The comments in http://stackoverflow.com/a/16392399/14731 say that UNION is atomic after all. So who is right? – Gili Sep 15 '13 at 19:48
  • @Gili - The first line in this answer says it is atomic as well. But all atomic means is that the transaction either all succeeds or all fails. You won't end up with a credit being committed without the corresponding debit for example. It doesn't say anything at all about whether you see the effects of concurrent transactions. That is controlled by isolation. – Martin Smith Sep 15 '13 at 20:00
  • @MartinSmith, so my question is flawed because I actually meant to ask about Isolation not Atomic. Stupid mistake on my behalf... Thanks for the clarification. – Gili Sep 15 '13 at 20:15
1

Using UNION will remove any duplicate records that may be returned from either of the unioned queries, so not exactly atomic. Use UNION ALL if you want all records from all unioned queries. UNION ALL can be much faster that UNION also.

Parris Varney
  • 11,320
  • 12
  • 47
  • 76
  • `UNION ALL` is significantly faster, but in this case in one of the joins you'd need a condition that filters out the `INNER JOIN` records, or else they will be duplicated. – btilly Apr 08 '11 at 16:44
  • The question is mostly concerned with correctness, not performance. I am trying to do a FULL OUTER JOIN but my database (H2) does not support it. – Gili Apr 08 '11 at 16:49
1

EDIT: Note that my answer is incorrect but I do not want to delete it because I think it links to good questions and has good comments.

Every individual transaction is atomic.

A UNION using multiple subqueries is a single T-SQL command, a single transaction, and will be atomic.

This is, in part, a reason to avoid inefficient queries (or sprocs, for that matter) as their atomic nature can delay other transactions.

EDIT: Please see this question for more interesting information on atomicity of subqueries

Is update with nested select atomic operation?

EDIT: Apparently I am wrong.

This is a good discussion on the topic: Atomic UPSERT in SQL Server 2005 where Remus poses a good example. Sorry for doubting you, Martin....

Community
  • 1
  • 1
Matthew
  • 10,244
  • 5
  • 49
  • 104
  • @Matthew: How can you say that "every individual transaction is atomic" when READ_COMMITTED allows non-repeatable reads? – Gili Apr 08 '11 at 16:48
  • @Gili the read lock applies to the entire transaction... I don't know whether you can explicitly state otherwise for the subqueries (though that seems dangerous to me) – Matthew Apr 08 '11 at 16:52
  • 2
    @MatthewPK - Don't know anything about `H2` but for SQL Server under read committed isolation level shared locks are released as soon as the page is read. It is perfectly possible for self joins etc to have 2 different versions of the same row. – Martin Smith Apr 08 '11 at 16:53
  • @Martin, that may be true but the next read in queue would then be the next subquery in the transaction. So unless some nested query caused the row to be altered in scope I should not expect it to change. – Matthew Apr 08 '11 at 16:56
  • Can you please link to an official document/specification that backs up what you're saying? – Gili Apr 08 '11 at 17:02
  • @Matthew - Under conditions of low concurrency it may well happen to be "the next read in queue" before any updates/inserts happen but there is absolutely no guarantee of this. Under such low concurrency conditions this whole question becomes pointless anyway. – Martin Smith Apr 08 '11 at 17:05
  • I'll continue looking for documentation to support this, though here's a related SO answer: http://stackoverflow.com/questions/4097126/t-sql-is-a-sub-query-for-an-update-restriction-atomic-with-the-update and I agree that I've always known transactions to be atomic... if I am wrong, Martin, you will have unwound one of my very fundamental beliefs about SQL Server... :-) – Matthew Apr 08 '11 at 17:23
  • @Matthew - Ah just found the answer I was searching for only to realise that it isn't relevant to this question at all as it is talking about `nolock` [but still interesting none-the-less](http://stackoverflow.com/questions/5289683/sql-server-2008-r2-dirty-reads-how-non-atomic) – Martin Smith Apr 08 '11 at 17:44