SQL unique field: concurrency bugs?

Question

I have a DB table with a field that must be unique. Let's say the table is called "Table1" and the unique field is called "Field1".

I plan on implementing this by performing a SELECT to see if any Table1 records exist where Field1 = @valueForField1, and only updating or inserting if no such records exist.

The problem is, how do I know there isn't a race condition here? If two users both click Save on the form that writes to Table1 (at almost the exact same time), and they have identical values for Field1, isn't it possible that the following would happen?

User1 makes a SQL call, which performs the select operation and determines there are no existing records where Field1 = @valueForField1. User1's process is preempted by User2's process, which also finds no records where Field1 = @valueForField1, and performs an insert. User1's process is allowed to run again, and inserts a second record where Field1 = @valueForField1, violating the requirement that Field1 be unique.

How can I prevent this? I'm told that transactions are atomic, but then why do we need table locks too? I've never used a lock before and I don't know whether or not I need one in this case. What happens if a process tries to write to a locked table? Will it block and try again?

I'm using MS SQL 2008R2.

score 5 · Answer 1 · answered Aug 12 '13 at 20:52

Add a unique constraint on the field. That way you won't have to SELECT. You will only have to insert. The first user will succeed the second will fail.

On top of that you may make the field autoincremented, so you won't have to care on filling it, or you may add a default value, again not caring on filling it.

Some options would be an autoincremented INT field, or a unique identifier.

score 0 · Answer 2 · answered Aug 12 '13 at 20:54

0

You can add a add a unique constraint. Example from http://www.w3schools.com/sql/sql_unique.asp:

CREATE TABLE Persons
(
P_Id int NOT NULL UNIQUE
)

answered Aug 12 '13 at 20:54

quantka

920
1
10
15

John Tseng · Answer 3 · 2013-08-13T15:32:58.183

EDIT: Please also read Martin Smith's comment below.

jyparask has a good answer on how you can tackle this specific problem. However, I would like to elaborate on your confusion over locks, transactions, blocking, and retries. For the sake of simplicity, I'm going to assume transaction isolation level serializable.

Transactions are atomic. The database guarantees that if you have two transactions, then all operations in one transaction occur completely before the next one starts, no matter what kind of race conditions there are. Even if two users access the same row at the same time (multiple cores), there is no chance of a race condition, because the database will ensure that one of them will fail.

How does the database do this? With locks. When you select a row, SQL Server will lock the row, so that all other clients will block when requesting that row. Block means that their query is paused until that row is unlocked.

The database actually has a couple of things it can lock. It can lock the row, or the table, or somewhere in between. The database decides what it thinks is best, and it's usually pretty good at it.

There is never any retrying. The database will never retry a query for you. You need to explicitly tell it to retry a query. The reason is because the correct behavior is hard to define. Should a query retry with the exact same parameters? Or should something be modified? Is it still safe to retry the query? It's much safer for the database to simply throw an exception and let you handle it.

Let's address your example. Assuming you use transactions correctly and do the right query (Martin Smith linked to a few good solutions), then the database will create the right locks so that the race condition disappears. One user will succeed, and the other will fail. In this case, there is no blocking, and no retrying.

In the general case with transactions, however, there will be blocking, and you get to implement the retrying.

`SERIALIZABLE` is the only isolation level that would prevent the race condition but it is not really as you describe. In the case that there is no matching row then there is no row to lock. The `SELECT` from both transactions can both run concurrently. As serializable prevents phantoms it needs to retain the `S` lock on at least the range where the row would be. Both can then proceed to the `INSERT` where they will deadlock. At all other isolation levels `BEGIN TRAN SELECT * FROM T WHERE C=2 INSERT INTO T VALUES(2) COMMIT` can cause duplicates. — Martin Smith, Aug 13 '13 at 06:04

SQL unique field: concurrency bugs?

3 Answers3