Primary key conflict even when TABLOCKX and HOLDLOCK hints

Question

I have a table which is used to create locks with unique key to control execution of a critical section over multiple servers, i.e. only one thread at a time from all the web servers can enter that critical section.

The lock mechanism starts by trying to add a record to the database, and if successful it enters the region, otherwise it waits. When it exits the critical section, it removes that key from the table. I have the following procedure for this:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRANSACTION
  DECLARE @startTime DATETIME2
  DECLARE @lockStatus INT
  DECLARE @lockTime INT
  SET @startTime = GETUTCDATE()

  IF EXISTS (SELECT * FROM GuidLocks WITH (TABLOCKX, HOLDLOCK) WHERE Id = @lockName)
  BEGIN
    SET @lockStatus = 0
  END
  ELSE
  BEGIN
    INSERT INTO GuidLocks VALUES (@lockName, GETUTCDATE())
    SET @lockStatus = 1
  END

  SET @lockTime = (SELECT DATEDIFF(millisecond, @startTime, GETUTCDATE()))
  SELECT @lockStatus AS Status, @lockTime AS Duration
COMMIT TRANSACTION GetLock

So I do a SELECT on the table and use TABLOCKX and HOLDLOCK so I get an exclusive lock on the complete table and hold it until the end of the transaction. Then depending on the result, I either return fail status (0), or create a new record and return (1).

However, I am getting this exception from time to time and I just don't know how it is happening:

System.Data.SqlClient.SqlException: Violation of PRIMARY KEY constraint 'PK_GuidLocks'. Cannot insert duplicate key in object 'dbo.GuidLocks'. The duplicate key value is (XXXXXXXXX). The statement has been terminated.

Any idea how this is happening? How is it possible that two threads managed to obtain an exclusive lock on the same table and tried to insert rows at the same time?

UPDATE: It looks readers might have not fully understand my question here, so I would like to elaborate: My understanding is that using TABLOCKX obtains an exclusive lock on the table. I also understood from the documentation (and I could be mistaken) that if I use the HOLDLOCK statement, then the lock will be held till the end of the transaction, which in this case, I assume (and apparently my assumption is wrong, but that's what I understood from the documentation) is the outer transaction initiated by the BEGIN TRANSACTION statement and ended by COMMIT TRANSACTION statement. So the way I understand things here is that by the time SQL Server reach the SELECT statement having the TABLOCKX and HOLDLOCK, it will try to obtain an exclusive lock on the whole table, and will not release it until the execution of COMMIT TRANSACTION. If that's the case, how comes two threads seam to be trying to execute the same INSERT statement at the same time?

set your isolation level to SERIALIZABLE rather than READ COMMITTED; that is how one enforces serializability on a sequence of trransactions. — Pieter Geerkens, Jan 26 '16 at 18:59
@PieterGeerkens, but this is what I found in the documentation for HOLDLOCK: "Is equivalent to SERIALIZABLE. For more information, see SERIALIZABLE later in this topic. HOLDLOCK applies only to the table or view for which it is specified and only for the duration of the transaction defined by the statement that it is used in. HOLDLOCK cannot be used in a SELECT statement that includes the FOR BROWSE option." — Rafid, Jan 26 '16 at 21:52
Just in case you didn't know it exists, you might want to look into sp_getapplock: https://msdn.microsoft.com/en-us/library/ms189823.aspx — James Z, Jan 27 '16 at 16:42

score 2 · Answer 1 · answered Jan 26 '16 at 20:55

2

If you look up the documentation for tablock and holdlock, you'll see that it is not doing what you think it is:

Tablock: Specifies that the acquired lock is applied at the table level. The type of lock that is acquired depends on the statement being executed. For example, a SELECT statement may acquire a shared lock. By specifying TABLOCK, the shared lock is applied to the entire table instead of at the row or page level. If HOLDLOCK is also specified, the table lock is held until the end of the transaction.

So the reason that your query is not working is because you are only getting a shared lock from the table. What Frisbee is attempting to point out is that you don't need to re-implement all of the transaction isolating and locking code because there is a more natural syntax that handles this implicitly. His version is better than yours because it's much more easy to not make a mistake that introduces bugs.

More generally, when ordering statements in your query, you should place the statements requiring the more restrictive lock first.

answered Jan 26 '16 at 20:55

Alex Weitzer

181
1
12

1

You mean my answer with a current score of -1 – paparazzo Jan 26 '16 at 21:03
1

I'd guess that the score of your answer is based on the quality of the explanation, not the correctness of the answer. I attempted to provide what was missing from your answer, while crediting the work that you did do. – Alex Weitzer Jan 26 '16 at 21:06
1

@AlexWeitzer, thanks for your explanation. I have read the documentation actually, but the one you pasted is for TABLOCK not TABLOCKX. Notice in my query, I am using TABLOCKX, so it should obtain an exclusive lock on the table. This is what I found in the documentation: "Specifies that an exclusive lock is taken on the table." – Rafid Jan 26 '16 at 21:46
1

I am also specifying HOLDLOCK, which, if I understood correctly (apparently not), should hold the lock until the end of the transaction, which in this case is the outer transaction. – Rafid Jan 26 '16 at 21:47
So my understanding is by the time SQL reaches my SELECT statement, it will try to obtain an exclusive lock for the whole table, which can only be taken by one and one only. So even if I have thousands of threads executing the same query, only one should be able to obtain that lock and it won't release it until the execution of the statement `COMMIT TRANSACTION`. – Rafid Jan 26 '16 at 21:49

score 1 · Accepted Answer · edited May 23 '17 at 11:46

In my concurrent programming text many years ago, we read the parable of the blind train engineers who needed to transport trains both directions through a single track pass across the Andes only one track wide. In the first mutex model, an engineer would walk up to a synchronization bowl at the top of the pass and, if it was empty, place a pebble in to lock the pass. After driving through the pass he would remove his pebble to unlock the pass for the next train. This is the mutex model you have implemented and it doesn't work. In the parable a crach occurred soon after implementation, and sure enough there were two pebbles in the bowl - we have encountered a READ-READ-WRITE-WRTE anomaly due to the multi-threaded environment.

The parable then describes a second mutex model, where there is already a single pebble in the bowl. Each engineer walks up to the bowl and removes the pebble if one is there, placing it in his pocket while he drives through the pass. Then he restores the pebble to unlock the pass for the next train. If an engineer finds the bowl empty he keeps trying (or blocks for some length of time) until a pebble is available. This is the model that works.

You can implement this (correct) model by having (only ever) a single row in the GuidLocks table with a (by default) NULL value for the lock holder. In a suitable transaction each process UPDATES (in place) this single row with it's SPID exactly if the old value IS NULL; returning 1 if this succeeds and 0 if it fails. It again updates this column back to NULL when it releases the lock.

This will ensure that the resource being locked actually includes the row being modified, which in your case is clearly not always true.

See the answer by usr to this question for an interesting example.

I believe that you are being confused by the error message - clearly the engine is locating the row of a potential conflict before testing for the existence of a lock, resulting in a misleading error message, and that since (due to implementing model 1 above instead of model 2) the TABLOCK is being held on the resource used by the SELECT instead of the resource used by an INSERT/UPDATE, a second process is able to sneak in.

Note that, especially in the presence of support for snapshot isolation, the resource on which you have taken your TABLOCKX (the table snapshot before any inserts) does not guarantee to include the resource to which you have written the lock specifics (the table snapshot after an insert) .

Note the important truism that mutex guard clauses must always be of the form ***write and test*** rather than of the form ***test and write***. — Pieter Geerkens, Jan 26 '16 at 23:34
Thanks @PieterGeerkens. This makes complete sense now. Obviously, I was as confused as "usr" was in the question you posted. — Rafid, Jan 26 '16 at 23:35
P.S. - If you desire serializable behaviour, request SERIALIZABLE isolation level rather than attempting to mimic it with implementation-specific and version-specific equivalents. KISS - Keep It Simple Smartypants. — Pieter Geerkens, Jan 26 '16 at 23:36

score 0 · Answer 3 · answered Jan 26 '16 at 09:27

0

Use an app lock.

exec sp_getapplock @resource = @lockName, 
     @LockMode='Exclusive', 
     @LockOwner = 'Session';

Your approach is incorrect from many point of view: granularity (table lock), scope (transaction which commit), leakage (will leak locks). Session scope app locks is what you actually intend to use.

answered Jan 26 '16 at 09:27

Remus Rusanu

288,378
40
442
569

Can you elaborate on granularity, scope, and leakage please? – Rafid Jan 26 '16 at 09:40
Also, what will happen if my code fails and the application lock is not released? – Rafid Jan 26 '16 at 09:41
Session scoped applocks are released when the session (connection) is closed, which will happen when your application crashes. Unlike the table insert and commited approach, which will leak the lock. – Remus Rusanu Jan 26 '16 at 09:44

score -1 · Answer 4 · edited May 23 '17 at 11:46

-1

INSERT INTO GuidLocks 
select @lockName, GETUTCDATE()   
where not exists ( SELECT * 
                   FROM GuidLocks
                   WHERE Id = @lockName );
IF @@ROWCOUNT = 0 ...

to be safe about optimization

SELECT 1 
FROM GuidLocks

edited May 23 '17 at 11:46

Community

1
1

answered Jan 26 '16 at 10:30

paparazzo

44,497
23
105
176

Thanks @Frisbee. How is this different than mine? I mean I know you put it all in one statement, but I have `BEGIN TRANSACT` and `END TRANSACT`, so the `HOLDLOCK` should cause the lock to be kept for the whole transaction, no? – Rafid Jan 26 '16 at 14:32
You don't think one statement makes a difference here? – paparazzo Jan 26 '16 at 16:06
Did you read my comment? "I mean I know you put it all in one statement, but I have BEGIN TRANSACT and END TRANSACT, so the HOLDLOCK should cause the lock to be kept for the whole transaction, no?" – Rafid Jan 26 '16 at 18:15
Yes I read your question and comment. *Should* - it is not working. Did you try my solution? Did you read my comment? – paparazzo Jan 26 '16 at 18:21
Well, I really want to understand how things are working, rather than just get something from the internet without understanding it and without being able to maintain it later. If you read my question again, I am trying to understand why my code is not working. Finally, I am not sure what you mean by "did you read my comment", because I don't see any comment or explanation in your solution. The only comment you added when I asked you is "You don't think one statement...", and that really doesn't answer the question in my first comment. – Rafid Jan 26 '16 at 18:34
If you don't understand the benefits one statement or how/why it works then I can't help you. A single statement is a transaction. A single statement works even without holdlock. If you don't think you can maintain a single statement then that is your decision. – paparazzo Jan 26 '16 at 18:46
I believe `not exists(SELECT * FROM GuidLocks WHERE Id = @lockName)` should be `not exists(SELECT 1 FROM GuidLocks WHERE Id = @lockName)` in order to avoid a lookup in the syscolumns table. – Pieter Geerkens Jan 26 '16 at 18:55
@PieterGeerkens Would not hurt but I think the query optimizer is smart enough to bypass the syscolumns table with an exists. I was more mimicking the OP to minimize confusion but that is not going well. http://stackoverflow.com/questions/6137433/where-does-the-practice-exists-select-1-from-come-from – paparazzo Jan 26 '16 at 18:59
@Rafid Did you vote down a working solution? It is not wrong just because you don't understand it? – paparazzo Jan 26 '16 at 21:06
@Frisbee, two reasons for that. First, it didn't answer my question: "Any idea how this is happening? How is it possible that two threads managed to obtain an exclusive lock on the same table and tried to insert rows at the same time?" – Rafid Jan 26 '16 at 21:43
Second, and most importantly, the quality of your comments is very poor, and it feels like all you are trying to say is I am stupid. Comments like this: " If you don't understand the benefits ...... I can't help you" doesn't help at all. You might be one of the top SQL guys, and I respect that very much, but if you cannot explain your answer well (let alone making fun of the person asking the question, even though you answered a different question), then it doesn't fit in here. – Rafid Jan 26 '16 at 21:44
@Rafid: You are insisting on weaving a complicated path through a rather simple process, and resisting informed guidance that this is unwise and not necessarily correct. We can only lead you to water, not make you drink. – Pieter Geerkens Jan 26 '16 at 23:21
@PieterGeerkens, you might be experienced SQL developers. I am not, frankly, and not ashamed of that, and trying to learn. I am not trying to resist the solution here, I just want to understand SQL internals. The problem might be an easy one, but understanding the internals of SQL is what I am after, so I could apply it somewhere else, or when I want to debug other problems. – Rafid Jan 26 '16 at 23:26
I really fail to understand what is difficult about my question, honestly. I am just trying to understand why my SQL is not working as expected, and it happens everywhere on StackOverflow. People post some weird codes that will never ever be used in any production system, and the goal is to understand things. – Rafid Jan 26 '16 at 23:27
See for example this question: http://stackoverflow.com/questions/1642028/what-is-the-name-of-the-operator. Honestly, who would write (x --> 0) in C++? But the question got almost 5000 votes. – Rafid Jan 26 '16 at 23:29

Primary key conflict even when TABLOCKX and HOLDLOCK hints

4 Answers4