1

I'm experiencing deadlocks from my SQL statement in which I want to select an ID if it exists, else insert and then select it. I'm using double checked locking to prevent locking overhead, as suggested here.

Obviously I'm doing this to support concurrent inserts, and I'm running multiple threads. My SQL know-how is very low, so I might have missed something basic about locking? Here's my procedure:

CREATE PROCEDURE InsertAndOrSelectZipCity
@PostalDistrict nvarchar(25),
@CityName nvarchar(34),
@MunicipalityId smallint,
@ZipCode smallint
AS

DECLARE @id AS INT
SELECT @id = ZipCityId FROM ZipCity (NOLOCK) WHERE MunicipalityID=@MunicipalityId AND ZipCode=@ZipCode
IF @id IS NULL
BEGIN
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
    BEGIN TRANSACTION
        SELECT @id = ZipCityId FROM ZipCity WHERE MunicipalityID=@MunicipalityId AND ZipCode=@ZipCode
        IF @id IS NULL
        BEGIN
           INSERT INTO ZipCity (PostalDistrict, CityName, MunicipalityId, ZipCode) VALUES (@PostalDistrict, @CityName, @MunicipalityId, @ZipCode)
           SELECT @id = SCOPE_IDENTITY()
        END
    COMMIT TRANSACTION
END
SELECT @id

UPDATE

This is fixed by using the appropriate locks (XLOCK, ROWLOCK, HOLDLOCK) on the Select statement inside the transation.

Below is the procedure written using the MERGE statement instead, no transactions needed:

DECLARE @id as INT

MERGE INTO ZipCity WITH (TABLOCK) AS Target
USING (SELECT @PostalDistrict, @CityName, @MunicipalityId, @ZipCode) AS Source (PostalDistrict, CityName, MunicipalityId, ZipCode)
ON Target.MunicipalityId = Source.MunicipalityId AND Target.ZipCode = Source.ZipCode
WHEN MATCHED THEN
    UPDATE SET @id = Target.ZipCityId
WHEN NOT MATCHED THEN
    INSERT (PostalDistrict, CityName, MunicipalityId, ZipCode) VALUES (@PostalDistrict, @CityName, @MunicipalityId, @ZipCode)
OUTPUT INSERTED.ZipCityId;
Community
  • 1
  • 1
Jeppebm
  • 389
  • 1
  • 3
  • 16

2 Answers2

2
    SELECT @id = ZipCityId 
    FROM ZipCity 
    WHERE MunicipalityID=@MunicipalityId 
      AND ZipCode=@ZipCode

Here your select is acquiring an S-lock. This can happen with multiple threads. Later, the insert tries to X-lock which is a deadlock.

Acquire the X-lock straight away:

    SELECT @id = ZipCityId 
    FROM ZipCity WITH (XLOCK, ROWLOCK, HOLDLOCK) ...

Here, ROWLOCK, HOLDLOCK are not strictly required but the pattern XLOCK, ROWLOCK, HOLDLOCK is pretty standard and I try to follow it everywhere for consistency.

Btw, you might want to switch to a MERGE statement. I think it will acquire U-locks automatically so no locking hints are required for it. Not sure about that, though. In any case it would be a code improvement as well as a performance improvement.

Hawk
  • 5,060
  • 12
  • 49
  • 74
usr
  • 168,620
  • 35
  • 240
  • 369
  • That did the trick - thank you! For the MERGE suggestion, wouldn't it be overhead to have to use the UPDATE statement if my data is static? E.g. each ZipCity row is always the same. – Jeppebm Nov 28 '13 at 11:02
  • 1
    MERGE can contain any combination of clauses you want. You can have just a WHEN NOT MATCHED THEN INSERT and nothing else. Merge is a superset of all other DML statements. – usr Nov 28 '13 at 11:05
  • 1
    Brilliant. Using MERGE, would I then just have to SELECT (NOLOCK) ZipCityID afterwards, to achieve the same output and functionality? – Jeppebm Nov 28 '13 at 11:21
  • You use the OUTPUT clause: `OUTPUT INSERTED.ZipCityId`. That emits a result set for you. You could also stream the results into a table variable or temp table. MERGE is the total solution for problems like this one. SCOPE_IDENTITY not needed (because merge can emit multiple IDs at once a single value cannot possibly suffice). – usr Nov 28 '13 at 11:22
  • Wouldn't OUTPUT only allow me to get the ID when something is inserted, and not when the row already exists? MERGE definitely seems the way to go. – Jeppebm Nov 28 '13 at 11:25
  • Hm that's right. Because the write part of your double-checked locking is executed infrequently you could indeed just do a dummy UPDATE to cause the row to be emitted. You can even find out which case was hit using `OUTPUT $ACTION`. – usr Nov 28 '13 at 11:30
  • 1
    I just tested that you can extract the existing ID using this: `WHEN MATCHED THEN UPDATE SET @id = target.ID` where `@id` is some declared variable. – usr Nov 28 '13 at 11:36
  • Alright, that seems to work. I had to OUTPUT into a declared table, and update a declared scalar as suggested. I don't know how to do this using only one declared scalar/table, so I'm using a an IF ELSE clause at the end to SELECT from the scalar if not null, otherwise from the table. Is this optimal? – Jeppebm Nov 28 '13 at 12:09
  • If you do the update-to-scalar variable trick you can just use the OUTPUT clause to directly emit the result set to the client because the dummy update will emit a row. The scalar variable would be unused - just a dummy to force an update. – usr Nov 28 '13 at 12:17
  • If you really want good performance you should probably rewrite the proc to support many zip codes at once using table-valued-parameters. This is of course only doable if it fits into the application architecture. That way I can easily see you can 10x or more in throughput if you process hundreds of rows at once. – usr Nov 28 '13 at 12:19
0

Well, it is hard to reproduce your scenario but it does look interesting.

Try using ROWLOCK i.e. doing something like this

SELECT @id = ZipCityId FROM ZipCity WITH(ROWLOCK) WHERE MunicipalityID=@MunicipalityId AND ZipCode=@ZipCode

and see if it helps (I hope it does).

Also, you might want to check this article and see if it's relevant for your scenario. Seems to me it is.

http://support.microsoft.com/kb/323630

peter.petrov
  • 38,363
  • 16
  • 94
  • 159
  • Still getting deadlocks using only rowlock. usr's answer using (XLOCK, ROWLOCK, HOLDLOCK) seems to work. Thank you for the link. – Jeppebm Nov 28 '13 at 11:07