Preventing Conditional INSERT/UPDATE Race Condition in MS-SQL

Question

I wonder that do i follow correct approach and need your help to figure out

Here my non-protected query

DECLARE @cl_WordId bigint = NULL
SELECT
  @cl_WordId = cl_WordId
FROM tblWords
WHERE cl_Word = @cl_Word
AND cl_WordLangCode = @cl_WordLangCode
IF (@cl_WordId IS NULL)
BEGIN
  INSERT INTO tblWords (cl_Word, cl_WordLangCode, cl_SourceId)
    VALUES (@cl_Word, @cl_WordLangCode, @cl_SourceId)
  SET @cl_WordId = SCOPE_IDENTITY()
  SELECT
    @cl_WordId
END
ELSE
BEGIN
  SELECT
    @cl_WordId
END

And to protect it, i modify it as below

DECLARE @cl_WordId bigint = NULL
SELECT
  @cl_WordId = cl_WordId
FROM tblWords WITH (HOLDLOCK)
WHERE cl_Word = @cl_Word
AND cl_WordLangCode = @cl_WordLangCode
BEGIN
  IF (@cl_WordId IS NULL)
  BEGIN
    INSERT INTO tblWords (cl_Word, cl_WordLangCode, cl_SourceId)
      VALUES (@cl_Word, @cl_WordLangCode, @cl_SourceId)
    SET @cl_WordId = SCOPE_IDENTITY()
    SELECT
      @cl_WordId
  END
  ELSE
  BEGIN
    SELECT
      @cl_WordId
  END
END

So i have added WITH (HOLDLOCK) to the select query and added begin and end to the select query

Is this approach correct to prevent Conditional INSERT/UPDATE Race Condition

I'm guessing isolation mode serializable is the default for SQL Server so unless you need table locks go with first solution. Is there a problem? This may also be of interest http://stackoverflow.com/questions/1197733/does-sql-server-offer-anything-like-mysqls-on-duplicate-key-update — McMurphy, May 09 '17 at 11:59
@McMurphy in first mode i am getting errors sometimes. because another thread being already inserted :D — Furkan Gözükara, May 09 '17 at 12:02
Couldn't hurt to have explicit transactions and commits: - https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql — McMurphy, May 09 '17 at 12:14
@McMurphy dont you think WITH (HOLDLOCK) would solve the problem? — Furkan Gözükara, May 09 '17 at 12:22
Read the articles linked in [this comment](http://stackoverflow.com/questions/38497259/what-is-the-best-practice-for-inserting-a-record-if-it-doesnt-already-exist#comment64601531_38497259) — Zohar Peled, May 09 '17 at 12:26
@MonsterMMORPG It might -- but I've been writing SQL for 3 years and I've never once needed to use `WITH (HOLDLOCK)`. In my experience, using transactions is the more 'canonical' solution — user1935361, May 09 '17 at 12:29
@user1935361 merge is still vulnerable to duplicate records : http://weblogs.sqlteam.com/dang/archive/2007/10/28/Conditional-INSERTUPDATE-Race-Condition.aspx — Furkan Gözükara, May 09 '17 at 12:35
@MonsterMMORPG Wow that's news to me...edited my comment for accuracy. I never liked the syntax anyway :) — user1935361, May 09 '17 at 12:37
Yes. Use a transaction. Do you have a reason for not using a transaction? — Jonny, May 09 '17 at 14:15

score 3 · Accepted Answer · edited Jul 29 '20 at 18:26

As alluded to in the articles I posted to your last question (Conditional INSERT/UPDATE Race Condition and “UPSERT” Race Condition With MERGE) using MERGE along with HOLDLOCK is thread safe, so your query would be:

MERGE tblWords WITH (HOLDLOCK) AS w
USING (VALUES (@cl_Word, @cl_WordLangCode, @cl_SourceId)) AS s (cl_Word, cl_WordLangCode, cl_SourceId)
    ON s.cl_Word = w.cl_Word
    AND s.cl_WordLangCode = w.cl_WordLangCode
WHEN NOT MATCHED THEN 
    INSERT (cl_Word, cl_WordLangCode, cl_SourceId)
    VALUES (s.cl_Word, s.cl_WordLangCode, s.cl_SourceId);

It also looks like this might be a stored procedure and you are using SELECT @cl_WordId to return the ID to the caller. This falls under one of Aaron Bertrand's bad habits to kick, instead you should use an output parameter, something like:

CREATE PROCEDURE dbo.SaveCLWord
        @cl_Word            VARCHAR(255), 
        @cl_WordLangCode    VARCHAR(255), 
        @cl_SourceId        INT,
        @cl_WordId          INT OUTPUT
AS
BEGIN

    MERGE tblWords WITH (HOLDLOCK) AS w
    USING (VALUES (@cl_Word, @cl_WordLangCode, @cl_SourceId)) AS s (cl_Word, cl_WordLangCode, cl_SourceId)
        ON s.cl_Word = w.cl_Word
        AND s.cl_WordLangCode = w.cl_WordLangCode
    WHEN NOT MATCHED THEN 
        INSERT (cl_Word, cl_WordLangCode, cl_SourceId)
        VALUES (s.cl_Word, s.cl_WordLangCode, s.cl_SourceId);

    SELECT  @cl_WordId = w.cl_WordId
    FROM    tblWords AS w
    WHERE   s.cl_Word = @cl_Word
    AND     s.cl_WordLangCode = @cl_WordLangCode;

END

ADDEDNUM

You can do this without MERGE as follows.

BEGIN TRAN

INSERT tblWords (cl_Word, cl_WordLangCode, cl_SourceId)
SELECT  @cl_Word, @cl_WordLangCode, @cl_SourceId
WHERE   NOT EXISTS
        (   SELECT  1
            FROM    tblWords WITH (UPDLOCK, HOLDLOCK)
            WHERE   cl_Word = @cl_Word
            AND     l_WordLangCode = @cl_WordLangCode
        );

COMMIT TRAN;

SELECT  @cl_WordId = w.cl_WordId
FROM    tblWords AS w
WHERE   s.cl_Word = @cl_Word
AND     s.cl_WordLangCode = @cl_WordLangCode;

If you are not using merge because you are concerned about its bugs, or because in this case you don't actually do an UPDATE, so MERGE is overkill and an INSERT will suffice, then that is fair enough. But not using it because it is unfamiliar syntax is not the best reason, take the time to read about it, learn more, and add another string to your SQL bow.

EDIT

From online docs

HOLDLOCK

Is equivalent to SERIALIZABLE. For more information, see SERIALIZABLE later in this topic. HOLDLOCK applies only to the table or view for which it is specified and only for the duration of the transaction defined by the statement that it is used in. HOLDLOCK cannot be used in a SELECT statement that includes the FOR BROWSE option.

So in your query, you have 6 statements:

-- STATETMENT 1
DECLARE @cl_WordId bigint = NULL

--STATEMENT 2
SELECT
  @cl_WordId = cl_WordId
FROM tblWords WITH (HOLDLOCK)
WHERE cl_Word = @cl_Word
AND cl_WordLangCode = @cl_WordLangCode

BEGIN

--STATEMENT 3
  IF (@cl_WordId IS NULL)
  BEGIN

    -- STATEMENT 4
    INSERT INTO tblWords (cl_Word, cl_WordLangCode, cl_SourceId)
      VALUES (@cl_Word, @cl_WordLangCode, @cl_SourceId)
    SET @cl_WordId = SCOPE_IDENTITY()

    --STATEMENT 5
    SELECT
      @cl_WordId
  END
  ELSE
  BEGIN

    -- STATEMENT 6
    SELECT
      @cl_WordId
  END
END

Since you don't have explicit transactions, each statement runs within its own implicit transaction, so concentrating on statement 2, this is equivalent to:

BEGIN TRAN

SELECT
  @cl_WordId = cl_WordId
FROM tblWords WITH (HOLDLOCK)
WHERE cl_Word = @cl_Word
AND cl_WordLangCode = @cl_WordLangCode

COMMIT TRAN

Therefore, since HOLDLOCK applies for the duration of the transaction in which it is used, the lock is released, the lock is released as soon as this code finishes, so by the time you have progressed to statement 3 and 4 another thread could have inserted to the table.

merge syntax is very different for me. so i prefer to not use if possible. do my protected version work fine? also any performance difference between merge and my version? by the way it is not stored procedure but parameterized query — Furkan Gözükara, May 09 '17 at 22:12
what is the logic here of using BEGIN TRAN ? what happens if we remove explicit transaction syntax ? — Furkan Gözükara, May 10 '17 at 19:51
I don't think there is actually an issue with removing the explicit transactions when doing this in a single statement (`INSERT .. WHERE NOT EXISTS..`). However, if you want to use multiple statements `IF NOT EXISTS ()... BEGIN .. END` then it is required to ensure the lock taken in the `IF EXISTS` is held until the transaction is committed (which is what `HOLDLOCK` does) — GarethD, May 11 '17 at 08:42
i see. that is why the second method in my question still caused errors right? — Furkan Gözükara, May 11 '17 at 10:01
Yes exactly, as soon as the assignment `@cl_WordId = cl_WordId` was complete, the lock would have been released, allowing an insert in the short gap between the variable assignment and the insert. This method would probably work with explicit transactions surrounding the entire batch. My preference would still be to do the upsert as a single statement though. — GarethD, May 11 '17 at 10:05
ty very much for answers. yes i am using now your ADDEDNUM method and so far no errors. but i need to understand the mechanic that lock is removed after @cl_WordId = cl_WordId completed. how does setting explicit transactions preventing that lock to be removed? and how do you know when lock is removed? — Furkan Gözükara, May 11 '17 at 11:56
I have extended the answer to explain how `HOLDLOCK` works. If this still isn't clear, then I would suggest you read some articles on isolation levels and locking. There is far more to learn that I could ever write in an answer on Stackoverflow. — GarethD, May 11 '17 at 12:32
I see. So when i define explicit transaction in the beginning, whole thing runs in a single transaction. Thus when holdlock is applied, it continue to hold until entire transaction ends right? also how do you split each statement ? by experience? — Furkan Gözükara, May 11 '17 at 15:17
yes that is right. Yes by experience, [this answer](http://stackoverflow.com/a/4735902/1048425) may be of use to you. — GarethD, May 11 '17 at 15:24

Preventing Conditional INSERT/UPDATE Race Condition in MS-SQL

1 Answers1

Linked