3

So I'm trying, in a single query, to only insert a row if it doesn't exist already.

My query is the following:

INSERT INTO [dbo].[users_roles] ([user_id], [role_id]) 
SELECT 29851, 1 WHERE NOT EXISTS (
  SELECT 1 FROM [dbo].[users_roles] WHERE user_id = 29851 AND role_id = 1)

Sometimes (very rarely, but still), it generates the following error:

Violation of PRIMARY KEY constraint 'PK_USERS_ROLES'. Cannot insert duplicate key in object 'dbo.users_roles'. The duplicate key value is (29851, 1).

PK_USERS_ROLES is [user_id], [role_id]. Here is the full SQL of the table's schema:

create table users_roles
(
    user_id int not null
        constraint FK_USERS_ROLES_USER
        references user,
    role_id int not null
        constraint FK_USERS_ROLES_USER_ROLE
        references user_role,
    constraint PK_USERS_ROLES
    primary key (user_id, role_id)
)

Context:

This is executed by a PHP script hosted on an Apache server, and "randomly" happens once out of hundreds of occurrences (most likely concurrency-related).

More info:

  • SELECT @@VERSION gives:

Microsoft SQL Server 2008 R2 (SP2) - 10.50.4000.0 (X64) Jun 28 2012 08:36:30 Copyright (c) Microsoft Corporation Enterprise Edition (64-bit) on Windows NT 6.1 (Build 7601: Service Pack)

  • SQL Server version: SQL Server 2008 R2

  • Transaction Isolation level: ReadCommitted

  • This is executed within an explicit transaction (through PHP statements, but I figure the end result is the same)

Questions:

  • Could someone explain why/how this is happening?

  • What would be an efficient way to safely insert in one go (in other words, in a single query)? I've seen other answers such as this one but the solutions are meant for stored procedures.

Thanks.

Jeto
  • 14,596
  • 2
  • 32
  • 46
  • 2
    Well, is the primary key on just `user_id`? If so, maybe they are also a member of a role other than `1`. – Aaron Bertrand Aug 27 '18 at 12:55
  • 2
    @AaronBertrand Sorry, forgot to specify `PK_USER_ROLES`, I've just editer the question. It's composed of `[user_id], [role_id]`. – Jeto Aug 27 '18 at 12:56
  • 3
    Well, [I can't reproduce this problem](https://i.stack.imgur.com/oyCWU.png), so there must be something you're not telling us, or perhaps two different threads of the application are submitting the same query at the exact same time. What isolation level are you using? Why are you so opposed to stored procedures? They can actually help solve concurrency issues. – Aaron Bertrand Aug 27 '18 at 13:02
  • 1
    I would go for the MERGE statement, there's an example in [that SO post](https://stackoverflow.com/a/34892729/4003419). It's like an upsert without the 'up' and only the 'sert'. – LukStorms Aug 27 '18 at 13:03
  • 2
    1) same insert is being executed twice (from the same connection) 2) there are more than one connection doing the same insert – Ivan Starostin Aug 27 '18 at 13:04
  • 1
    @AaronBertrand This literally happens once out of hundreds of times, probably in a scenario where multiple requests are made at the same time. Those queries are executed by a PHP web script btw (I will also edit that into the question). – Jeto Aug 27 '18 at 13:04
  • @LukStorms `MERGE` seems like overkill for this and still won't solve the issue. Also see https://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/ – Aaron Bertrand Aug 27 '18 at 13:05
  • Such errors are regular scenarios for multiple connections running. You must handle them in your code and report to user/do something else. – Ivan Starostin Aug 27 '18 at 13:05
  • @Aaron Betrand And, nothing against stored procedures. I'm just stunned that this query can generate such an SQL error at all. If using a stored procedure was a way out, I wouldn't mind using them. I'd ideally like to understand the technical implications tho. – Jeto Aug 27 '18 at 13:06
  • @are you running query as mentioned or using dynamic query? – maulik kansara Aug 27 '18 at 13:07
  • @AaronBertrand Oh wow... I would think merge would be less buggy by now. Thanks for the insight. – LukStorms Aug 27 '18 at 13:09
  • @maulikkansara This is the exact query that is sent to the SQL Server db, many times a day. – Jeto Aug 27 '18 at 13:09
  • `Transaction Isolation level: Serializable` - how did you get this? @Jeto – Ivan Starostin Aug 27 '18 at 16:54
  • Do you have a serializable transaction that contains more than just *this statement*? – Aaron Bertrand Aug 27 '18 at 19:21
  • 1
    @IvanStarostin Actually, I asked our DB admin (who is also reading this and investigating on his own). It seems like if I run the [command posted here](https://stackoverflow.com/a/1038184/965834), I get `ReadCommitted` though. I'll edit the post and make 100% sure it's right (right context and all), sorry for the confusion. – Jeto Aug 28 '18 at 08:16
  • @AaronBertrand Yes, other `SELECT`s are made on the same (explicit) transaction (one of them on table `users_roles`). The `INSERT` I posted is the only one that isn't a SELECT tho. Let me know if you'd like more details, I'll edit the post then. – Jeto Aug 28 '18 at 08:19
  • Yes show the whole transaction, this is likely what is affecting your concurrency. – Aaron Bertrand Aug 28 '18 at 12:35

2 Answers2

1

Is this table truncated or the rows deleted in some moment? And how often? It makes sense to me that the rows should not be found in some moment, as you're running "insert if not exists", and in this moment two or more queries may hit the database to insert the same data... only one will... the other should do nothing if the row was inserted before its "not exists" look up, or fail if the row was inserted after the look up.

I have only an Oracle database right now to do some tests and I can reproduce this problem. My commit mode is explicit:

  • Create the empty table, the unique constraint and grant select, insert to another user:

    CREATE TABLE just_a_test (val NUMBER(3,0));
    
    ALTER TABLE just_a_test ADD CONSTRAINT foobar UNIQUE (val);
    
    GRANT SELECT, INSERT ON just_a_test TO user2;
    
  • DB session on user1:

    INSERT INTO just_a_test 
    SELECT 10 
    FROM DUAL 
    WHERE NOT EXISTS 
    (
      SELECT 1 
      FROM just_a_test 
      WHERE val = 10
    )
    ;
    
    -- no commit yet...
    
  • DB session on user2:

    INSERT INTO user1.just_a_test 
    SELECT 10 
    FROM DUAL 
    WHERE NOT EXISTS 
    (
      SELECT 1 
      FROM user1.just_a_test 
      WHERE val = 10
    )
    ;
    
    
    -- no commit yet, the db just hangs til the other session commit...
    

So I commit the first transaction, inserting the row, and then I get the following error on user2 session:

"unique constraint violated"
*Cause:    An UPDATE or INSERT statement attempted to insert a duplicate key.
           For Trusted Oracle configured in DBMS MAC mode, you may see
           this message if a duplicate entry exists at a different level.

Now I rollback the second transaction and run again the same insert on user2 and now I get the following output:

0 rows inserted.

Probably your scenario is just like this one. Hope it helps.

EDIT

I'm sorry. You asked two questions and I answered only the Could someone explain why/how this is happening? one. So I missed What would be an efficient way to safely insert in one go (in other words, in a single query)?.

What exactly means "safely" for you? Let's say you're running an INSERT/SELECT of lots of rows and just one of them is duplicated compared to the stored ones. For your "safety" level you should ignore all rows being inserted or ignore only the duplicated, storing the others?

Again, I don't have a SQL Server right now to give it a try, but looks like you can tell SQL Server whether to deny all rows being inserted in case of any dup or deny only the dups, keeping the others. The same is valid for an insert of a single row... if it's a dup, throw an error... or just ignore it in the other hand.

The syntax should look like this to ignore dup rows only and throw no errors:

    ALTER TABLE [TableName] REBUILD WITH (IGNORE_DUP_KEY = ON);

By default this option is OFF, which means SQL Server throws an error and discards non dup rows being inserted as well.

This way you would keep your INSERT/SELECT syntax, which looks good imo.

Hope it helps.

Sources:

https://learn.microsoft.com/en-us/sql/t-sql/statements/alter-table-index-option-transact-sql?view=sql-server-2008

https://stackoverflow.com/a/11207687/1977836

Felypp Oliveira
  • 2,147
  • 1
  • 17
  • 17
1

It might help to be explicit about this. The below runs this in an explicit transaction, locks the row explicitly.

DECLARE @user_id INT; SET @user_id=29851;
DECLARE @role_id INT; SET @role_id=1;

BEGIN TRY
    BEGIN TRANSACTION;

    DECLARE @exists INT;
    SELECT @exists=1 
    FROM [dbo].[users_roles] WITH(ROWLOCK,HOLDLOCK,XLOCK)
    WHERE user_id=@user_id AND role_id=@role_id;

    IF @exists IS NULL
    BEGIN
        INSERT INTO [dbo].[users_roles] ([user_id], [role_id])
        VALUES(@user_id,@role_id);
    END

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    ROLLBACK TRANSACTION;
END CATCH
TT.
  • 15,774
  • 6
  • 47
  • 88
  • Thanks, we'll give this a try. Btw, seems like you deleted your above comment, I've tried answering them through an edit. Hard to figure which piece of info is relevant and which is not, sorry about that. – Jeto Aug 27 '18 at 16:21
  • @Jeto I did delete that comment as you already updated your answer. The comment then becomes irrelevant. Note that with the version you have, you seem to be missing SP3 (see https://sqlserverbuilds.blogspot.com). – TT. Aug 27 '18 at 16:24
  • Sure, just letting you know in case you missed it. As for SP3 update, I'm not responsible for the DB server/platform itself but I'll see if we can get that. – Jeto Aug 27 '18 at 16:25
  • @Jeto Did you gain any insights so far? – TT. Aug 30 '18 at 11:25
  • Unfortunately not yet. The DB server's administrator was busy all week and I'll be away from office for a few weeks, so won't be able to get back to it until then. Thank you for your help anyway (upvote was from me). – Jeto Aug 31 '18 at 10:36