0

We have a rule engine that is categorizing articles. Currently there is an issue, where we get the following error:

Cannot insert duplicate key row in object 'dbo.article_category' with unique index 'IX_article_category_no_duplicates'. The duplicate key value is (123, 456). The statement has been terminated.

The creation of the entry in article_category happens in a stored procedure. The main part of the procedure is this:

MERGE article_category AS [target]
USING (
    SELECT articleId, @categoryId, @creator, @now, @ruleId, 2 
    FROM @articleIdList
) AS [source] (articleId, categoryId, creator, createDate, ruleId, assignmentTypeId)
    ON (
        target.articleId = source.articleId 
        AND target.categoryId= source.categoryId
    )
WHEN NOT MATCHED THEN
    INSERT (articleId, categoryId, creator, createDate, ruleId, assignmentTypeId)
    VALUES (source.articleId, source.categoryId, source.creator, source.createDate, source.ruleId, source.assignmentTypeId);

When two rules try to insert the same category for the same article, this exception above happens.

How is it possible to stop this from happening? I thought that using the merge statement instead of if not exists (...) should prevent this from happening?

Are there other possibilities to prevent this exception?

Thanks in advance

xeraphim
  • 4,375
  • 9
  • 54
  • 102
  • 2
    Does `@articleIdList` have any rows which would cause the duplication? The merge clause is only checked once for the dataset, not one row at a time. – Thom A Feb 19 '19 at 12:03
  • Not sure if it's a duplicate but Aaron Bertrand's [answer to this post](https://stackoverflow.com/a/52780490/3094533) can help you understand and prevent the race-condition that happens when two inserts of the same key are executed simultaneously. – Zohar Peled Feb 19 '19 at 12:06
  • @Larnu the `@articleIdList` is unique per call of the stored procedure – xeraphim Feb 19 '19 at 14:02

3 Answers3

1

Independent from merge, if the data you are inserting/updating violates unique constraint, you can have this error on both update or insert.

the merge ... using does not handle Unique Index Constraints. Merge makes insertion if it does not match with the condition held on using statement, otherwise make update operation.

In your case,

target.articleId = source.articleId 
    AND target.categoryId= source.categoryId

does not guarantee uniqueness thus you are getting unique index contraint expception.

SMor
  • 2,830
  • 4
  • 11
  • 14
Derviş Kayımbaşıoğlu
  • 28,492
  • 4
  • 50
  • 72
0

Yes as you answered the not exists clause will help instead of merge, plus modify the stored proc in a way that any given time only one condition will have access to insert the row.. This is a classy case of dead lock

Krunal Barot
  • 914
  • 6
  • 17
0

You need to use a SOURCE that doesn't have duplicates by your UNIQUE columns. You can use a ROW_NUMBER() with a PARTITION BY <unique column> and a posterior filter to remove potentially duplicates before the actual insert:

;WITH DuplicateRanking AS
(
    SELECT 
        articleId = articleId, 
        categoryId = @categoryId, 
        creator = @creator, 
        createDate = @now, 
        ruleId = @ruleId, 
        assignmentTypeId = 2 ,
        DuplicateRanking = ROW_NUMBER() OVER (
            PARTITION BY 
                articleId       -- Your unique columns here
            ORDER BY 
                (SELECT NULL))  -- Your desired order here (will determine which row gets inserted)
    FROM 
        @articleIdList  
)
MERGE article_category AS [target]
USING DuplicateRanking AS [source]
    ON (
        target.articleId = source.articleId AND 
        target.categoryId= source.categoryId
    )
WHEN NOT MATCHED AND [source].DuplicateRanking = 1 -- Only insert 1 row per unique column set 
    THEN
    INSERT (articleId, categoryId, creator, createDate, ruleId, assignmentTypeId)
    VALUES (source.articleId, source.categoryId, source.creator, source.createDate, source.ruleId, source.assignmentTypeId);
EzLo
  • 13,780
  • 10
  • 33
  • 38
  • @xeraphim it's not possible to insert duplicates with this code unless you aren't correctly declaring the PARTITION BY column/values or there's another clause of the merge that does an update. Is your unique constraint on columns articleId and categoryId? – EzLo Feb 19 '19 at 14:32
  • yes it's on articleId and categoryId. Can the problem not be that this procedure is executed multiple times in milliseconds? – xeraphim Feb 19 '19 at 14:56
  • The merge ensures that the comparison/join is done on the same transaction and scope as the operations it will do. If its failing due to other records then there must be records already inserted with these values even if this insert fails. Do you have other operations on the same transaction? Are you sure that this merge is causing the error? – EzLo Feb 19 '19 at 14:59