4

There is a lot of information that I could find on SQL Merge, but I can't seem to get this working for me. Here's what's happening.

Each day I'll be getting an Excel file uploaded to a web server with a few thousand records, each record containing 180 columns. These records contain both new information which would have to use INSERT, and updated information which will have to use UPDATE. To get the information to the database, I'm using C# to do a Bulk Copy to a temp SQL 2008 table. My plan was to then perform a Merge to get the information into the live table. The temp table doesn't have a Primary Key set, but the live table does. In the end, this is how my Merge statement would look:

MERGE Table1 WITH (HOLDLOCK) AS t1
USING (SELECT * FROM Table2) AS t2
ON t1.id = t2.id
WHEN MATCHED THEN
    UPDATE SET (t1.col1=t2.col1,t1.col2=t2.col2,...t1.colx=t2.colx)
WHEN NOT MATCHED BY TARGET THEN
    INSERT (col1,col2,...colx)
    VALUES(t2.col1,t2.col2,...t2.colx);

Even when including the HOLDLOCK, I still get the error Cannot insert duplicate key in object. From what I've read online, HOLDLOCK should allow SQL to read primary keys, but not perform any insert or update until after the task has been executed. I'm basically learning how to use MERGE on the fly, but is there something I have to enable for SQL 2008 to pick up on MERGE Locks?

Swicky
  • 73
  • 2
  • 8
  • 2
    My suggestion: just do it the old way. MERGE may seem nice but there are a lot of issues with it. http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/ – Aaron Bertrand Nov 18 '13 at 14:33
  • What is the definition of the table? Does it have more than one unique constraint? Also does the source have duplicates? – Martin Smith Nov 18 '13 at 14:34
  • 3
    SQL statements apply all row changes effectively "in parallel" - if there are two rows with the same PK value in the source table, and no row with that PK value in the target, the `MERGE` will attempt to insert *both* rows and then fail. – Damien_The_Unbeliever Nov 18 '13 at 14:37
  • @AaronBertrand I was afraid of that. I'll give it a go later and post the outcome. – Swicky Nov 18 '13 at 15:17
  • @Damien_The_Unbeliever Please let me know if I'm wrong here, but I thought HOLDLOCK was supposed to stop inserts on records with a duplicate primary key and update at the end of the query. – Swicky Nov 18 '13 at 15:20
  • 1
    No you're wrong. `HOLDLOCK` here just prevents rows being inserted in a *different* transaction between the check that the row does not exist and the attempt to insert. It does nothing to de-duplicate any repeated keys in the source. Which row do you want inserted in the event there are multiple with the same `t2.id`? – Martin Smith Nov 18 '13 at 15:50
  • Another option is to remove duplicates from the source table before MERGEing: http://stackoverflow.com/questions/18932/how-can-i-remove-duplicate-rows. – Mike Sep 22 '14 at 13:27

1 Answers1

2

I found a way around the problem and wanted to post the answer here, in case it helps anyone else. It looks like MERGE wouldn't work for what I needed since the temporary table being used had duplicate records that would be used as a Primary Key in the live table. The solution I came up with was to create the below stored procedure.

    -- Start with insert
    INSERT INTO LiveTable(A, B, C, D, id)
    (
    -- Filter rows to get unique id
    SELECT A, B, C, D, id FROM(
        SELECT A, B, C, D, id,
                ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS row_number
        FROM TempTable
        WHERE NOT EXISTS(
    SELECT id FROM LiveTable WHERE LiveTable.id = TempTable.id)
        ) AS ROWS
    WHERE row_number = 1
    )

-- Continue with Update
-- Covers skipped id's during insert
UPDATE tb_TestMLS
SET
    LiveTable.A = T.A,
    LiveTable.B = T.B,
    LiveTable.C = T.C,
    LiveTable.D = T.D
FROM LiveTable L
INNER JOIN
    TempTable T
ON
    L.id= T.id
Swicky
  • 73
  • 2
  • 8