4

I currently have a MySQL server, but I am currently migrating to MS SQL Server.

I am trying to insert a lot (2000) or records per batch in a table in a C# application. I want to replicate MySQL's "REPLACE INTO" (UPSERT), where if the record exists, I want to UPDATE it and if it doesn't, I want to INSERT it. I currently have a statement like this:

REPLACE INTO tablename (keycol, col1, col2)
VALUES ('A','B','C'),
       ('C','D','E'),
       ('F','G','H'),
       ('I','J','K');

Some of the records will exist and others won't. Also, this statement will run on many different tables and the values are built at runtime (by looping over each field dynamically), so the structure of the table is not known at compile time.

I doubt I can efficiently use MERGE because of the fact that some of the records in the batch will exist and some won't and that since I don't know the structure of the table before hand and the fact that I am updating batches of 2000 records, I don't want to use subqueries for performance reasons.

Any help on the best solution would be greatly appreciated.

Matt Summersgill
  • 4,054
  • 18
  • 47
Vincent L
  • 699
  • 2
  • 11
  • 25
  • How do you check their existance (unique ID?). `MERGE` is meant to do exactly this... 2000 records is not *a lot*... You might `BULK` write the whole lot into a staging table and use a `UPDATE... WHERE EXISTS...` and `INSERT ... WHERE NOT EXISTS...` for all at once. But - to be honest - `MERGE` does roughly the same... – Shnugo Feb 27 '17 at 21:13
  • Basically, my C# app builds a query in a string where I can append the VALUES as such: REPLACE INTO tablename VALUES (A,B,C), (C,D,E), (F,G,H), (I,J,K) If I use MERGE, it would look like this: MERGE tablename WHEN MATCHED THEN UPDATE .... WHEN NOT MATCHED THEN INSERT (A,B,C), (C,D,E), (F,G,H), (I,J,K) I don't know how I would finish the UPDATE clause to overwrite every column of this record that matches the primary key. – Vincent L Feb 27 '17 at 21:40
  • 1
    As said before: `MERGE` was designed for exactly this kind of job. In most cases one would rather use a `SELECT` against some kind of table rather than a list of `VALUE` tupels, but this is not the point. Read about `BULK INSERT`, which is working almost instantly and then use `MERGE` to perform your `UPSERT`. – Shnugo Feb 27 '17 at 22:09
  • Isn't BULK INSERT to load data from a file? This isn't what I am doing at the moment. What I am trying to do is read records from a table A, modify some fields and put that data in table B. To put into context, table A has a timestamp and table B has a moving average at different time intervals than the ones present in table A (interpolation). Table A is updated daily. But since table B is a moving average, when I run it again the following day, the last few records might need to be updated since I now have new data in table A. So the primary key will exist and I want to ovrwrite those records – Vincent L Feb 27 '17 at 22:27
  • If you can test against a unique (primary) key `MERGE` should be the right choice... – Shnugo Feb 27 '17 at 22:55
  • Can you enlighten me on the syntax? I don't want to have MERGE tablename WHEN MATCHED THEN UPDATE (2000 records) WHEN NOT MATCHED THE INSERT (*same* 2000 records) – Vincent L Feb 28 '17 at 14:38
  • You do not need to repeat this statement for each record over and over. This should be one single call... There are tons of examples around. If you need help please read [How to ask a good SQL question](http://meta.stackoverflow.com/questions/271055/tips-for-asking-a-good-structured-query-language-sql-question/271056) and [How to create a MCVE](http://stackoverflow.com/help/mcve). I'd need a (reduced!) sample with expected output... – Shnugo Feb 28 '17 at 15:01
  • First of all, `REPLACE` is actually a `DELETE` based on the unique key(s) followed by an `INSERT`. `REPLACE` is not an UpSert, and is virtually useless. – Rick James Jul 15 '19 at 22:25
  • I guess question is not clear to anyone.`values are built at runtime,table structure is not known.` Can you show few proc paramter sample or c# string that you will send to proc.Main problem is here. – KumarHarsh Jul 17 '19 at 11:18

3 Answers3

13

MERGE was specifically designed for such scenarios:

CREATE TABLE tablename(keycol CHAR(1) PRIMARY KEY, col1 CHAR(1), col2 CHAR(1));
INSERT INTO tablename(keycol, col1, col2) VALUES('A', 'X', 'X'); -- to be updated
SELECT * FROM tablename;

MERGE tablename trg
USING (VALUES ('A','B','C'),
              ('C','D','E'),
              ('F','G','H'),
              ('I','J','K')) src(keycol, col1, col2)
  ON trg.keycol = src.keycol
WHEN MATCHED THEN
   UPDATE SET col1 = src.col1, col2 = src.col2
WHEN NOT MATCHED THEN
   INSERT(keycol, col1, col2)
   VALUES(src.keycol, src.col1, src.col2);

SELECT * FROM tablename;

db<>fiddle demo

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
1

Take a look at this.

Equivalent of MySQL ON DUPLICATE KEY UPDATE in Sql Server

I think it will solve your problem.

Community
  • 1
  • 1
  • To use this, I would have to specify all the columns I want to update though. right? Is there something like "ON DUPLICATE KEY UPDATE (whole record)? Also, since I have 2000 records per batch, I would rather not have a string that repeats the same 2000 value sets twice. i.e.: INSERT INTO tablename VALUES (A,B,C), (D,E,F), (...) ON DUPLICATE KEY UPDATE tablename VALUES (A,B,C), (D,E,F), (...) – Vincent L Feb 27 '17 at 21:50
  • @VincentL - The end of that is `... UPDATE colA = VALUES(colA), colB = VALUES(colB)...` You don't repeat the _values_, you do repeat each column _name_ twice. – Rick James Jul 15 '19 at 22:30
0

This is a perfect candidate for MERGE, as others have noted.

Here is an example. Sourcetable and Tablename have the same layout in this example, but as long as you know the match column name and the data column names, this will work.

MERGE TableName AS TARGET
USING SourceTable AS SOURCE 
ON (TARGET.Keycol = SOURCE.Keycol) 
WHEN MATCHED 
AND 
(
    TARGET.Col1 != SOURCE.Col1 OR TARGET.Col2 != SOURCE.Col2
) 
THEN 
UPDATE 
SET 
    TARGET.Col1 = SOURCE.Col1
    , TARGET.col2 = SOURCE.Col2 
WHEN NOT MATCHED BY TARGET 
THEN INSERT (keycol,col1,col2) VALUES (SOURCE.KeyCol, SOURCE.Col1, SOURCE.Col2)
;

This is all you need to do. It is easily adaptable to dynamic tables and even temp tables. Table VARIABLES are tricky but everything else is easily accomplished.

John Tamburo
  • 427
  • 4
  • 12