22

I have a table with some persistent data in it. Now when I query it, I also have a pretty complex CTE which computes the values required for the result and I need to insert missing rows into the persistent table. In the end I want to select the result consisting of all the rows identified by the CTE but with the data from the table if they were already in the table, and I need the information whether a row has been just inserted or not.

Simplified this works like this (the following code runs as a normal query if you like to try it):

-- Set-up of test data, this would be the persisted table 
DECLARE @target TABLE (id int NOT NULL PRIMARY KEY) ;
INSERT INTO @target (id) SELECT v.id FROM (VALUES (1), (2)) v(id);

-- START OF THE CODE IN QUESTION
-- The result table variable (will be several columns in the end)
DECLARE @result TABLE (id int NOT NULL, new bit NOT NULL) ;

WITH Source AS (
    -- Imagine a fairly expensive, recursive CTE here
    SELECT * FROM (VALUES (1), (3)) AS Source (id)
)
MERGE INTO @target AS Target
    USING Source
    ON Target.id = Source.id
    -- Perform a no-op on the match to get the output record
    WHEN MATCHED THEN 
        UPDATE SET Target.id=Target.id
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (id) VALUES (SOURCE.id)
    -- select the data to be returned - will be more columns
    OUTPUT source.id, CASE WHEN $action='INSERT' THEN CONVERT(bit, 1) ELSE CONVERT(bit, 0) END
      INTO @result ;

-- Select the result
SELECT * FROM @result;

I don't like the WHEN MATCHED THEN UPDATE part, I'd rather leave the redundant update away but then I don't get the result row in the OUTPUT clause.

Is this the most efficient way to do this kind of completing and returning data?

Or would there be a more efficient solution without MERGE, for instance by pre-computing the result with a SELECT and then perform an INSERT of the rows which are new=0? I have difficulties interpreting the query plan since it basically boils down to a "Clustered Index Merge" which is pretty vague to me performance-wise compared to the separate SELECT followed by INSERT variant. And I wonder if SQL Server (2008 R2 with CU1) is actually smart enough to see that the UPDATE is a no-op (e.g. no write required).

Lucero
  • 59,176
  • 9
  • 122
  • 152
  • 1
    I have the exact same setup with the with, merge and a redundant update and am also looking for a solution that does not actually perform an update but still returns the ID. If you find a solution, please add it =) – David Mårtensson May 03 '11 at 14:54
  • RE: The Non Updating Update you would probably be better off doing a No-OP update on a column which is not part of the clustering key (if possible) as described here http://sqlblog.com/blogs/paul_white/archive/2010/08/11/the_2D00_impact_2D00_of_2D00_update_2D00_statements_2D00_that_2D00_don_2D00_t_2D00_change_2D00_data.aspx – Martin Smith May 03 '11 at 15:00
  • @David Mårtensson, thanks for adding the bounty. :) – Lucero May 03 '11 at 16:21
  • @Lucero - Let me see if I follow you. You want to insert a bunch of rows into @target and output the rows that were inserted into @result. Is that right? – Thomas May 04 '11 at 03:12
  • 1
    If I understod it he wants both the inserted ID's and the id's of rows that existed, thats where the merge comes in. Without merge you wil lnot get the id's of the other rows that already existed. The point of the question is can it be done without the actual update as that costs performance. – David Mårtensson May 04 '11 at 08:31
  • @Thomas, David got it right, I need both the new AND the existing IDs in my output. – Lucero May 04 '11 at 08:45

1 Answers1

35

You could declare a dummy variable and set its value in the WHEN MATCHED clause.

 DECLARE @dummy int;
 ...
 MERGE
 ...
 WHEN MATCHED THEN
   UPDATE SET @dummy = 0
 ...

I believe it should be less expensive than the actual table update.

Andriy M
  • 76,112
  • 17
  • 94
  • 154
  • That's a brilliant idea, and it seems to work well. I'll accept this as answer unless someone comes up with something even better... seems to take about 1/3rd of the time off the `MERGE`. – Lucero May 04 '11 at 14:15
  • This solution solved my problem at least, well worth the 100 rep bounty =) Thanks Andriy and Lucero for the answer and the good question with example. – David Mårtensson May 05 '11 at 06:43
  • @David, thanks! This is indeed a good question. Things like that are often worth knowing in advance rather than waiting until there's a real problem that needs the necessary knowledge. – Andriy M May 05 '11 at 12:11