1

Table A (table to merge into) has 90,000 rows Table B (source table) has 3,677 rows I would expect this to merge really quick but it's taking 30 minutes (and counting). How can it be optimized to run faster?

ALTER PROCEDURE [dbo].[MergeAddressFromGraph] 
-- no params

AS BEGIN -- SET NOCOUNT ON added to prevent extra result sets from -- interfering with SELECT statements. SET NOCOUNT ON;

-- first add fids to the MergeFundraiserNameAddress table instead of the temp table?

SELECT fundraiserid, ein
INTO #fids 
FROM  bb02_fundraiser

BEGIN TRAN;
MERGE BB02_FundraiserNameAddress AS T
USING
(    
    select f.fundraiserid,
           n.addresslines,
           n.town,
           n.county,
           n.postcode,
           n.country,
           n.fulladdress,
           n.ein
    from MergeFundraiserNameAddress n
         join bb02_fundraiser f
         on f.ein = n.ein and f.isdefault = 1
    group by n.ein,
             f.fundraiserid,
             n.addresslines,
             n.town,
             n.county,
             n.postcode,
             n.country,
             n.fulladdress

) AS S
ON (T.fundraiserid in( (select fundraiserid from #fids where ein = S.ein)) )

WHEN MATCHED
    THEN UPDATE
        SET    
              -- ADDRESS
              T.addresslines = S.addresslines
              ,T.town = S.town
              ,T.county = S.county
              ,T.postcode = S.postcode
              ,T.country = S.country
              ,T.fulladdress = S.fulladdress

;

DELETE FROM MergeFundraiserNameAddress

COMMIT TRAN;

drop table #fids

END

UPDATE I was able to improve the stored procedure which now runs in just a few seconds. I joined on the temp table instead of the bb02_fundraiser table and removed the subquery in the ON clause.

I realize now that the Merge is not necessary and I could have used an Update instead, but I'm ok with this right now because an INSERT may be needed soon in a refactor.

UPDATED STORED PROCEDURE BELOW IF OBJECT_ID('tempdb..#fids') IS NOT NULL DROP TABLE #fids

SELECT fundraiserid, ein
INTO #fids 
FROM  bb02_fundraiser
where isdefault = 1

BEGIN TRAN;
MERGE BB02_FundraiserNameAddress AS T
USING
(    
    select f.fundraiserid,
           n.addresslines,
           n.town,
           n.county,
           n.postcode,
           n.country,
           n.fulladdress,
           n.ein
    from MergeFundraiserNameAddress n
         join #fids f
         on f.ein = n.ein
    group by n.ein,
             f.fundraiserid,
             n.addresslines,
             n.town,
             n.county,
             n.postcode,
             n.country,
             n.fulladdress

) AS S
ON (T.fundraiserid = S.fundraiserid)

WHEN MATCHED
    THEN UPDATE
        SET    
              -- ADDRESS
              T.addresslines = S.addresslines
              ,T.town = S.town
              ,T.county = S.county
              ,T.postcode = S.postcode
              ,T.country = S.country
              ,T.fulladdress = S.fulladdress

;

DELETE FROM MergeFundraiserNameAddress

COMMIT TRAN;

IF OBJECT_ID('tempdb..#fids') IS NOT NULL
DROP TABLE #fids
Induster
  • 733
  • 1
  • 6
  • 15
  • Please post the execution plan. You're right - we can get this to be very quick. – usr Dec 12 '13 at 21:51
  • 2
    Why are you using MERGE if the only possible outcome is an UPDATE? ([Here's why you might want to use caution with MERGE](http://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/).) – Aaron Bertrand Dec 12 '13 at 21:55
  • Also, perhaps it is being blocked. When this is running for 30 minutes, what does sys.dm_exec_requests say? This DMV can show you blocking_session_id as well as wait type for the session_id trying to run the MERGE. – Aaron Bertrand Dec 12 '13 at 21:56

2 Answers2

0

See below if this statement alone does the job for you.

UPDATE T
SET    T.addresslines = n.addresslines
      ,T.town = n.town
      ,T.county = n.county
      ,T.postcode = n.postcode
      ,T.country = n.country
      ,T.fulladdress = n.fulladdress
    from MergeFundraiserNameAddress n join bb02_fundraiser f
    on f.ein = n.ein and f.isdefault = 1
    INNER JOIN  BB02_FundraiserNameAddress T
    ON T.fundraiserid = f.fundraiserid AND T.ein = f.ein
group by n.ein,
             f.fundraiserid,
             n.addresslines,
             n.town,
             n.county,
             n.postcode,
             n.country,
             n.fulladdress

As other users has mentioned in your comments, why use MERGE statement when you're only updating records. MERGE statement is used when you are doing multiple operation such as UPDATE , DELETE and INSERT.

Since you are only UPDATING records there is no need for merge statement.

Reason For Slow Performance

Since you are getting all the records in a Temp table and then joining it with other tables and not creating any indexes on that Temp table, The absence of any indexes will hurt the query performance.

When you do a SELECT * INTO #TempTable FROM Some_Table it will bring all the data from Some_Table into a Temp table but not the indexes. you can see your self by running this simple query

select * from tempdb.sys.indexes
where object_id = (select object_id 
                   from tempdb.sys.objects 
                   where name LIKE '#TempTable%')
M.Ali
  • 67,945
  • 13
  • 101
  • 127
  • Nice suggestion but an index on a 3,000 row table will make no difference – Nick.Mc Dec 12 '13 at 23:13
  • @ElectricLlama True but if the OP is complaining about the performance, the very 1st suspect will be that Temp table in his code. – M.Ali Dec 12 '13 at 23:17
  • I don't agree, we have a 90,000 row table being updated from a 3,677 row table, my suspicion would be on the 90,000 row table. The suspicion could be turned into certainty by observing the query plan and I/O etc. – Nick.Mc Dec 12 '13 at 23:45
  • @ElectricLlama I agree with you, but why you are adamant that user will have round about 3000 rows in his temp table. Obviously in case of a slow query anyone would see the execution plan 1st before creating indexes or anything, but in this particular case when no execution plan is provided, im only guessing, And my guess is as good as yours since neither you nor me has any number of rows or execution plan for this query :) – M.Ali Dec 12 '13 at 23:50
  • He mentioned the rowcount in his comments. Certainly the rowcount could change in which case an index is definitely worthwhile. – Nick.Mc Dec 12 '13 at 23:53
  • @ElectricLlama lol opsss just saw the row count now thanks for pointing it out :) – M.Ali Dec 12 '13 at 23:54
  • @ElectricLlama yea probably with a down vote lol, but I have simplified the update query a lot, so hopefully OP will see some improvement in performance. – M.Ali Dec 12 '13 at 23:59
  • Yes by all means you have definitely assisted by rewriting it for him. – Nick.Mc Dec 13 '13 at 00:03
  • Apologies for the late response. Been a busy day to say the least. Although I haven't tested your rewrite, just by looking at it I know it works. The only change I need to make to it is to JOIN on the temp table (#fids) instead of bb02_fundraiser. However, I'm sticking with the merge for now because I was able to make it work (see my update above). – Induster Dec 13 '13 at 07:19
0

also why delete when you can truncate.

truncate table MergeFundraiserNameAddress

KumarHarsh
  • 5,046
  • 1
  • 18
  • 22
  • I'm not a SQL expert so there are still many things I'm not aware of. Thanks for the suggestion. I looked up truncate vs delete and see what you mean. http://stackoverflow.com/questions/3256242/pros-cons-of-truncate-vs-delete-from – Induster Dec 13 '13 at 17:28