5

I know that UNION ALL supposed to have better performance than UNION (see: performance of union versus union all).

Now, I have this huge stored procedure (with a lot of queries), where The final result is a two section SELECT with a UNION between them. Since both data sets are foreign for each other, I can use UNION ALL which suppose to be better (no distinct operation).

I've check it on several databases and it worked fine. The problem is that one of my customers give me his database for performance tuning, and when I investigated it I noticed that if I'm changing the UNION ALL to UNION the performance is a bit better(!). This is all the change I've done in the stored procedure.

Can someone please explain how this situation can occur???

Thanks,
Ziv

Update:
attached execution plan of both queries (the diff part): enter image description here

Community
  • 1
  • 1
  • Did you check the execution plan in both case? – András Ottó Sep 12 '12 at 07:56
  • @AndrásOttó How can you retrieve an execution plan for stored procedure? – Dor Cohen Sep 12 '12 at 07:59
  • @DorCohen: Run the PRC inside the Management Studio and there you can get a graphical execution plan (there are some icon for do this), and for further info you can check: http://msdn.microsoft.com/en-us/library/ms178071(v=sql.105).aspx – András Ottó Sep 12 '12 at 08:01
  • @AndrásOttó I attached the diff between the execution plans for both queries –  Sep 12 '12 at 09:22
  • why do you said that is bettter, if in the image you attached, the information of cpu cost and another things are exactly the same? – Gonzalo.- Sep 12 '12 at 13:01
  • @ElVieejo It was one second better (1:17 against 1:16) –  Sep 13 '12 at 06:58

2 Answers2

1

You referenced an other topic which pointed to this article.

If you check this one here is two different execution plan. The big difference is the Distinct Sort which made the worser performance.

In your example the two execution plan has the same step with a physical operation Merge Join (only the logical operations are different). Even the estimations are the same.

Now I'm really courios: how big is the difference between the two query?

If you did not do the following, pls repeat your test again:
1) use the following line before run the PRC:

DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS 

Which make the cache clear and you can make a "cold run" in both case. You can check an other article here too.

2) Repeate the run several times to see an avarage.

Is the difference still exists?

András Ottó
  • 7,605
  • 1
  • 28
  • 38
  • It was too much for a comment, so that's why it is an answer, but I have a feeling, that there is no difference seeing the execution plans. – András Ottó Sep 12 '12 at 10:19
  • I've cleared the cache with the dbcc procedures before my tests. Strange as it is when I execute it several times the diff is gone... I'll add a comment if I'll find anything else. –  Sep 12 '12 at 12:34
0

This can occur if you have some duplicate rows. A UNION statement effectively does a SELECT DISTINCT on the results set. If you know that all the records returned are unique from your union, use UNION ALL as it gives faster results. However, for the duplicates in your case, I would guess at there being a sufficent number of duplicates as to make the UNION quicker - you can test this counting the duplicates and removing them. Then running the UNION ALL may return to being the 'winner'...

I hope this helps.

MoonKnight
  • 23,214
  • 40
  • 145
  • 277