Performance of UNION versus UNION ALL in SQL Server

Question

I have to run a SELECT statement across several tables. I am sure the tables return different records. I am anyway using UNION ALL.

Is it better to use UNION or of UNION ALL in performance terms when I am sure the tables return different records?

possible duplicate of [To union or union all, that is the question](http://stackoverflow.com/questions/3909563/to-union-or-union-all-that-is-the-question) — Lukas Eder, May 08 '12 at 12:52

score 44 · Accepted Answer · answered Sep 02 '10 at 14:15

UNION ALL will perform better than UNION when you're not concerned about eliminating duplicate records because you're avoiding an expensive distinct sort operation. See: SQL SERVER – Difference Between Union vs. Union All – Optimal Performance Comparison

score 5 · Answer 2 · answered Sep 02 '10 at 14:14

5

UNION ALL always is faster, because UNION exclude duplicated entries

answered Sep 02 '10 at 14:14

Michael Pakhantsov

24,855
6
60
59

3

It's not "always" faster. I'm doing some performance tweaking right now and finding UNION almost twice as fast as UNION ALL even though the resulting query returns the exact same set of results. This is MS SQL Server. I can't explain it and was hoping someone else could. – Emperor Eto Sep 06 '20 at 18:31
@PeterMoore Did you also have no duplicates like me? I also cannot see any speed difference between the two when checking it without any tools just on a 10s query, but there is also no reason not to believe that `union all` must be faster for large tables aso. Perhaps something that gets important only if there *are* cases that make `union` and `union all` return different results = if you have duplicates. Makes no sense, I know. – questionto42 Apr 22 '22 at 19:09

score 4 · Answer 3 · answered Oct 14 '10 at 09:47

UNION implement internally two queries. 1.SELECT which will return a dataset 2.DISTINCT. Anyone who has studied database internals can easily understand that a DISTINCT clause is extremely costly in terms of processing.

If you are pretty sure that the resultant dataset need not have unique rows then we can skip UNION and use UNION ALL instead.

UNION ALL will be same as UNION except that it doesn't fire a DISTINCT internally sparing us costly operations

score 3 · Answer 4 · answered Sep 02 '10 at 14:28

3

It is better to use UNION ALL when you know you want all the result rows, whether or not you know they'll be distinct or not. UNION without "all" will always perform the "distinct check", regardless of what the data actually is.

answered Sep 02 '10 at 14:28

Philip Kelley

39,426
11
57
92

score 3 · Answer 5 · answered Sep 04 '10 at 20:25

3

Why is UNION ALL faster? Because UNION must do a sort to remove the duplicates. If you do not need to remove duplicates then UNION ALL is the better option, however UNION does have a purpose and should be used when appropriate.

answered Sep 04 '10 at 20:25

Mike Ritacco

12,784
2
17
5

score 0 · Answer 6 · answered Dec 23 '22 at 18:58

I'm going to go out on a limb and suggest that it depends on your data.

If performance is measured end-to-end (from the moment the client sends the first byte of the request to the moment it gets the last byte of the response) then you have the following two extremes:

The vast minority (say 1%) of the result set contains duplicates
The vast majority (say 99%) of the result set contains duplicates

In case 1, UNION ALL will be faster simply because it does not need to sort the data (to remove duplicates) before returning it.

In case 2, UNION will be faster because it's much quicker to remove duplicates in memory than sending them over the wire. If your result set contains 1 million rows with only 2 unique values then your network time will be much smaller once those duplicates have been removed.

Performance of UNION versus UNION ALL in SQL Server

6 Answers6

Linked