0

In SQL Server 2008 I have a behaviour I don't understand.

I'm doing a UNION between two select statements.

  • First select returns 20745 rows
  • Second select returns 0 rows

When I using union bewteen the two selects, I get 20740 rows, I would exspect 20745 as union only returns distinct values.

To get the excepted result I used union all but there is something I don't understand about it. Does anyone have an explanation?

Mureinik
  • 297,002
  • 52
  • 306
  • 350
havmaage
  • 573
  • 7
  • 25
  • 2
    This is how `UNION` works return distinct values only. Use `UNION ALL` instead and get duplicates. Always read manual first before start programming. – Lukasz Szozda Sep 04 '15 at 08:58
  • 1
    Please read this question. It will explain why this happens: http://stackoverflow.com/questions/49925/what-is-the-difference-between-union-and-union-all – Evaldas Buinauskas Sep 04 '15 at 08:59
  • 1
    How many rows do you get from your first `SELECT` if you use `SELECT DISTINCT` ? 20740 rows by any chance ?? – marc_s Sep 04 '15 at 09:10

3 Answers3

2

There must be duplicate rows in your first SELECT statement. Note that UNION eliminates duplicates from your result set.

If you want to return all rows, use UNION ALL instead.

Example:

--UNION ALL
WITH TableA(n) AS (
    SELECT * FROM (
        VALUES(1),(2),(3),(4),(1)
    )t(n)
),
TableB(n) AS (
    SELECT * FROM (
        VALUES(10),(20),(30),(40)
    )t(n)
)
SELECT n FROM TableA UNION ALL
SELECT n FROM TableB

The above will return:

n
-----------
1
2
3
4
1
10
20
30
40

While the UNION variant

SELECT n FROM TableA UNION
SELECT n FROM TableB

will return:

n
-----------
1
2
3
4
10
20
30
40
Felix Pamittan
  • 31,544
  • 7
  • 41
  • 67
2

union removes duplicate results, regardless of whether they come from two different selects or from the same one. If you want to preserve duplicated, use union all instead:

SELECT *
FROM   table1
UNION ALL
SELECT *
FROM   table2
Mureinik
  • 297,002
  • 52
  • 306
  • 350
1

First select statement has duplicates :) That's normal behavior.

Try putting a distinct in the first select statement - it should also return 20740 rows.

That should help you better understand what is happening.

Mihai Ovidiu Drăgoi
  • 1,307
  • 1
  • 10
  • 16
  • Thank you very much all of you, it turned out that there were duplicates in the select statement. I thought that union only returned distinct rows when comparing the two select, like inner join. Learned something new today thanks :-) – havmaage Sep 04 '15 at 09:53