41

If I have a few UNION Statements as a contrived example:

SELECT * FROM xxx WHERE z = 1
UNION 
SELECT * FROM xxx WHERE z = 2
UNION
SELECT * FROM xxx WHERE z = 3

What is the default order by behaviour?

The test data I'm seeing essentially does not return the data in the order that is specified above. I.e. the data is ordered, but I wanted to know what are the rules of precedence on this.

Another thing is that in this case xxx is a View. The view joins 3 different tables together to return the results I want.

Matt Fenwick
  • 48,199
  • 22
  • 128
  • 192
Ray Booysen
  • 28,894
  • 13
  • 84
  • 111

6 Answers6

49

There is no default order.

Without an Order By clause the order returned is undefined. That means SQL Server can bring them back in any order it likes.

EDIT: Based on what I have seen, without an Order By, the order that the results come back in depends on the query plan. So if there is an index that it is using, the result may come back in that order but again there is no guarantee.

DJ.
  • 16,045
  • 3
  • 42
  • 46
  • Reason why I was asking is there was a definite order, I'm guessing based on a clustered index (A PK) at some stage of the selects – Ray Booysen Jan 07 '09 at 17:21
  • 2
    Whatever order may be returned for many sequential selects may be consistent. But you should not depend on the order having any coincidental relation to anything. Adding a table, or index, or more data, may affect the order. Don't be a lazy coder. – DrFloyd5 Jan 07 '09 at 18:02
  • 1
    AS DJ indicated, without an ORDER BY clause, it may even vary each time the query is run. – BradC Jan 07 '09 at 18:19
  • Yup, I realise this, but was looking for a definitive answer as to what the order by behaviour was. – Ray Booysen Jan 07 '09 at 18:44
  • does the same behaviour also apply to postgres or SQL in general ? – Harsh Gundecha Jan 21 '21 at 05:03
16

In regards to adding an ORDER BY clause:

This is probably elementary to most here but I thought I add this. Sometimes you don't want the results mixed, so you want the first query's results then the second and so on. To do that I just add a dummy first column and order by that. Because of possible issues with forgetting to alias a column in unions, I usually use ordinals in the order by clause, not column names.

For example:

SELECT 1, * FROM xxx WHERE z = 'abc'
UNION ALL
SELECT 2, * FROM xxx WHERE z = 'def'
UNION ALL
SELECT 3, * FROM xxx WHERE z = 'ghi'
ORDER BY 1

The dummy ordinal column is also useful for times when I'm going to run two queries and I know only one is going to return any results. Then I can just check the ordinal of the returned results. This saves me from having to do multiple database calls and most empty resultset checking.

Will Rickards
  • 2,776
  • 2
  • 19
  • 25
13

Just found the actual answer.

Because UNION removes duplicates it does a DISTINCT SORT. This is done before all the UNION statements are concatenated (check out the execution plan).

To stop a sort, do a UNION ALL and this will also not remove duplicates.

Ray Booysen
  • 28,894
  • 13
  • 84
  • 111
  • 5
    This doesn't really answer the question: is the order still undefined if you do a `union all`? (reference please) – Matt Fenwick Feb 06 '12 at 21:15
  • I'm not sure if this is the technically correct answer, but in SQLite, a `UNION ALL` preserves the order of how the user presents the queries rather than performing an ordering on the backend. This answer specifically solved my issue – StonedTensor Oct 19 '22 at 09:41
8

If you care what order the records are returned, you MUST use an order by.

If you leave it out, it may appear organized (based on the indexes chosen by the query plan), but the results you see today may NOT be the results you expect, and it could even change when the same query is run tomorrow.

Edit: Some good, specific examples: (all examples are MS SQL server)

  • Dave Pinal's blog describes how two very similar queries can show a different apparent order, because different indexes are used:

    SELECT ContactID FROM Person.Contact
    SELECT *         FROM Person.Contact
    
  • Conor Cunningham shows how the apparent order can change when the table gets larger (if the query optimizer decides to use a parallel execution plan).

  • Hugo Kornelis proves that the apparent order is not always based on primary key. Here is his follow-up post with explanation.

BradC
  • 39,306
  • 13
  • 73
  • 89
  • 1
    and if a query goes parallel you can get random chunks of data, each of which is sorted, but the chunks are out of order; and there are many corner cases (eg: page split occurs while query is running) that means that you can very occasionally get rows out of clustered index order, even if your query plan appears to give them to you in order (ie, clustered index scan with no other order affecting operations). The server is even free to give the the 10 pages of data currently in cache up front, then go back and scan the rest, skipping the pages it has already given you. – Andrew Hill Mar 31 '17 at 02:31
2

A UNION can be deceptive with respect to result set ordering because a database will sometimes use a sort method to provide the DISTINCT that is implicit in UNION , which makes it look like the rows are deliberately ordered -- this doesn't apply to UNION ALL for which there is no implicit distinct, of course.

However there are algorithms for the implicit distinct, such as Oracle's hash method in 10g+, for which no ordering will be applied.

As DJ says, always use an ORDER BY

David Aldridge
  • 51,479
  • 8
  • 68
  • 96
1

It's very common to come across poorly written code that assumes table data is returned in insert order, and 95% of the time the coder gets away with it and is never aware that this is a problem as on many common databases (MSSQL, Oracle, MySQL). It is of course a complete fallacy and should always be corrected when it's come across, and always, without exception, use an Order By clause yourself.

Cruachan
  • 15,733
  • 5
  • 59
  • 112
  • 4
    That's the point, without an order by clause the order of any query is 'undefined' as per the various SQL standards. There *is* no definitive behaviour because it is explicitly stated that **there is no deveinative behaviour**. – Cruachan Jan 09 '09 at 17:27