10

I have only used SQL rarely until recently when I began using it daily. I notice that if no "order by" clause is used:

  1. When selecting part of a table the rows returned appear to be in the same order as they appear if I select the whole table
  2. The order of rows returned by a selecting from a join seemes to be determined by the left most member of a join.

Is this behaviour a standard thing one can count on in the most common databases (MySql, Oracle, PostgreSQL, Sqlite, Sql Server)? (I don't really even know whether one can truly count on it in sqlite). How strictly is it honored if so (e.g. if one uses "group by" would the individual groups each have that ordering)?

  • Possible duplicate of [What is MySQL row order for "SELECT \* FROM table\_name;"?](https://stackoverflow.com/questions/1949641/what-is-mysql-row-order-for-select-from-table-name) – philipxy Jun 05 '19 at 03:26

5 Answers5

14

If no ORDER BY clause is included in the query, the returned order of rows is undefined.

Whilst some RDBMSes will return rows in specific orders in some situations even when an ORDER BY clause is omitted, such behaviour should never be relied upon.

Phil
  • 2,392
  • 18
  • 21
13

Section 20.2 <direct select statement: multiple rows>, subsection "General Rules" of the SQL-92 specification:

4) If an <order by clause> is not specified, then the ordering of
   the rows of Q is implementation-dependent.
eggyal
  • 122,705
  • 18
  • 212
  • 237
  • That only covers the issue for *cursors*, but the same holds true for query results not obtained through a cursor. – kgrittn May 02 '12 at 17:48
10

If you want order, include an ORDER BY. If you don't include an ORDER BY, you're telling SQL Server:

I don't care what order you return the rows, just return the rows

Since you don't care, SQL Server is going to decide how to return the rows what it deems will be the most efficient manner possible right now (or according to the last time the plan for this specific query was cached). Therefore you should not rely on the behavior you observe. It can change from one run of the query to the next, with data changes, statistics changes, index changes, service packs, cumulative updates, upgrades, etc. etc. etc.

Aaron Bertrand
  • 272,866
  • 37
  • 466
  • 490
4

For PostgreSQL, if you omit the ORDER BY clause you could run the exact same query 100 times while the database is not being modified, and get one run in the middle in a different order than the others. In fact, each run could be in a different order.

One reason this could happen is that if the plan chosen involves a sequential scan of a table's heap, and there is already a seqscan of that table's heap in process, your query will start it's scan at whatever point the other scan is already at, to reduce the need for disk access.

As other answers have pointed out, if you want the data in a certain order, specify that order. PostgreSQL will take the requested order into consideration in choosing a plan, and may use an index that provides data in that order, if that works out to be cheaper than getting the rows some other way and then sorting them.

GROUP BY provides no guarantee of order; PostgreSQL might sort the data to do the grouping, or it might use a hash table and return the rows in order of the number generated by the hashing algorithm (i.e., pretty random). And that might change from one run to the next.

kgrittn
  • 18,113
  • 3
  • 39
  • 47
-1

It never ceased to amaze me when I was a DBA that this feature of SQL was so often thought of as quirky. Consider a simple program that runs against a text file and produces some output. If the program never changes, and the data never changes, you'd expect the output to never change.

As for this:

If no ORDER BY clause is included in the query, the returned order of rows is undefined.

Not strictly true - on every RDBMS I've ever worked on (Oracle, Informix, SQL Server, DB2 to name a few) a DISTINCT clause also has the same effect as an ORDER BY as finding unique values involves a sort by definition.

EDIT (6/2/14):

Create a simple table

enter image description here

For DISTINCT and ORDER BY, both the plan and the cost is the same since it is ostensibly the same operation to be performed

enter image description here

enter image description here

And not surprisingly, the effect is thus the same

enter image description here

Robbie Dee
  • 1,939
  • 16
  • 43
  • In PostgreSQL it *sometimes* does; depending on a number of details about the query and your optimizer cost factors it may use a sort to find unique values and return rows in sort order. Other times it may find that a hash table will be faster, and the unique values will be in the order of their hash values, unless you have specified an `ORDER BY` clause. I doubt that PostgreSQL is the only product to use hash tables when they are faster, or to return the results in the fastest manner if no order is requested. – kgrittn Oct 17 '12 at 18:17
  • SQL Server can use hashing to determine distinct items too. Just proves the point that without an `ORDER BY` you *cannot* rely on any `ORDER`. – Martin Smith Jan 20 '14 at 10:21
  • I never said it could (or should) be relied upon. Simply that a dedupe will by definition produce a sorted list if *values* are used. Clearly a hash would produce different sort results. – Robbie Dee Jan 20 '14 at 15:17
  • Your answer clearly says "finding unique values involves a sort by definition.". This is false. e.g. testing on SQL Server `CREATE TABLE T(X INT);INSERT INTO T SELECT number%2 FROM master..spt_values;SELECT DISTINCT X FROM T` execution plan is [like this](http://i.stack.imgur.com/neAEY.png). No sort operation. Output may or may not be ordered, was `0,1,-1` for me. – Martin Smith Jan 22 '14 at 07:56
  • Are you serious? Look at the execution plan - it is using a *hash*. The basic premise that the order of rows without an ORDER BY is undefined is nonsense. It just depends on the implementation. Look at the SQL specification in the accepted answer. – Robbie Dee Jan 22 '14 at 13:28
  • And incidentally, Hash Match (Aggregate) *is* a sort. It just happens to be a sort on the hashed values. – Robbie Dee Jan 22 '14 at 13:51
  • Yes I know it uses a hash. That was deliberate because your answer seems to imply that `SELECT DISTINCT X` is the same as `ORDER BY X` and this shows it is not the case. My [answer here](http://stackoverflow.com/questions/21371176/unordered-results-in-sql/21371444#21371444) gives some concrete examples where without `ORDER BY` the order of results truly is undefined and non deterministic. – Martin Smith Feb 06 '14 at 12:28
  • A DISTINCT on a column value (not its hash) WILL involve a sort. In fact, with a hash it will STILL involve a sort. The order of the results is NEVER undefined. You just shouldn't need to care about how it is implementated. My original point was that it should be no surprise at all if the order of the data never changes for the same query. Your expertise seems to be primarily SQL Server based. All of the above applies to this as it also does for Informix, Oracle and DB2. – Robbie Dee Feb 06 '14 at 12:56
  • A hash distinct, in Oracle at least, is not a sort in the sense that an order by is a sort. There is no "make this element appear before this one". I take your point that there is still some degree of determinism in the output, but "... DISTINCT clause also has the same effect as an ORDER BY" has not been the case in Oracle for many, many versions. I wish you'd remove it. – David Aldridge Feb 06 '14 at 13:42
  • Yes, I've spent an inordinate amount of time trying to explain that I'm talking about the **column values** here, not the hash. Also understand that this is a database agnostic question. If you can prove for any versions of the database listed by the OP where **column values** (not the hash) do not perform as I've described, I'll happily edit or remove my answer. Otherwise, to the best of my knowledge it holds true and will remain. – Robbie Dee Feb 06 '14 at 13:51