Default row order in SELECT query - SQL Server 2008 vs SQL 2012

Question

Our team recently upgraded our databases from SQL Server 2008 to SQL Server 2012. One breaking change we noticed was in the default order of rows returned by the SELECT statement, i.e. when an explicit ORDER BY clause is not specified.

As per MSDN, SQL Server 2012 does not gaurantee the order of the rows returned unless an ORDER BY clause is specified.

We have 2500+ stored procedures across 5 databases that have SELECT statements without an ORDER BY clause and it will be a sizeable effort to add the ORDER BY clause manually to match the behavior in SQL Server 2008. Is there a setting or faster way of doing this?

The other option, which hasn't been explored, is to downgrade to SQL Server 2008. How difficult would this be?

There is no such thing as the "default row order". Rows in a relational table are ***NOT*** sorted. *Unless* you specify an `order by` the DBMS is free to return them in *any* order it seems fit. Your code was broken from the start. The only sensible solution to your problem is to bite the bullet and change the code to use an `order by` You are lucky this bug hasn't shown up earlier. — , Oct 07 '14 at 12:55
Well SQL 2008 is the same in that there is no guaranteed "default" row ordering — Alex K., Oct 07 '14 at 12:56
Try using clustured and non-clustured index, if you have already a primary key in your table you dont have to worry about clustured as by default primary key is clustured index. — Chethan, Oct 07 '14 at 12:57
`ORDER BY` should only be relevant in `SELECT` queries. (unless you do really stupid things like `LIMIT 1` in subqueries, or breaking out of cursor loops) — joop, Oct 07 '14 at 13:06
@joop just stating that something is stupid does not lend any credibility to your comment. Try to explain why it is not a good choice instead of just stating it is stupid. If it is fact then follow it up with some evidence. Otherwise it is just your opinion. — Sean Lange, Oct 07 '14 at 15:58
If more than one rows in the subquery satisfy the joincondition, then either it means that there is a "partial carthesian product" taking place, or that in fact you want to pick one from the set, such as the most recent date, or the lowest id. You will have to specify explicitely that in your condition in the either case. The other cases are an aggregating subquery (the table expression has an *implicit* candidate key), and an EXISTS (anti) join, (which yields an atomic boolean expression, not a table expression, so LIMIT will certainly be not needed there either) — joop, Oct 07 '14 at 16:47

Mike D. · Accepted Answer · 2022-04-20T11:04:56.183

You need to go back and add ORDER BY clauses to your code because without them the order is never guaranteed. You were "lucky" in the past that you always got the same order but it wasn't because SQL Server 2008 guaranteed it in anyway. It most likely had to do with your indexes or how the data was being stored on the disk.

If you moved to a new host when you upgraded the difference in hardware configuration alone could have changed the way your queries execute. Not to mention the fact that the new server would have recalculated statistics on the tables and the SQL Server 2012 query optimizer probably does things a bit differently than the one in SQL Server 2008.

It is a fallacy that you can rely on the order of a result set in SQL without explicitly stating the order you want it in. SQL results NEVER have an order you can rely on without using an ORDER BY clause. SQL is built around set theory. Query results are basically sets (or multi-sets).

Itzik Ben-Gan gives a good description of set theory in relation to SQL in his book Microsoft SQL Server 2012 T-SQL Fundamentals

Set theory, which originated with the mathematician Georg Cantor, is one of the mathematical branches on which the relational model is based. Cantor's definition of a set follows:

By a "set" we mean any collection M into a whole of definite, distinct objects m (which are called the "elements" of M) of our perception or of our thought. - Joseph W. Dauben and Georg Cantor (Princeton University Press, 1990)

After a thorough explanation of the terms in the definition Itzik then goes on to say:

What Cantor's definition of a set leaves out is probably as important as what it includes. Notice that the definition doesn't mention any order among the set elements. The order in which set elements are listed is not imporant. The formal notation for listing set elements uses curly brackets: {a, b, c}. Because order has no relevance you can express the same set as {b, a, c} or {b, c, a}. Jumping ahead to the set of attributes (called columns in SQL) that make up the header of a relation (called a table in SQL), an element is supposed to be identified by name - not ordinal position. Similarly, consider the set of tuples (called rows by SQL) that make up the body of the relation; an element is identified by its key values - not by position. Many programmers have a hard time adapting to the idea that, with respect to querying tables, there is no order among the rows. In other words, a query against a table can return rows in any order unless you explicitly request that the data be sorted in a specific way, perhaps for presentation purposes.

But regardless of the academic definition of a set even the implementation in SQL server has never guaranteed any order in the results. This MSDN blog post from 2005 by a member of the query optimizer team states that you should not rely on the order from intermediate operations at all.

The reordering rules can and will violate this assumption (and do so when it is inconvenient to you, the developer ;). Please understand that when we reorder operations to find a more efficient plan, we can cause the ordering behavior to change for intermediate nodes in the tree. If you’ve put an operation in the tree that assumes a particular intermediate ordering, it can break.

This blog post by Conor Cunningham (Architect, SQL Server Core Engine) "No Seatbelt - Expecting Order without ORDER BY" is about SQL Server 2008. He has a table with 20k rows in it with a single index that appears to always return rows in the same order. Adding an ORDER BY to the query doesn't even change the execution plan, so it isn't like adding one in makes the query more expensive if the optimizer realizes it doesn't need it. But once he adds another 20k rows to the table suddenly the query plan changes and now it uses parallelism and the results are no longer ordered!

The hard part here is that there is no reasonable way for any external user to know when a plan will change . The space of all plans is huge and hurts your head to ponder. SQL Server's optimizer will change plans, even for simple queries, if enough of the parameters change. You may get lucky and not have a plan change, or you can just not think about this problem and add an ORDER BY.

If you need more convincing just read these posts:

That order is not guaranteed. Many factors can affect the order of the query results and you cannot anticipate them all. You might get lucky for awhile and maybe forever. But adding the `ORDER BY` clause won't impact performance if the query optimizer realizes that it doesn't need it. Is it worth running the risk that things will break unexpectedly just because you don't feel like putting an `ORDER BY` on your query? — Mike D., Feb 23 '17 at 11:30
I don't know what significance you see in the fact that your simple query with a table index returns the rows in the order you expect. No where have I said that it will never come out in the order you expect. Reread the last quote in my answer from one of the SQL Server Architects. Without the `ORDER BY` clause the order of your result set could change at any time for so many reasons that you can't know when it will happen. What you're saying is bad advice and should't be used it in a production system. Do not rely on indexes for ordering. If you need an ordered result set use an `ORDER BY` — Mike D., Feb 28 '17 at 12:43
@senthilprabhu It's a fact that your can't guarantee sort order without specifying an ORDER BY clause. Even if it happens 99.99% of the time, that is not 100%, it is just luck, and a simplistic test case. Please change your comment. It's very important that other people reading this answer understand this, if you rely on sort order, then you should explicitly order your results. It is not "working in SQL 2016". It doesn't work in any edition, it's entirely up to the query engine to decide order that results are returned, unless you ORDER BY. Your comments here will just confuse people. — Davos, Mar 02 '17 at 23:36
@Davos: Removed my comments. Order by clause is must for 100% accuracy. — SenthilPrabhu, Mar 07 '17 at 04:11
@Mike D.: No. There are situations where the order is guaranteed. E.g. **SELECT** value **FROM** STRING_SPLIT('Lorem ipsum dolor sit amet.', ' ') — expecting _default_ order. — tibx, Oct 28 '20 at 19:48
@tibx No. There is no default row order unless you use an order by and ordering by a table expression that simple returns a NULL field does not seem like good practice. I do not believe it will give you any sort of deterministic row order that will persist across servers and versions and hardware changes. Maybe the order when selecting from a system function will be the same but I wouldn't rely on it. I certainly wouldn't bet my job on it. It's also a fairly contrived case. How hard is it to put an ORDER BY in the query? Then nobody has to guess whether the order will be consistent or not. — Mike D., Oct 29 '20 at 19:35
@Mike D.:You are right that using ORDER BY clause is not difficult and should definitely be used in common query to achieve deterministic output. I just wanted to point out that there are some specific situations when the row order is guaranteed, such as [STRING_SPLIT](https://learn.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql?view=sql-server-ver15) function. Unfortunately, it returns table with values only, without their order in the original input string. So ORDER BY cannot be used. However, this is needed to use ROW_NUMBER() OVER clause to get nth element. — tibx, Oct 30 '20 at 17:48

Default row order in SELECT query - SQL Server 2008 vs SQL 2012

1 Answers1

Linked

Related