2

I came across an old script that in essence does the following:

CREATE TABLE #T (ColA VARCHAR (20), ID INT)
INSERT INTO #T VALUES ('BBBBBBBB', 1), ('AAAAAAA', 4), ('RRRRRR', 3)

CREATE TABLE #S (ColA VARCHAR (100), ID INT)
INSERT INTO #S
SELECT * FROM #T 
ORDER BY ID -- odd to do an order by in an insert statement, but that's the code as it is...


SELECT * FROM #S

DROP TABLE #T, #S

First, I want to mention that I am aware of the fact that tables such as the ones I created here do not have an actual order, we just order the resultset if we want.

However, if you run the script above on a SQL version 2008, you will get the results ordered in the order that was specified in the insert statement. On a 2016 machine, this is not the case. There it returns the rows in the order they were created in the first place. Does anyone know what changes cause this different behaviour?

Thanks a lot!

SQL_M
  • 2,455
  • 2
  • 16
  • 30
  • 3
    You already know what caused the "change in behaviour". Order is never guaranteed for a select without an `ORDER BY`. Any "patterns" you're seeing in the results running on either server are just *coincidence*. Don't try programming by coincidence. – Damien_The_Unbeliever Oct 23 '18 at 05:41
  • Do you know what the odds are that you repeat this 20 times and get identical results each time? I know there are no guarantees, I get that. But this is absolutely no coincidence... – SQL_M Oct 23 '18 at 05:44
  • 1
    Yes, in the same way if, on your commute to work every morning, you spot a particular person in a red car and shortly afterwards you see another particular person in a blue car. And then one morning, there's no blue car! How could this be? Obviously, there was no guarantee that they'd always be there but you repeated you observations *multiple* times! – Damien_The_Unbeliever Oct 23 '18 at 05:46
  • 1
    Ok, crazy and unfitting as that example may be, it did make me smile. :) – SQL_M Oct 23 '18 at 05:47
  • Just as with the commuting example though, there can be a myriad of reasons why things change. Just as an example on a single server, with no patching, hardware changes, data changes, etc, there's a thing called a carousel scan. I'd imagine your queries are being satisfied by a table scan (check the execution plans to confirm). So you get the results in "table order" (whatever that may be). If the optimizer spots that another query is already running a table scan, it can "piggy back" on that scan in progress, sharing results, and then starts a second scan to get back to where it joined the 1st – Damien_The_Unbeliever Oct 23 '18 at 05:52
  • Just the presence of *that other query running* means that you don't get the results in the same order as you would have done without it. – Damien_The_Unbeliever Oct 23 '18 at 05:53
  • Thanks for the info on a carousel scan, I hadn't seen that before in any documentation. – SQL_M Oct 23 '18 at 06:05

3 Answers3

2

As to your example - nothing is changed. The relation in the relation theory is represented in the SQL with a table. And the relation is not ordered. So, you are not allowed to defined how rows are ordered when they are materialized - and you should not care about this.

If you want to SELECT the data in a ordered way each time, you must specified unique order by criteria.

Also, in your example - you can SELECT the data one billion times and the data can be returned as "you inserted" it each time, but on the very next time you can get different results. The engine returns the data in the "best" way according to it when there is no order specified, but this can change anytime.

gotqn
  • 42,737
  • 46
  • 157
  • 243
  • Thanks for the feedback. However, It can not be a coincidence that every time I run this on a 2016 box I get the results in the order they were created and on 2008 I always get the results in the order that was specified in the insert statement. So there is definately something different. – SQL_M Oct 23 '18 at 05:42
  • I remembered few years ago, when I start asking the same questions :-) Basically, there are a lot said already on this topic - you can check this answer, too - https://stackoverflow.com/a/34624938/1080354 – gotqn Oct 23 '18 at 05:52
1

As you know - unless order by is specified, the database engine returns the rows in an arbitrary order - How this order is generated has to do with the internal parts of the database engine - the algorithm may change between versions, even between service packs, without any need for documentation since it's known to be arbitrary.
Please note that arbitrary is not the same as random - meaning you should not expect to get different row order each time you run the query - in fact, you will probably get the same row order every time until something changes - that might be a restart to the server, a rebuild of an index, another row added to the table, an index created or removed - I can't say because it's not documented anywhere.

Moreover, unless you have an Identity column in your table, the optimizer will simply ignore the order by clause in the insert...select statement, exactly because what you already wrote in your question - Database tables have no intrinsic order.

Zohar Peled
  • 79,642
  • 10
  • 69
  • 121
  • I find this answer to be the most concrete of all, and have accepted it. Thanks for the input. – SQL_M Oct 23 '18 at 07:43
0
  1. Order the result set of a query by the specified column list and, optionally, limit the rows returned to a specified range. The order in which rows are returned in a result set are not guaranteed unless an ORDER BY clause is specified.

MSSQL Docs

Zaynul Abadin Tuhin
  • 31,407
  • 5
  • 33
  • 63