The MSDN documentation says that when we write

SELECT TOP(N) ..... ORDER BY [COLUMN]

we get the top N rows, sorted by that column (ascending or descending, depending on what we choose).

But if we don't specify any ORDER BY, MSDN says the rows are random, as Gail Erickson pointed out here. As noted there, it should say unspecified rather than random. But Thomas Lee points out there that the documentation reads:

When TOP is used in conjunction with the ORDER BY clause, the result set is limited to the first N number of ordered rows; otherwise, it returns the first N number of rows random

So I tested this on tables that don't have any indexes. First I ran this query to find them:

SELECT *
FROM
    sys.objects so
WHERE
    so.object_id NOT IN (SELECT si.object_id
                         FROM
                             sys.index_columns si)
    AND so.type_desc = N'USER_TABLE'

Then I ran the query below against one of those tables (in fact, I tried it against every table returned by the query above), and I always got the same rows.

SELECT TOP (2) *
FROM
    MstConfigSettings

This always returned the same 2 rows, and the same is true for all the other tables returned by the first query. The execution plan shows 3 steps:

[image: execution plan]

As you can see, there is no index lookup, just a plain table scan, and

[image: operator details from the execution plan]

The Top operator shows the actual number of rows as 2, and so does the Table Scan, which does not match the size of the table (there are many rows).

But when I run something like

SELECT TOP (2) *
FROM
    MstConfigSettings
ORDER BY
    DefaultItemId

The execution plan shows

[image: execution plan]

and

[image]

So when I don't apply ORDER BY, the steps are different (there is no Sort). But the question is: how does TOP work when there is no Sort, and why does it always give the same result?

gotqn
Razort4x

1 Answer

There is no guarantee which two rows you get. It will just be the first two retrieved from the table scan.

The TOP iterator in the execution plan will stop requesting rows once two have been returned.

For a scan of a heap, this will most likely be the first two rows in allocation order, but this is not guaranteed. For example, SQL Server might use the advanced scanning feature, which means that your scan will read pages recently read by another concurrent scan.
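
To see the difference on a small heap, here is a minimal sketch. The temp table `#TopDemo` and its columns are hypothetical names made up for illustration; only the `TOP` and `ORDER BY` behavior comes from the discussion above.

```sql
-- Hypothetical demo heap: no clustered index, no nonclustered indexes.
CREATE TABLE #TopDemo (Id INT NOT NULL, Payload CHAR(100) NOT NULL);

INSERT INTO #TopDemo (Id, Payload)
VALUES (3, 'c'), (1, 'a'), (4, 'd'), (2, 'b');

-- No ORDER BY: the Top operator stops asking for rows once the scan
-- has handed it two. Which two rows those are is not guaranteed,
-- even if the result looks stable from run to run.
SELECT TOP (2) Id, Payload
FROM #TopDemo;

-- ORDER BY on a unique column: the result is deterministic (Id 1 and Id 2),
-- at the cost of a Sort (or an ordered index read, if one existed).
SELECT TOP (2) Id, Payload
FROM #TopDemo
ORDER BY Id;

DROP TABLE #TopDemo;
```

The first query may well return the same two rows every time you run it; the point is only that nothing obliges SQL Server to keep doing so.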

Martin Smith
  • So if there is no guarantee, why do I always receive the same 2 rows? **Always?** – Razort4x Mar 06 '13 at 11:02
  • Because the table scan always operates the same way. No guarantees doesn't mean that you aren't likely to get the same results. It just means you can't rely on any particular observed ordering and complain if it changes. The only circumstance I am aware of in which you wouldn't get the first two rows in the first allocated page would be if advanced scanning kicked in. – Martin Smith Mar 06 '13 at 11:04
  • @Razort4x - I think your confusion is because of the use of the word random in your linked page. That is incorrect. SQL Server won't deliberately randomise the results; it just doesn't make any particular promise about the two you get. If you truly wanted two random rows you would need to `ORDER BY NEWID()`, for example. This is why the documentation was amended to say "undefined", not random. – Martin Smith Mar 06 '13 at 11:22