1

How do you select the top 5 entries from an ordered set of results without the usage of LIMIT/ROWNUM/TOP depending on the DBMS?

ene.andrei
  • 187
  • 1
  • 2
  • 9
  • Why not use `Limit`, `Top`, ...? – Richard Sep 19 '11 at 09:00
  • @Richard - because not all flavours of DBMS support that syntax: I think that is the point of the question. – APC Sep 19 '11 at 09:02
  • @eneAndrei - just to be clear, are you looking for a standard SQL solution for `top N` queries which can be run on any DBMS? – APC Sep 19 '11 at 09:04
  • That will be great. However, the reason I've asked this is because I've seen this sort of question come up many times during collage exams. – ene.andrei Sep 19 '11 at 09:24
  • As far as I know, there is no good solution to this problem. The nearest is to open a cursor "order by", and fetch the first N. – vmatyi Sep 19 '11 at 10:00

1 Answers1

9

Standard ANSI SQL solution:

SELECT *
FROM ( 
    SELECT col1, 
           col2,
           row_number() over (order by some_col) as rn
    FROM the_table
) t
WHERE rn <= 5

Works in Oracle, PostgreSQL, DB2, SQL Server, Sybase, Teradata and the upcoming Firebird 3.0 but not in MySQL as it still doesn't support windowing functions.

  • It IS the answer to the question, but would not suggest it. Can generate a full table scan, even for indexed `some_col`. – vmatyi Sep 19 '11 at 09:54
  • 2
    @vmatyi: any statement *can* generate a full table scan. That depends completely on the amount of data, the data distribution and the optimizer –  Sep 19 '11 at 10:34
  • 1
    Ok. In an oracle 9, your answer DO generate a full scan on an indexed column, on millions of rows which is quite ridiculous for a top N select. I dont know about mssql or db2 but in oracle, windowing functions are implemented in a really lazy way. (Sorry, I've left out "in oracle" from the orig commemt) – vmatyi Sep 21 '11 at 01:14
  • @vmatyi: As an ORDER BY required to identify the "first" rows, I don't see a way how this could be done without a full table scan to find all values to be sorted. An index on some_col will most probably be used to do the sorting but given the fact that all rows need to be read to do the sorting a FTS is probably more efficient. I would be interested in any solution that will not do a full table scan! –  Sep 21 '11 at 06:51
  • 1
    How do you mean a top N would require a Full Table Scan??? What are you using for the indices then? An index is nothing else, but an ordered tree of the column values + the ROWID of the containing data record. An index range scan on the tree from one side will give you the bottom N recods, a scan from the other will give the top records. In no more than N*logX steps. (where N is the number of required records, X is the table size, which together is far more less than X for large tables) – vmatyi Nov 08 '11 at 22:48