4

My situation is that a SQL statement which is not predictable, is given to the program and I need to do pagination on top of it. The final SQL statement would be similar to the following one:

SELECT * FROM (*Given SQL Statement*) b 
OFFSET 0 ROWS FETCH NEXT 50 ROWS ONLY;

The problem here is that the *Given SQL Statement* is unpredictable. It may or may not contain order by clause. I am not able to change the query result of this SQL Statement and I need to do pagination on it.

I searched for solution on the Internet, but all of them suggested to use an arbitrary column, like primary key, in order by clause. But it will change the original order.

shA.t
  • 16,580
  • 5
  • 54
  • 111
Code Chewer
  • 41
  • 1
  • 4
  • See http://stackoverflow.com/questions/10064532/the-order-of-a-sql-select-statement-without-order-by-clause So - you should add a ORER BY. I guess youi could just check GIVEN SQL Statment and add ORDER BY 1 to order by the first column if nmot specified – AntDC Mar 14 '17 at 10:31
  • You say "I am not able to change the query result of this SQL Statement", does that also mean you cannot change the statement before the result is obtained? You need to clarify in your question _exactly_ what can, and can't, be changed in the system to get the result you want. At what point in the data flow are you involved? Do you have access to the database or are you just operating on data flowing from one place to another? Basically, what's the system architecture? – Tony Mar 14 '17 at 10:52
  • Thank you for your reply. I can change the statement before execution without affecting the original result. The problem is the statement is unknown. Even I can change it, I am not sure the impact of changing it. There is no complicated system architecture involved in this question. I am only responsible for doing pagination on any SQL statement received by the application and output the result. – Code Chewer Mar 14 '17 at 11:05
  • If you cannot rely on deterministic ordering from the inner query, then it is not feasible to carry out pagination in the database. A more reliable alternative would be to return the whole result set to the application and allow it to paginate based on the (arbitrary) order of the results (assuming there's a front end application of some kind generating the inner query). – Ed Harper Mar 14 '17 at 11:15
  • You are right. It is no way to guarantee the ordering is deterministic, so the order by clause in inner SQL statement may be useless in this case. But the size of database is large, so it is not possible to return all the query result and do pagination later. – Code Chewer Mar 14 '17 at 11:46

5 Answers5

3

The short answer is that it can't be done, or at least can't be done properly.

The problem is that SQL Server (or any RDBMS) does not and can not guarantee the order of the records returned from a query without an order by clause.
This means that you can't use paging on such queries.
Further more, if you use an order by clause on a column that appears multiple times in your resultset, the order of the result set is still not guaranteed inside groups of values in said column - quick example:

;WITH cte (a, b)
AS
(
    SELECT 1, 'a'
    UNION ALL
    SELECT 1, 'b'
    UNION ALL
    SELECT 2, 'a'
    UNION ALL
    SELECT 2, 'b'
)

SELECT *
FROM cte 
ORDER BY a

Both result sets are valid, and you can't know in advance what will you get:

a   b
-----
1   b
1   a
2   b
2   a

a   b
-----
1   a
1   b
2   a
2   b

(and of course, you might get other sorts)

Zohar Peled
  • 79,642
  • 10
  • 69
  • 121
1

The problem here is that the *Given SQL Statement" is unpredictable. It may or may not contain order by clause.

your inner query(unpredictable sql statement) should not contain order by,even if it contains,order is not guaranteed.

To get guaranteed order,you have to order by some column.for the results to be deterministic,the ordered column/columns should be unique

TheGameiswar
  • 27,855
  • 8
  • 56
  • 94
  • 1
    Good point. Actually, it can't contain an `order by` clause unless it also contains `top`, `offset` or `for xml`, since it's being used as a derived table. It will return an error - "The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified." – Zohar Peled Mar 14 '17 at 10:48
  • I see. I Think I got what you mean. So, instead of adding one more layer on top of the "Given SQL Statement", order by clause needs to be searched if the statement contains it. If it is not, simply add one to it. If it is, add Offset to it. Is it correct? – Code Chewer Mar 14 '17 at 11:20
  • Is there anyway to locate Order By Clause in the string of SQL statement efficiently and insert a unique column, such as primary key, to it if Order By Clause already exists? I am using C# to do the query. – Code Chewer Mar 15 '17 at 02:33
0

Please note: what I'm about to suggest is probably horribly inefficient and should really only be used to help you go back to the project leader and tell them that pagination of an unordered query should not be done. Having said that...


From your comments you say you are able to change the SQL statement before it is executed.

You could write the results of the original query to a temporary table, adding row count field to be used for subsequent pagination ordering.

Therefore any original ordering is preserved and you can now paginate.

But of course the reason for needing pagination in the first place is to avoid sending large amounts of data to the client application. Although this does prevent that, you will still be copying data to a temp table which, depending on the row size and count, could be very slow.

You also have the problem that the page size is coming from the client as part of the SQL statement. Parsing the statement to pick that out could be tricky.

Tony
  • 9,672
  • 3
  • 47
  • 75
0

As other notified using anyway without using a sorted query will not be safe, But as you know about it and search about it, I can suggest using a query like this (But not recommended as a good way)

;with cte as (
    select *,
        row_number() over (order by (select 0)) rn
    from (
        -- Your query
        ) t
)
select *
from cte
where rn between (@pageNumber-1)*@pageSize+1 and @pageNumber*@pageSize

[SQL Fiddle Demo]

shA.t
  • 16,580
  • 5
  • 54
  • 111
0

I finally found a simple way to do it without any order by on a specific column:

declare @start AS INTEGER = 1, @count AS INTEGER = 5; 

select * from (SELECT *,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS fakeCounter 
FROM (select * from mytable) AS t) AS t2 order by fakeCounter OFFSET @start ROWS 
FETCH NEXT @count ROWS ONLY

where select * from mytable can be any query

Chris Sim
  • 4,054
  • 4
  • 29
  • 36