-1

As part of implementing a GraphQL API I'm attempting to provide before/after pagination. For example, to return LIMIT records that come after a record with a given ID. This has to work on an arbitrary SELECT statement with arbitrary JOIN, WHERE and ORDER BY clauses already in place.

The benefit of this kind of pagination over just using page numbers is that it can more closely return an expected page of results when the underlying data is changing.

If I can get this working for after I can also make it work for before by inverting the ORDER BY clause, so here I'll focus just on the after condition.

What's the easiest or most efficient way to modify a given SELECT statement to accomplish this?

My first thought was to add an AND condition to the WHERE clause restricting results to those with column values from the ORDER BY clause that are greater than or equal to their values in the record with the given ID. But this doesn't seem to work, because there is no expectation of uniqueness in the ORDER BY clause columns, so there's no way to know where the target record will fall in the results, and therefore no way to know how to set the LIMIT to return the correct number of records.

Another approach is to first discover the offset of the target record within the initial SELECT statement, and then to add a LIMIT offset+1, limit clause to the initial SELECT statement with the discovered offset.

MySQL has no row_count() function or similar, but row numbers can be added like this:

SELECT @rownum:=@rownum+1 ‘rank’, t.* 
FROM my_table t, (SELECT @rownum:=0) r ORDER BY field2;

Then the above can be used as a subquery to fetch the rank of the target record, e.g.

SELECT rank FROM (SELECT @rownum...) WHERE id = 42

And then using that rank as the offset for the final query:

SELECT ... LIMIT (rank + 1), 100

Possibly this can be done as a single query with multiple subqueries, e.g.

SELECT ... LIMIT (SELECT rank from (SELECT @rownum...) ...) + 1, 100

But this three query approach seems like an elaborate and not very rapid way to perform a very frequently used operation, putting a higher load on our database servers than we would prefer.

Is there a better way to do this?

Edit: A specific example was requested. Say I want to get a page of 2 articles from a table of 10 articles. We'll paginate this query:

select id, title from articles order by title desc

The table:

id, title
1, "a"
3, "b"
4, "c"
6, "d"
7, "e"
8, "f"
9, "g"
10, "h"
11, "i"
12, "k"

So when requesting the page after id 6 the correct records would be 4, "c" and 3, "b". This needs to work for arbitrary WHERE and ORDER BY clauses.

Mori
  • 27,279
  • 10
  • 68
  • 73
  • Get the ID of the last displayed item, and fetch the 25 rows after that? – Andomar May 05 '18 at 15:21
  • @Andomar, the question is, how do I "fetch the 25 rows after that" for an arbitrarily sorted SELECT statement? – Mori May 05 '18 at 15:23
  • The key to solving is what you mean by *"after"*. Do you mean rows with a later date, a higher primary key, etc.? By creating a sub-query that selects the row as an "anchor" (IOW, the row that the desired rows come after) and joining to that sub-query, you can use the order by to get rows after. – Sloan Thrasher May 05 '18 at 15:46
  • Well if I understood it right, I would have the query with the "rank" and in the where section do an exists over the same "rank query". I don't know if I was clear enough. Can you add a little sample data and the results for after and previous? – Jorge Campos May 05 '18 at 15:48
  • @SloanThrasher, as this is a pagination operation, before and after are relative to the results of the arbitrary select statement, whatever order that is, wherever the target falls within that query. – Mori May 05 '18 at 15:51
  • 1
    If it's a simple pagination issue, then just repeat the previous query using the **LIMIT startval,numrow** syntax – Sloan Thrasher May 05 '18 at 15:57
  • Can you explain why a basic limit clause doesn't meet your needs? `SELECT id, title FROM articles WHERE id < 6 ORDER BY title DESC LIMIT 2` – But those new buttons though.. May 05 '18 at 16:08
  • @billynoah, it does work if I add an offset argument to LIMIT, but retrieving the offset requires running often expensive queries twice. I'm looking for a cheaper way. – Mori May 05 '18 at 16:17
  • Sorry still don't understand. The query I posted above will exactly satisfy the requirements of your example. Can you please revise your question to clarify the problem? Why is calculating the offset any more complex than `(($page - 1) * $limit)` ? – But those new buttons though.. May 05 '18 at 16:27
  • @Mori - please don't take this wrong - your rep clearly indicates that you are a seasoned SO user, but what's needed here is [a Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve) – But those new buttons though.. May 05 '18 at 16:31
  • @billynoah, I agree that page based pagination is simple, as you show. But I am doing something different: you could call it reference-based pagination. I collect a `before` or `after` argument containing a reference to a specific row. I wish to retrieve the page from that point. It seems that this kind of pagination is not so simple in SQL. – Mori May 05 '18 at 17:26
  • @billynoah, I believe that my edit to the question provides an example that meets your criteria. Note that it doesn't have a page number as an input, but a record ID. – Mori May 05 '18 at 17:27
  • If the reference is to a specific row why won't my example above work? `SELECT id, title FROM articles WHERE id < 6 ORDER BY title DESC LIMIT 2`. You don't need any offset. You are specifying pages after and before the `id` with the `WHERE` clause. – But those new buttons though.. May 05 '18 at 22:11
  • This query gives you the exact output you requested - if that's not what you want, please provide a specific scenario where it won't work. Also you said *it doesn't have a page number as an input* - I don't understand. If you are dealing with pagination, you must certainly have page numbers involved. Can you tell us more about what the input is? – But those new buttons though.. May 05 '18 at 22:23
  • @billynoah, your query only works because the order of IDs is the same as the order of titles. That will usually not be the case. The input is the ID of a particular record, as stated in the question title. – Mori May 06 '18 at 00:31

2 Answers2

1

You can simply select the rows before or after you target row based on whatever you are ordering by. Assuming you have an index on id this should only require a single relatively inexpensive subquery:

SELECT id, title
FROM articles
WHERE title < (SELECT title FROM articles WHERE id = '6')
ORDER BY title DESC
LIMIT 0,2;

Next set would be

LIMIT 2,2

Previous set would require >= operator:

SELECT id, title
FROM articles
WHERE title >= (SELECT title FROM articles WHERE id = '6')
ORDER BY title DESC
LIMIT 0,2;

etc...

It would be easier to give a more succinct answer with more specific data and table structures but hope this helps.

  • This is the first approach I outlined in my question, which I had to reject because of the problem with non-distinct rows that you describe. So yes, I'm back to acquiring the offset by row counting. Thanks for looking into this. – Mori May 06 '18 at 03:02
  • One way around that is when getting preceeding rows, you can include the row with the id in question and, in theory, the dbms should return the same order for either one so in any event you should still get the result you want. If you can provide an actual dataset where you don't think this will work I'm happy to spend a bit more time on it. – But those new buttons though.. May 06 '18 at 03:03
  • @Mori do you understand my suggestion about inclusion on one side or the other? I've tested this just now and it works fine with duplicate `title` values. See updated answer... – But those new buttons though.. May 06 '18 at 03:33
  • I've changed the example dataset. I think that for this data your approach would return records with ids 1 and 3 rather than 3 and 4, as needed. – Mori May 07 '18 at 15:31
  • if you don't want to include id 6 remove the `=` symbol in the operator. you're going to need it on one side or the other though otherwise your id will always be excluded from every query. – But those new buttons though.. May 07 '18 at 19:33
0

Without any actual examples tables, the following query is a blueprint of sorts to use.

It selects the rows from a table where a column value is greater than the row in the sub-query. The sub-query selects the target row in the table that is the "starting point" for the rows desired.

In the example below, col1 is the ranking column. Change the WHERE clause in the sub-query to whatever would be needed to select the starting point row.

For pagination, alter the LIMIT clause to represent the previous pagination query.

SELECT
    col1,
    col2,
    etc
FROM table a
JOIN (
    SELECT col1
    FROM table c
    WHERE conditions
    LIMIT 20,1
    ) b
    ON a.col1 > b.col1
ORDER BY col1
LIMIT 20;
Sloan Thrasher
  • 4,953
  • 3
  • 22
  • 40
  • This is the usual and expected kind of pagination, but it's not what I'm after. I wish to pull a page based not on page number or the previous page (there may be none), but based on a particular row. – Mori May 05 '18 at 17:29
  • Then drop the LIMIT in the sub-query, and put whatever you need in *condition* to select the *"particular"* row. – Sloan Thrasher May 06 '18 at 14:06