Determine a target record's offset in a Postgres query

Question

Problem: I'm building a RESTful API on top of Postgres that includes the following parameters:

An identifier <id> for a particular record in a table
Filter and sort parameters (optional)
count or page size (optional)

For example: /presidents/16?filter=!isAlive&sort=lastName,givenNames&count=5

The API returns count (or possibly fewer) records that include the record specified by <id> and the offset and count of records returned.

In the example above, the result might look like:

{
  "count": 5,
  "offset": 20,
  "records": [
    { "id": 17, "givenNames": "Andrew",    "lastName": "Johnson", "isAlive": false },
    { "id": 36, "givenNames": "Lyndon B.", "lastName": "Johnson", "isAlive": false },
    { "id": 35, "givenNames": "John F.",   "lastName": "Kennedy", "isAlive": false },
    { "id": 16, "givenNames": "Abraham",   "lastName": "Lincoln", "isAlive": false },
    { "id": 4,  "givenNames": "James",     "lastName": "Madison", "isAlive": false }
  ]
}

Current Solution: My current approach is to make three calls in serial (composing them into a single, nested query, but efficiency question remains) and I'm looking for a better approach.

Get the record referred to by target <id>, used in query #2.
- e.g. select * from presidents where id = 16
Get the offset of the target <id>
- e.g. select count(*) from presidents where lastName < 'Lincoln' and givenNames < 'Abraham' and not isAlive order by lastName, givenNames, id
After computing the appropriate offset and limit, using the passed (or default) count and the count(*) from #2, retrieve the page of records.
- e.g. select * from presidents where not isAlive order by lastName, givenNames, id offset 20 limit 5

Updated SQL

I took what @ErwinBrandstetter provided in his answer below, which was close to what I was looking for in way of a monster statement, and changed it into this:

WITH prez AS (SELECT lastName, givenNames, id, isAlive FROM presidents WHERE not isAlive),
cte AS (SELECT * FROM prez WHERE id = 16),
start AS (SELECT (COUNT(*)/5)*5 as "offset" FROM prez WHERE (lastName, givenNames, id, isAlive) < (TABLE cte))
SELECT row_to_json(sub2) AS js
FROM  (
 SELECT (SELECT * FROM start) AS "offset"
       , count(*) AS "count"
       , array_agg(sub1) AS records
 FROM  (
    SELECT * from prez
    ORDER  BY lastName, givenNames, id
    OFFSET (SELECT * from start) LIMIT  5
    ) sub1
) sub2;

SQL Fiddle

Is there an easier way in Postgres to determine the the offset of a given record for a given query?

Related questions:

Please add some explanation for *how* you determine the count in #2: `lastName < 'Lincoln' and givenNames < 'Abraham' and id < 16` - What's that supposed to implement? Doesn't seem to make sense - or I don't get it. — Erwin Brandstetter, Jan 21 '15 at 17:38
@ErwinBrandstetter OP may yet weigh in, but my intuition is that since the `presidents` table is getting ordered by ascending `id`, etc. in the query, OP is computing "offset" by counting how many records were returned with smaller values for `id` than the record of interest (16 in his example). — rchang, Jan 21 '15 at 18:34
@rchang: I figured as much an built a solution assuming that. The query #2 is most probably *wrong* in the original: Spig wants the count of rows according to the sort order: `order by lastName, givenNames, id` so the *row type* has to be compared not AND-ed expressions on individual columns. — Erwin Brandstetter, Jan 21 '15 at 18:46
Query #2's intent is to find `16`'s offset by finding all rows that come before it given the filter and sort parameters. I erroneously added a condition against `id` which I've now removed. — Spig, Jan 22 '15 at 15:27
@ErwinBrandstetter The SQL I provided is more pseudo-SQL and an attempt to demonstrate what I'm trying to do (get the count of all rows "less" than the target). My main concern was if there was an easier way to do this in Postgres instead of my approach. — Spig, Jan 22 '15 at 16:46

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

The one-stop shop you are looking for:

WITH cte AS (SELECT lastName, givenNames, id AS x FROM presidents WHERE id = 16)
SELECT row_to_json(sub2) AS js
FROM  (
   SELECT (SELECT count(*) FROM presidents
           WHERE (lastName, givenNames, id) < (TABLE cte)) AS "offset"
         , count(*) AS "count"
         , array_agg(sub1) AS records
   FROM  (
      SELECT id, givenNames, lastName, isAlive
      FROM   presidents
      WHERE (lastName, givenNames, id) >= (TABLE cte)
      ORDER  BY lastName, givenNames, id
      LIMIT  5
      ) sub1
  ) sub2;

SQL Fiddle.

Your original query #2 was incorrect:

SELECT * FROM presidents
WHERE lastName < 'Lincoln'
AND   givenNames < 'Abraham'
AND   id < 16 ...

To preserve your sort order it must be:

SELECT * FROM presidents
WHERE (lastName, givenNames, id) < ('Lincoln', 'Abraham', 16) ...

Comparing row types, not AND-ing expressions on individual columns, which would yield a completely different result. That's how ORDER BY effectively operates.

You should have a multicolumn index on (lastName, givenNames, id) in addition to the PRIMARY KEY on id to make this fast.

Variant with `row_number()` in a CTE

Based on your updated requirements.

WITH prez AS (
   SELECT lastName, givenNames, id, isAlive
        , row_number() OVER (ORDER BY lastName, givenNames, id) AS rn
   FROM   presidents
   WHERE  NOT isAlive
   )
, x AS (SELECT ((rn-1)/5)*5 AS "offset" FROM prez WHERE id = 16)
SELECT row_to_json(sub2) AS js
FROM  (
   SELECT (SELECT "offset" FROM x)
        , count(*) AS "count"
        , array_agg(sub1) AS records
   FROM  (
      SELECT lastName, givenNames, id, isAlive
      FROM   prez
      WHERE  rn > (SELECT "offset" FROM x)
      ORDER  BY rn
      LIMIT  5
      ) sub1
   ) sub2;

SQL Fiddle.

Test performance with EXPLAIN ANALYZE.

Awesome one-stop-shop query! I've modified it and put it in my question. My question is more of "does Postgres support some what of doing this query short of the one-stop-shop method?" — Spig, Jan 22 '15 at 19:26
Thanks for the details one-stop-shop, monster query. I was hoping that I didn't have to do the nested queries and until someone else answers with one, I'll go ahead and mark this as the answer. — Spig, Jan 23 '15 at 19:56

Determine a target record's offset in a Postgres query

1 Answers1

Variant with row_number() in a CTE

Variant with `row_number()` in a CTE