
For pagination purposes, I need to run a query with the LIMIT and OFFSET clauses. But I also need a count of the rows that would be returned by that query without the LIMIT and OFFSET clauses.

I want to run:

SELECT * FROM table WHERE /* whatever */ ORDER BY col1 LIMIT ? OFFSET ?

And:

SELECT COUNT(*) FROM table WHERE /* whatever */

At the same time. Is there a way to do that, particularly a way that lets Postgres optimize it, so that it's faster than running both individually?

Tim
    Does this answer your question? [Best way to get result count before LIMIT was applied](https://stackoverflow.com/questions/156114/best-way-to-get-result-count-before-limit-was-applied) – Marty Neal Feb 12 '20 at 23:44

4 Answers


Yes. With a simple window function:

SELECT *, count(*) OVER() AS full_count
FROM   tbl
WHERE  /* whatever */
ORDER  BY col1
OFFSET ?
LIMIT  ?

Be aware that the cost will be substantially higher than without the total number, but typically still cheaper than two separate queries. Postgres has to actually count all rows either way, which imposes a cost depending on the total number of qualifying rows.

However, as Dani pointed out, when OFFSET is at least as great as the number of rows returned from the base query, no rows are returned. So we also don't get full_count.
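
This behavior can be sketched with a small, self-contained script. It uses Python's bundled SQLite (version 3.25+ also supports `count(*) OVER()`) so it is runnable anywhere; the table, column, and filter are invented for illustration, but the principle is the same in Postgres:

```python
import sqlite3

# Illustrative demo of the window-function approach, using SQLite
# (which also evaluates window functions before LIMIT/OFFSET).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER, payload TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(10)])

# Page of 2 rows starting at offset 1; 7 rows (col1 = 3..9) qualify.
page = conn.execute("""
    SELECT *, count(*) OVER () AS full_count
    FROM   tbl
    WHERE  col1 >= 3
    ORDER  BY col1
    LIMIT  ? OFFSET ?
""", (2, 1)).fetchall()
# Each returned row carries the total count of qualifying rows (7),
# not the page size (2).
print(page)   # [(4, 'row4', 7), (5, 'row5', 7)]

# OFFSET beyond the qualifying rows: no rows at all, so no full_count
# either -- the caveat discussed above.
empty = conn.execute("""
    SELECT *, count(*) OVER () AS full_count
    FROM tbl WHERE col1 >= 3 ORDER BY col1 LIMIT 2 OFFSET 100
""").fetchall()
print(empty)  # []
```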

If that's not acceptable, a possible workaround to always return the full count would be with a CTE and an OUTER JOIN:

WITH cte AS (
   SELECT *
   FROM   tbl
   WHERE  /* whatever */
   )
SELECT *
FROM  (
   TABLE  cte
   ORDER  BY col1
   LIMIT  ?
   OFFSET ?
   ) sub
RIGHT  JOIN (SELECT count(*) FROM cte) c(full_count) ON true;

You get one row of NULL values with the full_count appended if OFFSET is too big. Else, it's appended to every row like in the first query.

If a row with all NULL values is a possible valid result, you have to check offset >= full_count to disambiguate the origin of the empty row.

This still executes the base query only once. But it adds more overhead to the query and only pays if that's less than repeating the base query for the count.

If indexes supporting the final sort order are available, it might pay to include the ORDER BY in the CTE (redundantly).
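
Here is a runnable sketch of the workaround's effect, again using Python's bundled SQLite. Since older SQLite versions lack RIGHT JOIN and column alias lists like c(full_count), the query is rewritten with the count side on the left of a LEFT JOIN and an inline AS alias; the names are invented for illustration:

```python
import sqlite3

# Sketch of the CTE workaround, adapted for SQLite: the count subquery
# goes on the left of a LEFT JOIN, which is equivalent to the
# RIGHT JOIN in the Postgres query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER, payload TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(10)])

def fetch_page(limit, offset):
    return conn.execute("""
        WITH cte AS (
            SELECT * FROM tbl WHERE col1 >= 3   -- 7 qualifying rows
        )
        SELECT sub.*, c.full_count
        FROM  (SELECT count(*) AS full_count FROM cte) c
        LEFT  JOIN (
            SELECT * FROM cte ORDER BY col1 LIMIT ? OFFSET ?
        ) sub ON 1
        ORDER BY sub.col1
    """, (limit, offset)).fetchall()

print(fetch_page(2, 1))    # [(4, 'row4', 7), (5, 'row5', 7)]
# With an over-large OFFSET we still get the count, on a row of NULLs:
print(fetch_page(2, 100))  # [(None, None, 7)]
```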

Erwin Brandstetter
  • By both LIMIT and conditions, we have rows to be returned, but with the given offset it would return no result. In that situation, how would we be able to get the row count? – Dani Mathew Jun 05 '18 at 16:09
  • Very nice, thanks. Works great when you're using pagination (e.g. with DataTables): just add this at the start of your SQL and save an extra query for the total count. – Ahmed Sunny Aug 28 '18 at 08:02
  • Could you elaborate on this if the counting could be dynamically enabled in the query via an input parameter? I have a similar requirement but the user decides if he wants the inline count or not. – julealgon Oct 09 '18 at 03:10
  • @julealgon: Please start a *new question* with the defining details. You can always link to this one for context and leave a comment here to link back (and get my attention) if you wish. – Erwin Brandstetter Oct 09 '18 at 03:23
  • For anyone wondering: if you also want to limit the COUNT(\*) done over the view, for example when you have a huge table and want to prevent counting everything beyond a certain number, then you can use COUNT(\*) OVER(ROWS BETWEEN CURRENT ROW AND 1000 FOLLOWING), where 1000 is the number where the count will stop, regardless of whether your query (without the LIMIT) would return even more rows. – Arthur Oct 30 '18 at 01:20
  • @Mansoor: that's the expected behavior. – Erwin Brandstetter Oct 31 '19 at 19:38
  • Erwin, Correct, My bad and sometimes I feel newbie when the workload is too much. – Mansoor Nov 02 '19 at 10:26
  • "but still cheaper than two separate queries" are you sure about that? Because in my example with 12 million records the single query takes 6 times more time than 2 separate queries. (LIMIT 10 OFFSET 0) – Kote Nov 22 '19 at 13:40
  • I'm sure it depends on the exact query, but in my case, I find with Postgres 12 that the CTE version generates a very similar query plan to the non-CTE version, and is approximately as fast. – Justin L. Mar 23 '20 at 23:03
  • @JustinL.: The added overhead should only be significant for relatively cheap base queries. Also, Postgres 12 has improved CTE performance in multiple ways. (Though this CTE is still [`MATERIALIZED`](https://www.postgresql.org/docs/current/queries-with.html) by default, being referenced twice.) – Erwin Brandstetter Mar 23 '20 at 23:40
  • Using the window function, I added "count(*) FILTER (WHERE [the same main where]) OVER() AS ..." to reflect the count of my main filter – AnselmoMS May 09 '20 at 13:58
  • @AnselmoMS: `count (*) OVER()` is applied *after* applying all filters (`WHERE` clauses) of the main query. See https://stackoverflow.com/a/8242764/939860. The aggregate `FILTER` clause only makes sense in this position to count a subset (apply *additional* filters). – Erwin Brandstetter May 12 '20 at 02:02
  • What's the `c(full_count)` part at the end of the query? `c` appears to be an alias, but then it's invoked like a function? – torkel Jun 10 '20 at 09:36
  • @torkel: `(full_count)` is a list of *column* aliases, just one name in this case. [The manual](https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-TABLE-ALIASES): `FROM table_reference [AS] alias ( column1 [, column2 [, ...]] )` – Erwin Brandstetter Jun 10 '20 at 10:29
  • Does anyone have a reference for how the TABLE keyword is used in this context? I reviewed the sections on the "FROM clause" for select queries and table expressions. I also reviewed most of the FROM clauses I could find in the manual, I still came up empty. – user2959071 Feb 23 '21 at 15:47
  • @user2959071: About `TABLE`, see https://stackoverflow.com/a/30276023/939860 – Erwin Brandstetter Feb 23 '21 at 17:43
  • Is there a way to achieve the same result without writing the query twice? – manzapanza Jan 11 '23 at 10:17

While Erwin Brandstetter's answer works like a charm, it returns the total count of rows in every row, like the following:

col1 - col2 - col3 - total
--------------------------
aaaa - aaaa - aaaa - count
bbbb - bbbb - bbbb - count
cccc - cccc - cccc - count

You may want to consider using an approach that returns the total count only once, like the following:

total - rows
------------
count - [{col1: 'aaaa', col2: 'aaaa', col3: 'aaaa'},
         {col1: 'bbbb', col2: 'bbbb', col3: 'bbbb'},
         {col1: 'cccc', col2: 'cccc', col3: 'cccc'}]

SQL query:

SELECT
    (SELECT COUNT(*) 
     FROM table
     WHERE /* sth */
    ) as count, 
    (SELECT json_agg(t.*) FROM (
        SELECT * FROM table
        WHERE /* sth */
        ORDER BY col1
        OFFSET ?
        LIMIT ?
    ) AS t) AS rows 
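
The same shape can be sketched in a runnable way with Python's bundled SQLite, where json_group_array/json_object (from SQLite's JSON1 functions) stand in for Postgres' json_agg; the table and filter are invented for illustration:

```python
import json
import sqlite3

# Sketch of the "count once, rows as a JSON array" approach, with
# SQLite's json_group_array/json_object in place of json_agg.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(i,) for i in range(10)])

total, rows_json = conn.execute("""
    SELECT
        (SELECT count(*) FROM tbl WHERE col1 >= 3) AS count,
        (SELECT json_group_array(json_object('col1', col1))
         FROM (SELECT col1 FROM tbl
               WHERE col1 >= 3
               ORDER BY col1
               LIMIT 2 OFFSET 1)) AS rows
""").fetchone()

print(total)                  # 7  (all qualifying rows)
print(json.loads(rows_json))  # [{'col1': 4}, {'col1': 5}]  (just the page)
```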
treecon

Edit: this answer is valid when retrieving the unfiltered table. I'll leave it here in case it could help someone, but it might not exactly answer the initial question.

Erwin Brandstetter's answer is perfect if you need an accurate value. However, on large tables you often only need a pretty good approximation. Postgres gives you just that, and it will be much faster since it does not need to evaluate each row:

SELECT *
FROM (
    SELECT *
    FROM tbl
    WHERE /* something */
    ORDER BY /* something */
    OFFSET ?
    LIMIT ?
    ) data
RIGHT JOIN (SELECT reltuples FROM pg_class WHERE relname = 'tbl') pg_count(total_count) ON true;

I'm actually not sure whether there is an advantage to externalizing the RIGHT JOIN, or to having it inline as in a standard query. It would deserve some testing.

SELECT t.*, pgc.reltuples AS total_count
FROM tbl as t
RIGHT JOIN pg_class pgc ON pgc.relname = 'tbl'
WHERE /* something */
ORDER BY /* something */
OFFSET ?
LIMIT ?
  • About fast count estimates: https://stackoverflow.com/a/7945274/939860 Like you said: valid when retrieving the whole table – which is contradicted by the `WHERE` clause in your queries. The second query is logically wrong (it retrieves one row for *every* table in the DB) – and more expensive when fixed. – Erwin Brandstetter May 12 '20 at 01:55

No.

There's perhaps some small gain you could theoretically get over running them individually, with enough complicated machinery under the hood. But if you want to know how many rows match a condition, you'll have to count them rather than just a LIMITed subset.

Richard Huxton