I ran into this problem myself repeatedly and have not found the one solution.
My latest approach was to define the first row of the returned SET to be meta-data. You don't have that one in your list of approaches, yet.
The application uses the first row for bookkeeping. Actual data starts with the second row.
The obvious weakness: you have to make do with whatever row definition you have to squeeze in your meta-data. A simple total count will fit into any numeric or string type, though.
And you have to special-case the first row in the application, of course.
This simple example returns rows from a table foo
defined as
CREATE TABLE foo (
foo_id serial PRIMARY KEY
,foo text
);
Page-size is 20 rows, pre-set per default in the function header:
CREATE OR REPLACE FUNCTION f_paginate(_max_id int, _limit int = 20
, _offset int = 0)
RETURNS TABLE(foo_id int, foo text) AS
$BODY$
BEGIN
SELECT INTO foo_id count(*)::int
FROM foo f
WHERE f.foo_id < _max_id; -- get count
RETURN NEXT; -- use first row for meta-data
RETURN QUERY -- actual data starts with second row
SELECT f.foo_id, f.foo
FROM foo f
WHERE f.foo_id < _max_id
LIMIT _limit
OFFSET _offset;
END;
$BODY$
LANGUAGE plpgsql;
Call:
SELECT * FROM f_paginate(100);
Returns:
foo_id | foo
-------+----
86 | <NULL> <-- first row = meta-data
1 | bar <-- actual data
2 | baz
... 18 more ...
Obviously this method can save some bandwidth with a higher _limit (page size). With only a few rows, it's hardly worth the overhead.
The alternative approach would be your "Approach one" - add a column redundantly:
CREATE OR REPLACE FUNCTION f_paginate2(_max_id int, _limit int = 20
, _offset int = 0)
RETURNS TABLE(foo_id int, foo text, ct bigint) AS
$BODY$
BEGIN
RETURN QUERY
SELECT f.foo_id, f.foo, count(*) OVER ()
FROM foo f
WHERE f.foo_id < _max_id
LIMIT _limit
OFFSET _offset;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Call:
SELECT * FROM f_paginate2(100);
Returns:
foo_id | foo | ct
-------+-----+----
1 | bar | 86
2 | baz | 86
... 18 more ...
Performance is very similar in this simple case. The first query is slightly faster, but probably only because count(*) OVER ()
slows down the second query. A separate run with just count(*)
is faster.