Proper SQL
I want to get 3 message groups in the below order: [1,2], [3,4], [5]
To get the requested order, add ORDER BY min(id):
SELECT grp, user_id, array_agg(id) AS ids
FROM  (
   SELECT id
        , user_id
        , row_number() OVER (ORDER BY id) -
          row_number() OVER (PARTITION BY user_id ORDER BY id) AS grp
   FROM   tbl
   ORDER  BY 1   -- for ordered arrays in result
   ) t
GROUP  BY grp, user_id
ORDER  BY min(id);
db<>fiddle here
Old sqlfiddle
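To see why the difference of the two row_number() calls identifies consecutive runs, it can help to look at the intermediate values. The following is only a sketch on hypothetical sample data (the five rows are an assumption chosen to match the requested groups [1,2], [3,4], [5]; table and column names follow the query above):

```sql
-- Hypothetical sample data, not taken from the question.
CREATE TEMP TABLE tbl (id int PRIMARY KEY, user_id int);
INSERT INTO tbl VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 1);

-- Expose both row numbers and their difference side by side.
SELECT id
     , user_id
     , row_number() OVER (ORDER BY id)                      AS rn_all
     , row_number() OVER (PARTITION BY user_id ORDER BY id) AS rn_user
     , row_number() OVER (ORDER BY id) -
       row_number() OVER (PARTITION BY user_id ORDER BY id) AS grp
FROM   tbl
ORDER  BY id;

--  id | user_id | rn_all | rn_user | grp
--   1 |       1 |      1 |       1 |   0
--   2 |       1 |      2 |       2 |   0
--   3 |       2 |      3 |       1 |   2
--   4 |       2 |      4 |       2 |   2
--   5 |       1 |      5 |       3 |   2
```

Note that grp = 2 shows up for both user 2 (ids 3, 4) and user 1 (id 5). The difference is only unique per user, which is why the outer query groups by grp and user_id together.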
The addition would barely warrant another answer. The more important issue is this:
Faster with PL/pgSQL
I'm using PostgreSQL and would be happy to use something specific to it, whatever would give the best performance.
Pure SQL is all nice and shiny, but a procedural server-side function is much faster for this task. While processing rows procedurally is generally slower, plpgsql wins this competition big-time, because it can make do with a single table scan and a single ORDER BY operation:
CREATE OR REPLACE FUNCTION f_msg_groups()
  RETURNS TABLE (ids int[])
  LANGUAGE plpgsql AS
$func$
DECLARE
   _id   int;
   _uid  int;
   _id0  int;   -- id of last row
   _uid0 int;   -- user_id of last row
BEGIN
   FOR _id, _uid IN
      SELECT id, user_id FROM messages ORDER BY id
   LOOP
      IF _uid <> _uid0 THEN
         RETURN QUERY VALUES (ids);   -- output finished group (never fires on the first row: _uid0 is still NULL)
         ids := ARRAY[_id];           -- start new array
      ELSE
         ids := ids || _id;           -- add to array (also initializes it on the first row)
      END IF;

      _id0  := _id;
      _uid0 := _uid;                  -- remember last row
   END LOOP;

   RETURN QUERY VALUES (ids);         -- output last group
END
$func$;
Call:
SELECT * FROM f_msg_groups();
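As a quick sanity check, the function can be tried on a handful of rows. This is a minimal sketch, assuming f_msg_groups() has been created as defined above and using hypothetical sample data (the rows below are an assumption, chosen to produce the groups asked for in the question):

```sql
-- Hypothetical sample data in a temporary table named like the one
-- the function reads from.
CREATE TEMP TABLE messages (id int PRIMARY KEY, user_id int);
INSERT INTO messages VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 1);

SELECT * FROM f_msg_groups();

--   ids
-- -------
--  {1,2}
--  {3,4}
--  {5}
```

Two properties of the function as written are worth keeping in mind: on an empty table the final RETURN QUERY still emits a single row with a NULL array, and the comparison _uid <> _uid0 evaluates to NULL whenever user_id is NULL, so rows with a NULL user_id are appended to the current group instead of starting a new one.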
Benchmark and links
I ran a quick test with EXPLAIN ANALYZE on a similar real-life table with 60k rows (execute several times, pick the fastest result to exclude caching effects):
SQL:
Total runtime: 1009.549 ms
PL/pgSQL:
Total runtime: 336.971 ms
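A comparison along these lines could be reproduced roughly as follows. This is a sketch, not the exact commands behind the numbers above; it assumes both variants read from the same table, here called messages as in the function:

```sql
-- Pure SQL variant
EXPLAIN ANALYZE
SELECT grp, user_id, array_agg(id) AS ids
FROM  (
   SELECT id
        , user_id
        , row_number() OVER (ORDER BY id) -
          row_number() OVER (PARTITION BY user_id ORDER BY id) AS grp
   FROM   messages
   ORDER  BY 1
   ) t
GROUP  BY grp, user_id
ORDER  BY min(id);

-- PL/pgSQL variant
EXPLAIN ANALYZE
SELECT * FROM f_msg_groups();
```

Run each statement several times and compare the fastest timings. Newer PostgreSQL versions label the final line "Execution Time" rather than "Total runtime".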
Related: