In Postgres, width_bucket() is just the thing you are looking for: it granulates any number of rows (N) in the underlying table into a given (preferably smaller) number of data points (n). You might add the count of rows contributing to each data point to signify its weight.
One minor obstacle: the variant of width_bucket() we need operates on double precision or numeric numbers, not on timestamp et al. Just extract the epoch to work with.
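For example, a quick sanity check of the epoch extraction (the literal timestamp is arbitrary; since Postgres 14, extract() returns numeric, older versions return double precision - width_bucket() has variants for both):

SELECT extract(epoch FROM timestamptz '2024-06-01 12:00:00+00') AS epoch_seconds;
-- number of seconds since 1970-01-01 00:00:00 UTC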
Assuming this table definition and a current Postgres version:
CREATE TABLE tbl (
tbl_id serial PRIMARY KEY
, value numeric NOT NULL
, created_at timestamptz NOT NULL
);
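To test-drive the query below, you can fill the table with random sample data; the row count and value ranges here are arbitrary assumptions, not part of the original setup:

INSERT INTO tbl (value, created_at)
SELECT round((100 + random() * 100)::numeric, 2)  -- random values between 100 and 200
     , now() - interval '30 days' * random()      -- random timestamps within the last 30 days
FROM   generate_series(1, 10000);                 -- N = 10000 rows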
Query:
SELECT width_bucket(extract(epoch FROM t.created_at), x.min_epoch, x.max_epoch, 400) AS pix
, round(avg(t.value), 2) AS avg -- round is optional
, count(*) AS weight
FROM   tbl t
CROSS  JOIN (
   SELECT extract(epoch FROM min(created_at)) AS min_epoch
        , extract(epoch FROM max(created_at)) AS max_epoch
   FROM   tbl
   ) x
GROUP BY 1
ORDER BY 1;
Result:
 pix |  avg   | weight
-----+--------+--------
   1 | 152.58 |      7
   2 | 155.16 |      8
   3 | 148.89 |      7
...
Returns up to 400 rows - unless N < n, in which case you get at most N rows (empty buckets produce no row). One caveat: the row(s) with the maximum created_at land in an extra bucket 401, because width_bucket() assigns values equal to the upper bound to the overflow bucket. Wrap the expression in least(..., 400) if you want them folded into the last regular bucket.
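If you want an actual timestamp per data point for plotting rather than the bucket number, you can translate the bucket back to its midpoint. A sketch based on the same query (the column name bucket_time is just an assumption):

SELECT to_timestamp(x.min_epoch
                  + (width_bucket(extract(epoch FROM t.created_at), x.min_epoch, x.max_epoch, 400) - 0.5)
                  * (x.max_epoch - x.min_epoch) / 400) AS bucket_time  -- midpoint of each bucket
     , round(avg(t.value), 2) AS avg
     , count(*) AS weight
FROM   tbl t
CROSS  JOIN (
   SELECT extract(epoch FROM min(created_at)) AS min_epoch
        , extract(epoch FROM max(created_at)) AS max_epoch
   FROM   tbl
   ) x
GROUP  BY 1
ORDER  BY 1;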