The query below groups the results from first
into 4 equally spaced date bins and aggregates an average for the_value
in each bin.
WITH first as(
SELECT
extract(EPOCH FROM foo.t_date) as the_date,
foo_val as the_value
FROM bar
INNER JOIN foo
ON
foo.user_id = bar.x_id
and
foo.user_name = 'xxxx'
)
SELECT bin, round(sum(bin_sum) OVER w /sum(bin_ct) OVER w, 2) AS running_avg
FROM (
SELECT width_bucket(first.the_date
, x.min_epoch, x.max_epoch, x.bins) AS bin
, sum(first.the_value) AS bin_sum
, count(*) AS bin_ct
FROM first
, (SELECT MIN(first.the_date) AS min_epoch
, MAX(first.the_date) AS max_epoch
, 4 AS bins
FROM first
) x
GROUP BY 1
) sub
WINDOW w AS (ORDER BY bin)
ORDER BY 1;
I would like to be able to only calculate the average for the lowest say 20 the_value
's in each bin. From other posts here on Stackoverflow I have seen that this is possible and that perhaps ORDER BY the_value
and rank()
is the best way to go about it. But my struggle is that I'm not sure where my current query should be modified to implement this.
Any insight would be appreciated.
Postgres version 9.3