How to group by a, b and return set of N rows of b

Question

Using Postgres 9.3.2, I want to count rows per req_status, grouped by req_time and customer_id, and return a row for each of the n req_status values for every customer_id, even when the count is zero.

req_time     req_id   customer_id   req_status
----------------------------------------------
2014-03-19   100      1             'FAILED'
2014-03-19   102      1             'FAILED'
2014-03-19   105      1             'OK'
2014-03-19   106      2             'FAILED'
2014-03-20   107      1             'OK'
2014-03-20   108      2             'FAILED'
2014-03-20   109      2             'OK'
2014-03-20   110      1             'OK'

Output

req_time     customer_id   req_status   count
---------------------------------------------
2014-03-19   1             'FAILED'     2
2014-03-19   1             'OK'         1
2014-03-19   2             'FAILED'     1
2014-03-19   2             'OK'         0
2014-03-20   1             'FAILED'     0
2014-03-20   1             'OK'         2
2014-03-20   2             'FAILED'     1
2014-03-20   2             'OK'         1

How can I achieve this?

Erwin Brandstetter · Accepted Answer · 2023-06-25T22:36:06.263

To also see missing rows in the result, LEFT JOIN to a complete grid of possible rows. The grid is built from all possible combinations of (req_time, customer_id, req_status):

SELECT d.req_time, c.customer_id, s.req_status, count(t.req_time) AS ct
FROM  (
   SELECT generate_series (min(req_time)
                         , max(req_time)
                         , '1 day')::date
   FROM   tbl
   ) d(req_time)
CROSS  JOIN (SELECT DISTINCT customer_id FROM tbl)  c(customer_id)
CROSS  JOIN (VALUES ('FAILED'::text), ('OK'))       s(req_status)
LEFT   JOIN  tbl t USING (req_time, customer_id, req_status)
GROUP  BY 1, 2, 3
ORDER  BY 1, 2, 3;

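Count a column from the actual table (here t.req_time), not count(*). After the LEFT JOIN that column is null wherever no match is found, and count() ignores null values, so the count comes out as 0 for those rows.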
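The difference between counting a joined column and counting rows can be seen in isolation with a small sketch. The aliases g and v and the inline values are made up for illustration and are not part of the question's table:

```sql
-- g is a tiny "grid"; v stands in for the real table.
-- x = 2 has no match in v, so the LEFT JOIN yields one row with v.x IS NULL.
SELECT g.x
     , count(*)   AS count_star  -- counts the null-extended row as well
     , count(v.x) AS count_col   -- ignores null, so unmatched rows give 0
FROM  (VALUES (1), (2)) g(x)
LEFT   JOIN (VALUES (1), (1)) v(x) ON v.x = g.x
GROUP  BY g.x
ORDER  BY g.x;
```

For x = 2, count_star is 1 while count_col is 0, which is why the query counts t.req_time.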

This assumes req_time is of type date (not timestamp).
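To try the query, a minimal setup could look like the following. The column types are assumptions, since the question does not state them; the table name tbl is the one used in the query, and the rows are the sample data from the question:

```sql
-- Assumed schema: column types are not given in the question.
CREATE TABLE tbl (
   req_time    date
 , req_id      int
 , customer_id int
 , req_status  text
);

-- Sample rows copied from the question.
INSERT INTO tbl (req_time, req_id, customer_id, req_status) VALUES
   ('2014-03-19', 100, 1, 'FAILED')
 , ('2014-03-19', 102, 1, 'FAILED')
 , ('2014-03-19', 105, 1, 'OK')
 , ('2014-03-19', 106, 2, 'FAILED')
 , ('2014-03-20', 107, 1, 'OK')
 , ('2014-03-20', 108, 2, 'FAILED')
 , ('2014-03-20', 109, 2, 'OK')
 , ('2014-03-20', 110, 1, 'OK');
```

With this data the grid has 2 days, 2 customers and 2 statuses, so the query returns 8 rows, including the two combinations with a count of 0.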

Similar:

array_agg group by and null

Thanks Erwin. Your query will exclude rows with count=0. Look again at how I want the output to be. — McKibet, Mar 21 '14 at 19:19

score 1 · Answer 2 · answered Mar 21 '14 at 19:16

select
    s.req_time, s.customer_id,
    s.req_status,
    count(t.req_status is not null or null) as "count"
from
    t
    right join (
        (
            select distinct customer_id, req_time
            from t
        ) q
        cross join
        (values ('FAILED'), ('OK')) s(req_status)
    ) s on
        t.req_status = s.req_status and
        t.customer_id = s.customer_id and
        t.req_time = s.req_time
group by 1, 2, 3
order by 1, 2, 3
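The expression count(t.req_status is not null or null) relies on three-valued logic: "true or null" is true and gets counted, while "false or null" is null and is skipped. A minimal sketch, with a made-up inline value list, comparing it to counting the column directly:

```sql
-- x(v) is an illustrative inline table, not part of the question.
SELECT count(v IS NOT NULL OR NULL) AS via_expression
     , count(v)                     AS via_column
FROM  (VALUES ('OK'), (NULL), ('FAILED')) x(v);
```

Both columns count only the two non-null values, so count(t.req_status) should be an equivalent, shorter spelling in the query above.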
