I have below referenced query which groups studies counts by teacher, study year-month, and room for the past 12 months (including current month). The result I get is correct, however, I would like to include rows with zero counts for when the data is missing.
I looked at several other related posts but could not get desired output:
- Postgres - how to return rows with 0 count for missing data?
- Postgresql group month wise with missing values
- Best way to count records by arbitrary time intervals in Rails+Postgres
Here is the query:
SELECT
upper(trim(t.full_name)) AS teacher
, date_trunc('month', s.study_dt)::date AS study_month
, r.room_code AS room
, COUNT(1) AS study_count
FROM
studies AS s
LEFT OUTER JOIN rooms AS r ON r.id = s.room_id
LEFT OUTER JOIN teacher_contacts AS tc ON tc.id = s.teacher_contact_id
LEFT OUTER JOIN teachers AS t ON t.id = tc.teacher_id
WHERE
s.study_dt BETWEEN now() - interval '13 month' AND now()
AND s.study_dt IS NOT NULL
GROUP BY
teacher
, study_month
, room
ORDER BY
teacher
, study_month
, room;
The output I get:
"teacher","study_month","room","study_count"
"DOE, JOHN","2015-07-01","A1",1
"DOE, JOHN","2015-12-01","A2",1
"DOE, JOHN","2016-01-01","B1",1
"SIMPSON, HOMER","2016-05-01","B2",3
"MOUSE, MICKEY","2015-08-01","A2",1
"MOUSE, MICKEY","2015-11-01","B1",1
"MOUSE, MICKEY","2015-11-01","B2",2
But I want count of 0 to show for all missing year-month and room combinations. For example (just first rows, there are 4 rooms in all: A1, A2, B1, B2):
"teacher","study_month","room","study_count"
"DOE, JOHN","2015-07-01","A1",1
"DOE, JOHN","2015-07-01","A2",0
"DOE, JOHN","2015-07-01","B1",0
"DOE, JOHN","2015-07-01","B2",0
...
"DOE, JOHN","2015-12-01","A1",1
"DOE, JOHN","2015-12-01","A2",0
"DOE, JOHN","2015-12-01","B1",0
"DOE, JOHN","2015-12-01","B2",0
...
To get the missing year-months, I tried left outer join on using time series and joining on time_range.year_month = study_month
, but it didn't work.
SELECT date_trunc('month', time_range)::date AS year_month
FROM generate_series(now() - interval '13 month', now() ,'1 month') AS time_range
So, I'd like to know how to 'fill in the gaps' for
a) both year-month and room and, as a bonus: b) just a year-month.
The reason for this is that the dataset would be fed to a pivot library to that we can get an output similar to following (could not do this in PG directly):
teacher,room,2015-07,...,2015-12,...,2016-07,total
"DOE, JOHN",A1,1,...,1,...,0,2
"DOE, JOHN",A2,0,...,0,...,0,0
...and so on...