-2

I have a simple calendar year monthly data set returned but I know I will have missing months (i.e. no data).

I found this answer here:

Best way to count records by arbitrary time intervals in Rails+Postgres

It's close but I do not understand the JOIN USING syntax.

I have my side figured out for the most part:

SELECT TO_CHAR(activity_tasks.start_date, 'MON') AS month,
EXTRACT(MONTH FROM activity_tasks.start_date) AS month_num,
SUM(cost_planned) FILTER (WHERE activity_tasks.aasm_state IN ('open', 'planned' ) )  AS planned,
SUM(cost_actual) FILTER (WHERE activity_tasks.aasm_state IN ('closed' ) ) AS actual
FROM "activity_tasks"
WHERE activity_tasks.start_date >= '2020-01-01' AND activity_tasks.start_date <= '2020-12-31'
GROUP BY month, month_num
ORDER BY month_num

Which gets me:

+-------+-----------+---------+--------+
| month | month_num | planned | actual |
+-------+-----------+---------+--------+
| NOV   | 11        | NULL    | 123    |
| DEC   | 12        | 500     | NULL   |
+-------+-----------+---------+--------+

I just want to join:

SELECT *
FROM  (
   SELECT to_char(m, 'MON') AS mmm
   FROM generate_series('2020-01-01', '2020-12-31', interval '1 month') m
) m

in to one simple expression and have a full JAN-DEC dataset including the empty months.

Bonus Side Question: is there a more elegant SQL to build the generate_series('2020-01-01', '2020-12-31', interval '1 month') array of months and set my WHERE activity_tasks.start_date >= '2020-01-01' AND activity_tasks.start_date <= '2020-12-31'? In my case the default is Jan 1 of a given year BUT there may be a case where the accounting fiscal year is some other month and looking to avoid duplicate logic here and perhaps just feed in a single start date.

Dan Tappin
  • 2,692
  • 3
  • 37
  • 77

1 Answers1

1

You can generate all starts of months with generate_series(), then bring the table with a left join:

select 
    to_char(d.start_date, 'mon') as month,
    extract(month from d.start_date) as month_num,
    sum(cost_planned) filter (where t.aasm_state in ('open', 'planned' ) )  as planned,
    sum(cost_actual)  filter (where t.aasm_state = 'closed') as actual
from generate_series('2020-01-01'::date, '2020-12-01'::date, '1 month') d(start_date)
left join activity_tasks t
    on t.start_date >= d.start_date and t.start_date < d.start_date + '1 month'::interval
group by start_date
order by start_date

You can easily change the arguments of generate_series() to accomodate a different fiscal year.

GMB
  • 216,147
  • 25
  • 84
  • 135
  • Works great . I have a side question which should have been part of the question. I have a parent table `activities` that has a `type` column. I want to join that table also and add a `where` clause to filter out specific types. etc. – Dan Tappin Dec 15 '20 at 17:45
  • I just updated my question. I removed that second join from my first question because I thought it would be trivial to add that back in. I thought that was an easy join to add back in but I am obviously missing something here as when I do I get the old results and the 'empty' months don't show up anymore. – Dan Tappin Dec 15 '20 at 18:03
  • 1
    @DanTappin: you should not be modifying the question after answer answers were posted (and accepted). If you have a new question, then you should ask *a new question*. – GMB Dec 15 '20 at 18:05
  • 1
    Thanks for the feedback - I will revert it and post a second question – Dan Tappin Dec 15 '20 at 18:06
  • FYI - new question here: https://stackoverflow.com/questions/65313068/chained-join-not-filtering-as-expected. Hopefully a easy answer for you. – Dan Tappin Dec 15 '20 at 20:47
  • 1
    Ended- up figuring it out with a sub-select for the JOIN. Your answer here cleared the way for that. Thanks. – Dan Tappin Dec 15 '20 at 23:53