1

I currently have a query like so -

SELECT 
   SUM(CASE month WHEN 'January' THEN 1 ELSE 0 END) AS "Jan", 
   SUM(CASE month WHEN 'February' THEN 1 ELSE 0 END) AS "Feb", 
   SUM(CASE month WHEN 'March' THEN 1 ELSE 0 END) AS "Mar", 
   SUM(CASE month WHEN 'April' THEN 1 ELSE 0 END) AS "April", 
   SUM(CASE month WHEN 'May' THEN 1 ELSE 0 END) AS "May", 
   SUM(CASE month WHEN 'June' THEN 1 ELSE 0 END) AS "June"   
FROM tbl
WHERE start >= '2021-01-01'
AND   start <= '2021-06-30'

This query needs to run for the last 6 months every month. For example, since the current month has not ended yet, this query needs to run from Jan 1 to June 30. How do I automate this query so I don't have to change the CASE statements or the dates in the WHERE clause every month.

The output I am expecting

Jan  Feb  Mar  Apr  May  June
2    2    3    6    1    4
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Aaron
  • 1,345
  • 2
  • 13
  • 32

3 Answers3

1

The column month seems to be redundant. Drop it from the table. start has all the information you need.
(I'd rather not use start as column name as that's a keyword in standard SQL - even if allowed in Postgres.)

SELECT date_trunc('month', start) AS mon, count(*) AS ct
FROM   tbl
WHERE  start >= '2021-01-01'
AND    start <  '2021-07-01'
GROUP  BY 1
ORDER  BY 1;

Use date_trunc() to preserve chronological order. If you need month names in the result:

WITH cte(current_mon) AS (SELECT date_trunc('month', LOCALTIMESTAMP))
SELECT to_char(mon, 'Mon') AS month, COALESCE(data.ct, 0) AS ct
FROM   cte c
CROSS  JOIN generate_series(c.current_mon - interval '6 mon'
                          , c.current_mon - interval '1 mon' 
                          , interval '1 mon') mon
LEFT   JOIN (
   SELECT date_trunc('month', start) AS mon, count(*)::int AS ct
   FROM   tbl, cte c
   WHERE  start >= c.current_mon - interval '6 mon'
   AND    start <  c.current_mon
   GROUP  BY 1
   ) data USING (mon)
ORDER  BY mon;

db<>fiddle here

Returns one row per month, in chronological order (considering the year, too, though it's not in your output!), and truly dynamic.

month ct
Jan 31
Feb 28
Mar 31
Apr 0
May 31
Jun 30

Note how I first build timestamps for the last six months (excluding the current one) with generate_series() in the first subquery mon. See:

Then LEFT JOIN to counts per month from the relevant time range. This way always returns the last 6 months, even if no rows are found at all. COALESCE makes the count 0 instead of NULL for that case. Related:

Note, in particular, that it's faster to aggregate first and join later. See:

Using standard English month names and 3-letter-abbreviations.

Your original query produces that information in pivoted form: one month per column. But dynamic column names are not possible for a static SQL query. If you really need that, you need a 2-step flow of operation (two round trips to the server):

  1. Build the query.
  2. Execute it.

Well, you could prepare 12 different row types (that's the range of possible result types for your case) and use a polymorphic function to achieve it. But do you actually need the pivoted form?

OK, you asked for it ...

You want a simple call like this?

SELECT * FROM f_tbl_counts_6months(NULL::m6_jul);

It's possible. Here is a proof of concept.
But, honestly, I'd rather avoid the complication and just use the simple query above.

Create a polymorphic function:

CREATE OR REPLACE FUNCTION f_tbl_counts_6months(ANYELEMENT)
  RETURNS SETOF ANYELEMENT 
  LANGUAGE plpgsql AS
$func$
DECLARE
   _current_mon timestamp := date_trunc('month', LOCALTIMESTAMP);
BEGIN
   -- to prevent incorrect column names, input row type must match current date:
   IF right(pg_typeof($1)::text, 3) = to_char(_current_mon, 'mon') THEN
      -- all good!
   ELSE
      RAISE EXCEPTION 'Current date is %. Function requires input >>%<<'
                    , CURRENT_DATE, 'NULL::m6_' || to_char(now(), 'mon');
   END IF;

   RETURN QUERY
   SELECT a[2], a[2], a[3], a[4], a[5], a[6]
   FROM (
      SELECT ARRAY(
         SELECT COALESCE(data.ct, 0)
         FROM   generate_series(_current_mon - interval '6 mon'
                              , _current_mon - interval '1 mon'
                              , interval '1 mon') mon
         LEFT   JOIN (
            SELECT date_trunc('month', start) AS mon, count(*)::int AS ct
            FROM   tbl
            GROUP  BY 1
            ) data USING (mon)
         ORDER  BY mon
         )
      ) sub(a);
END
$func$;

And 12 composite (row) types, one for each month of the year:

CREATE TYPE m6_jan AS ("Jul" int, "Aug" int, "Sep" int, "Oct" int, "Nov" int, "Dec" int);
CREATE TYPE m6_feb AS ("Aug" int, "Sep" int, "Oct" int, "Nov" int, "Dec" int, "Jan" int);
CREATE TYPE m6_mar AS ("Sep" int, "Oct" int, "Nov" int, "Dec" int, "Jan" int, "Feb" int);
CREATE TYPE m6_apr AS ("Oct" int, "Nov" int, "Dec" int, "Jan" int, "Feb" int, "Mar" int);
CREATE TYPE m6_may AS ("Nov" int, "Dec" int, "Jan" int, "Feb" int, "Mar" int, "Apr" int);
CREATE TYPE m6_jun AS ("Dec" int, "Jan" int, "Feb" int, "Mar" int, "Apr" int, "May" int);
CREATE TYPE m6_jul AS ("Jan" int, "Feb" int, "Mar" int, "Apr" int, "May" int, "Jun" int);
CREATE TYPE m6_aug AS ("Feb" int, "Mar" int, "Apr" int, "May" int, "Jun" int, "Jul" int);
CREATE TYPE m6_sep AS ("Mar" int, "Apr" int, "May" int, "Jun" int, "Jul" int, "Aug" int);
CREATE TYPE m6_oct AS ("Apr" int, "May" int, "Jun" int, "Jul" int, "Aug" int, "Sep" int);
CREATE TYPE m6_nov AS ("May" int, "Jun" int, "Jul" int, "Aug" int, "Sep" int, "Oct" int);
CREATE TYPE m6_dec AS ("Jun" int, "Jul" int, "Aug" int, "Sep" int, "Oct" int, "Nov" int);

Then the simple function call works and returns exactly your desired result:

SELECT * FROM f_tbl_counts_6months(NULL::m6_jul);
Jan Feb Mar Apr May Jun
31 28 31 0 31 30

Why? How? See:

You are required to call with the right type. I built in a fail-safe to prevent wrong results. If you call with the wrong type, like the following call in July (currently):

SELECT * FROM f_tbl_counts_6months(NULL::m6_nov);

... the function throws an exception with instructions:

ERROR:  Current date is 2021-07-15. Function requires input >>NULL::m6_jul<<
CONTEXT:  PL/pgSQL function f_tbl_counts_6months(anyelement) line 9 at RAISE
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • This does not align with the output I am expecting. I have updated the question with the updated output. – Aaron Jul 15 '21 at 20:58
  • Yes, I need the pivoted form for a report. – Aaron Jul 15 '21 at 21:04
  • I can't add 12 different CASE statements because then the report will have 12 columns. I only need columns for the last 6 months. – Aaron Jul 15 '21 at 21:07
  • @Aaron: The core question is: do you need dynamic column names? That's the tricky part. (And typically not what you want in client code anyway!) The rest is not that hard. – Erwin Brandstetter Jul 15 '21 at 21:17
  • I do need dynamic column names. If that is not an option, what would you suggest? – Aaron Jul 15 '21 at 21:24
0

Go back 1 month from the date now : the result is the end date : DATEADD(MONTH, -1, GETDATE())

7 months back to be the start date : DATEADD(MONTH, -7, GETDATE())

To calculate the first day of the month, subtract today from itself : (DAY(CURRENT_TIMESTAMP) - 1)

get end date

    sql => select DATEADD(MONTH, -1, GETDATE()) - (DAY(CURRENT_TIMESTAMP) - 1)
    postgresql => SELECT now() - INTERVAL '1 month' - (extract(day from now()) - 1 || ' day')::INTERVAL;

get start date

    sql => select DATEADD(MONTH, -7, GETDATE()) - (DAY(CURRENT_TIMESTAMP) - 1)
    postgresql => select now() - INTERVAL '7 month' - (extract(day from now()) - 1 || ' day')::INTERVAL;

query

SELECT 
  SUM(CASE month WHEN 'January' THEN 1 ELSE 0 END) AS "Jan", 
  SUM(CASE month WHEN 'February' THEN 1 ELSE 0 END) AS "Feb", 
  SUM(CASE month WHEN 'March' THEN 1 ELSE 0 END) AS "Mar", 
  SUM(CASE month WHEN 'April' THEN 1 ELSE 0 END) AS "April", 
  SUM(CASE month WHEN 'May' THEN 1 ELSE 0 END) AS "May", 
  SUM(CASE month WHEN 'June' THEN 1 ELSE 0 END) AS "June"   
FROM dateTable
WHERE start >= now() - INTERVAL '7 month' - (extract(day from now()) - 1 || ' day')::INTERVAL
AND start <= now() - INTERVAL '1 month' - (extract(day from now()) - 1 || ' day')::INTERVAL;

postgresql demo in dbfiddle

Meysam Asadi
  • 6,438
  • 3
  • 7
  • 17
-1

Try this:

DECLARE @STARTDATE AS DATE = DATEADD(d, -31, DATEADD(m, DATEDIFF(m, -1, GETDATE()) - 6, 0))
DECLARE @ENDDATE AS DATE = DATEADD(d, -1, DATEADD(m, DATEDIFF(m, -1,  GETDATE()) - 1, 0)) 

SELECT 
SUM(CASE month WHEN 'January' THEN 1 ELSE 0 END) AS "Jan", 
SUM(CASE month WHEN 'February' THEN 1 ELSE 0 END) AS "Feb", 
SUM(CASE month WHEN 'March' THEN 1 ELSE 0 END) AS "Mar", 
SUM(CASE month WHEN 'April' THEN 1 ELSE 0 END) AS "April", 
SUM(CASE month WHEN 'May' THEN 1 ELSE 0 END) AS "May", 
SUM(CASE month WHEN 'June' THEN 1 ELSE 0 END) AS "June"   
WHERE start >= @STARTDATE
AND start <= @ENDDATE