0

I am working on migrating a report from MySQL to Postgres and I am trying to get the latest records per each category group by Year and Month, in MySQL it looks like this:

 select Category,
        max(DATECOL) AS Date
 from Table
 group by Category, date_format(DATECOL,'%Y-%m')
 order by DATECOL desc;

+----------+------------+
| Category | Date       |
+----------+------------+
| A        | 2021-05-27 |
+----------+------------+
| B        | 2021-05-27 |
+----------+------------+
| A        | 2021-04-30 |
+----------+------------+
| B        | 2021-04-30 |
+----------+------------+
| A        | 2021-03-31 |
+----------+------------+
| B        | 2021-03-31 |
+----------+------------+ 

But When I try the following in Postgres it gives me a "Must include DATECOL in GROUP BY" error message and when I include DATECOL it just returns every possible dates. Is there a way to get the max records per category in Postgres? . Here is what I had tried in Postgres which returns the "Must include DATECOL in GROUP BY" error

 select Category,
        max(DATECOL) AS DATE
 from Table
 group by Category, concat(EXTRACT(year from DATECOL),'-', EXTRACT(month from DATECOL) )
 order by DATECOL desc;
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Sidhu177
  • 457
  • 1
  • 6
  • 13
  • 2
    No that did not work in MySQL as it simply returned random results rather than rejecting the invalid query. –  May 28 '21 at 21:40

2 Answers2

3

To get the latest date per category and month, simply group by both. Use to_char() to format any way you like:

SELECT category
     , to_char(datecol, 'YYYY-MM') AS mon
     , max(datecol) AS max_date
FROM   tbl
GROUP  BY 2, 1
ORDER  BY 2 DESC, 1;

mon does not have to be in the SELECT list. But then you cannot use ordinal positions as shorthand, of course:

SELECT category
     , max(datecol) AS max_date
FROM   tbl
GROUP  BY to_char(datecol, 'YYYY-MM'), category
ORDER  BY to_char(datecol, 'YYYY-MM') DESC, category;

Downside: formatted text may not sort right (YYYY-MM is typically no problem, of course). And for big tables, date_trunc() is a bit cheaper. And since we are not even displaying it:

SELECT category
     , max(datecol) AS max_date
FROM   tbl
GROUP  BY date_trunc('month', datecol), 1
ORDER  BY date_trunc('month', datecol) DESC, 1;

To display and format any way you like:

SELECT category
     , to_char(date_trunc('month', datecol), 'YYYY-MM') AS mon
     , max(datecol) AS max_date
FROM   tbl
GROUP  BY date_trunc('month', datecol), category
ORDER  BY date_trunc('month', datecol) DESC, category;

You have to repeat the GROUP BY expression (unless you push it into a subquery), even if that makes no difference for to_char().

Related:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

One way to solve , is to use window functions: Postgres you

select category, 
       DateCol 
from (
  select category, 
         DateCol,
         row_number() over (partition by category, date_trunc('month', DateCol ) 
                            order by DateCol desc) as rn
  from table
) t
where rn = 1;
Sidhu177
  • 457
  • 1
  • 6
  • 13