I'm using PostgreSQL and need to select the max of each group. The table represents the products sold on each day, and I want to know the top-selling product of each day.

SELECT sum(detalle_orden.cantidad) as suma,detalle_orden.producto_id as producto
      ,to_char(date_trunc('day',orden.fecha AT TIME ZONE 'MST'),'DY') as dia
FROM detalle_orden
LEFT JOIN orden ON orden.id = detalle_orden.order_id
GROUP BY orden.fecha,detalle_orden.producto_id 
ORDER BY dia,suma desc

It returns:

suma  producto  dia
4     1         FRI
1     2         FRI
5     3         TUE
2     2         TUE

I want to get:

suma  producto  dia
4     1         FRI
5     3         TUE

Only the top product of each day (with the max(suma) of each group).

I tried different approaches, such as subqueries, but the aggregate function makes things a bit difficult.

Erwin Brandstetter
Ricardo Umpierrez
  • `Only the top product of each day.( with the max(suma) of each group)` It's a common oversight that more than one product may tie for the highest value. You need to define *exactly* what you want to happen in such cases. – Erwin Brandstetter May 10 '15 at 23:42

3 Answers


You can (ab)use SELECT DISTINCT ON with the appropriate ordering clause. Assuming you made your previous query into a view:

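SELECT DISTINCT ON (dia) * FROM some_view ORDER BY dia, suma DESC;

DISTINCT ON (dia) ensures that only one row is retained for every day, and the ORDER BY ensures it is the correct one: the row with the highest suma for that day. (Listing producto in DISTINCT ON as well would keep one row per day and product, which is not what the question asks for.)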
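To make the view approach concrete, here is a minimal sketch. The table definitions, column types, and sample rows are assumptions made up for illustration (the question does not show its schema); only the view body comes from the question, and some_view is the placeholder name used above.

```sql
-- Assumed minimal schema; the column types are guesses, not from the question.
CREATE TEMP TABLE orden (id int PRIMARY KEY, fecha timestamptz);
CREATE TEMP TABLE detalle_orden (order_id int, producto_id int, cantidad int);

-- Hypothetical sample data: one order on a Friday, one on a Tuesday.
INSERT INTO orden VALUES (1, '2015-05-08 12:00+00'), (2, '2015-05-05 12:00+00');
INSERT INTO detalle_orden VALUES (1, 1, 4), (1, 2, 1), (2, 3, 5), (2, 2, 2);

-- The query from the question, wrapped in a view (without its ORDER BY).
CREATE TEMP VIEW some_view AS
SELECT sum(d.cantidad) AS suma
     , d.producto_id   AS producto
     , to_char(date_trunc('day', o.fecha AT TIME ZONE 'MST'), 'DY') AS dia
FROM   detalle_orden d
LEFT   JOIN orden o ON o.id = d.order_id
GROUP  BY o.fecha, d.producto_id;

-- One row per dia: the first row of each group in ORDER BY order is kept.
SELECT DISTINCT ON (dia) *
FROM   some_view
ORDER  BY dia, suma DESC;
-- suma | producto | dia
--    4 |        1 | FRI
--    5 |        3 | TUE
```

The leading expressions of ORDER BY must match the DISTINCT ON expressions; the remaining sort keys decide which row of each group survives.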

b0fh

You can easily get it with the window function rank():

select * from
(
select suma,producto,dia, rank() over (partition by dia order by suma desc) as ranking
from your_query
)A
where ranking = 1

So your final query will be something like:

select * from
(
select suma, producto, dia, rank() over (partition by dia order by suma desc) as ranking
from
(
SELECT sum(detalle_orden.cantidad) as suma
      ,detalle_orden.producto_id as producto
      ,to_char(date_trunc('day',orden.fecha AT TIME ZONE 'MST'),'DY') as dia
FROM detalle_orden
LEFT JOIN orden ON orden.id = detalle_orden.order_id
GROUP BY orden.fecha,detalle_orden.producto_id
) B
) A
where ranking = 1
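How rank() behaves when two products tie can be checked with a small self-contained sketch. The CTE name ventas and its rows are hypothetical: they are the question's sample output with the second TUE row changed to 5 so that it ties for first place.

```sql
-- Hypothetical stand-in for the aggregated result; (5, 2, 'TUE') is altered to force a tie.
WITH ventas (suma, producto, dia) AS (
   VALUES (4, 1, 'FRI'), (1, 2, 'FRI'), (5, 3, 'TUE'), (5, 2, 'TUE')
)
SELECT suma, producto, dia
FROM  (
   SELECT suma, producto, dia
        , rank() OVER (PARTITION BY dia ORDER BY suma DESC) AS ranking
   FROM   ventas
   ) sub
WHERE  ranking = 1
ORDER  BY dia, producto;
-- suma | producto | dia
--    4 |        1 | FRI
--    5 |        2 | TUE
--    5 |        3 | TUE
```

Both TUE rows receive rank 1, so both are returned. Replacing rank() with row_number() would return exactly one row per day, picked arbitrarily among the tied rows unless more sort keys are added.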
Houari

You can still use DISTINCT ON to get this done at a single query level, without a subquery, because DISTINCT is applied after GROUP BY and aggregate functions (and after window functions):

SELECT DISTINCT ON (3)
       sum(d.cantidad) AS suma
     , d.producto_id AS producto
     , to_char(o.fecha AT TIME ZONE 'MST', 'DY') AS dia
FROM   detalle_orden d
LEFT   JOIN orden o ON o.id = d.order_id
GROUP  BY o.fecha, d.producto_id 
ORDER  BY 3, 1 DESC NULLS LAST, d.producto_id;

Notes

  • This solution returns exactly one row per dia (if available). If multiple products tie for top sales, my arbitrary (but deterministic and reproducible) pick is the one with the smaller producto_id.
    If you need all peers tying for one day, use rank() as suggested by @Houari.

  • The sequence of events in an SQL SELECT query is explained in this related answer:

  • date_trunc() was just noise in the calculation of dia. I removed it.

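  • I added NULLS LAST to the descending sort order since it is unclear whether there might be rows with NULL for suma in the result. By default, NULL values sort first in descending order in PostgreSQL, so a NULL would otherwise be picked ahead of any actual sum.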

  • The numbers in DISTINCT ON and ORDER BY are just a syntactical shorthand notation for convenience: they refer to output columns by position. The added table aliases are likewise just shorthand.

  • Basics for DISTINCT ON
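
The tie-breaking and positional references described in the notes above can be seen in a small self-contained sketch. The CTE name ventas and its rows are hypothetical: they are the question's sample output with the second TUE row changed to 5 to create a tie.

```sql
-- Hypothetical stand-in for the aggregated result; (5, 2, 'TUE') is altered to force a tie.
WITH ventas (suma, producto, dia) AS (
   VALUES (4, 1, 'FRI'), (1, 2, 'FRI'), (5, 3, 'TUE'), (5, 2, 'TUE')
)
SELECT DISTINCT ON (3)                       -- 3 = third output column (dia)
       suma, producto, dia
FROM   ventas
ORDER  BY 3, 1 DESC NULLS LAST, producto;    -- same as: dia, suma DESC NULLS LAST, producto
-- suma | producto | dia
--    4 |        1 | FRI
--    5 |        2 | TUE
```

Products 2 and 3 tie on TUE; the trailing producto sort key makes the smaller id win, so the result stays the same from run to run.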

Community
Erwin Brandstetter