1

For example I have a Product table with count, but I would like to display only the values of the top 3 products based on the sum of their count.

Product Date Value
Product 1 2022-12-01 200
Product 1 2022-12-02 200
Product 2 2022-12-01 200
Product 2 2022-12-03 500
Product 3 2022-12-04 300
Product 3 2022-12-08 600
Product 4 2022-12-01 100
Product 4 2022-12-03 100
Product 5 2022-12-01 700
Product 5 2022-12-10 800

Based on the sample above, the sum of each product would be: Product 1 - 400 Product 2 - 700 Product 3 - 900 Product 4 - 200 Product 5 - 1,500

And I would like to display only the values of the top 3 products (Products 5, 3, and 2).

Product Date Value
Product 2 2022-12-01 200
Product 2 2022-12-03 500
Product 3 2022-12-04 300
Product 3 2022-12-08 600
Product 5 2022-12-01 700
Product 5 2022-12-10 800

I used to check first the product with the highest sum of count so I could use the result as a filter on my table. But I'd like to use 1 SQL query only instead of running 2 separate queries.

SELECT product, count(value) as prod_count
FROM product
GROUP BY product
ORDER BY prod_count
LIMIT 3
Jonas Metzler
  • 4,517
  • 1
  • 5
  • 17
autodan
  • 15
  • 3

2 Answers2

1

You could use DENSE_RANK function within an aggregated query joined to your table as the following:

Select P.Product, P.Date, P.Value
From Product P Join
(
  Select Product,
         DENSE_RANK() Over (Order By Sum(Value) Desc) rn
  From Product
  Group By Product
) T
On P.Product = T.Product
Where T.rn <= 3
Order By P.Product, P.Date

See a demo.

ahmed
  • 9,071
  • 3
  • 9
  • 22
0

We can SUM (instead of COUNT) the value and GROUP BY the product.

Here we can use FETCH FIRST 3 ROWS WITH TIES to find for example two products having the identic 3rd highest sum.

So the entire query will be this one:

SELECT product, date, value 
FROM product 
WHERE product IN 
(SELECT product
FROM product
GROUP BY product
ORDER BY SUM(value) DESC 
FETCH FIRST 3 ROWS WITH TIES)
ORDER BY product, date;

We should mention the column/table naming should be improved if possible because having the same table name and column name "product" causes bad readability.

Furthermore the column "date" (wich is actually a SQL key word) should better be renamed to something more meaningful like for example "sellDate", same for the column "value".

Anyway, let's assume there is another product "product 6" which has the same sumed value (700) as product 2.

Then the above query will produce this outcome:

Product Date Value
Product 2 2022-12-01 200
Product 2 2022-12-03 500
Product 3 2022-12-04 300
Product 3 2022-12-08 600
Product 5 2022-12-01 700
Product 5 2022-12-10 800
Product 6 2022-12-01 600
Product 6 2022-12-10 100

If it's not intended to show four products (or more if more have the same 3rd highest sum of value), we can just use LIMIT 3 instead:

SELECT product, date, value 
FROM product 
WHERE product IN 
(SELECT product
FROM product
GROUP BY product
ORDER BY SUM(value) DESC 
LIMIT 3)
ORDER BY product, date;

So we will get only three products again, one of those having the sumed value 700 (here product 2) will not be selected.

So the result of this query would be this:

Product Date Value
Product 3 2022-12-04 300
Product 3 2022-12-08 600
Product 5 2022-12-01 700
Product 5 2022-12-10 800
Product 6 2022-12-01 600
Product 6 2022-12-10 100

Or if we even want to say product 2 should be found instead of product 6, we can add the product to the ORDER BY clause:

SELECT product, date, value 
FROM product 
WHERE product IN 
(SELECT product
FROM product
GROUP BY product
ORDER BY SUM(value) DESC, product 
LIMIT 3)
ORDER BY product, date;

This will be the outcome for this query:

Product Date Value
Product 2 2022-12-01 200
Product 2 2022-12-03 500
Product 3 2022-12-04 300
Product 3 2022-12-08 600
Product 5 2022-12-01 700
Product 5 2022-12-10 800

We can try out here: db<>fiddle

Jonas Metzler
  • 4,517
  • 1
  • 5
  • 17
  • Hi, Jonas. Thank you for answering! Apologies for the bad example, but the table name and field names are actually just made-up and not the real names in our existing DB. I have tried the scripts you provided, however, I encountered errors. It might not be working well for PostgreSQL? FETCH FIRST N ROWS WITH TIES - ERROR: syntax error at or near "WITH" LIMIT N - ERROR: aggregate functions are not allowed in WHERE – autodan Dec 10 '22 at 22:36
  • It is working for Postgres, we can verify that using the fiddle link at the end of my answer. Old Postgres DB don't support this, maybe you are using a too old version? In this case, I recommend an update. Keep using such an old version is no good idea in my opinion, but anyway, here you can read more about this and a possible "workaround" for older versions: https://stackoverflow.com/questions/71074473/equivalent-for-fetch-first-with-ties-in-postgres-11-with-comparable-performance – Jonas Metzler Dec 10 '22 at 22:41
  • Hi, Jonas. The "limit" script already works. There was a mistake on my initial script. Thank you so much for the help! – autodan Dec 10 '22 at 22:46
  • You're welcome. As a last note, you could also use DENSE_RANK or ROW_NUMBER if you want to build up the fetch ties in older versions. That's written in the link I posted in the comment above, but I would like to mention this here once again because I guess the other answer will succeed to exactly do this. – Jonas Metzler Dec 10 '22 at 22:49