Is there a way in Postgresql to write a query which groups rows based on a column with a limit without discarding additional rows.
Say I've got a table with three columns id, color, score
with the following rows
1 red 10.0
2 red 7.0
3 red 3.0
4 blue 5.0
5 green 4.0
6 blue 2.0
7 blue 1.0
I can get a grouping based on color with window functions with the following query
SELECT * FROM (
SELECT id, color, score, rank()
OVER (PARTITION BY color ORDER BY score DESC)
FROM grouping_test
) AS foo WHERE rank <= 2;
with the result
id | color | score | rank
----+-------+-------+------
4 | blue | 5.0 | 1
6 | blue | 2.0 | 2
5 | green | 4.0 | 1
1 | red | 10.0 | 1
2 | red | 7.0 | 2
which discards item with ranks > 2. However what I need is a result like
1 red 10.0
2 red 7.0
4 blue 5.0
6 blue 2.0
5 green 4.0
3 red 3.0
7 blue 1.0
With no discarded rows.
Edit: To be more precise about the logic I need:
- Get me the row with the highest score
- The next row with the same color and the highest possible score
- The item with the highest score of the remaining items
- Same as 2., but for the row from 3.
...
Continue as long as pairs with the same color can be found, then order whats left by descending score.
The import statements for a test table can be found here. Thanks for your help.