7

I have a database query like:

SELECT 
  Foo,
  Foo2,
  some_calc as Bar,
  some_other_calc as Bar2,
From
 FooBar
-- some inner joins for the calcs
GROUP BY FOO
ORDER BY Bar DESC, Bar2 DESC;

I want to order by database with the order query, and then group together FOOs so that that first grouped block contains the FOO with the greatest Bar. The second grouped block of FOOs contains the seconds highest Bar, etc.

But this doesn't work as Postgres doesn't allow random grouping:

column "Bar" must appear in the GROUP BY clause or be used in an aggregate function.

How can I fix this?

Sample data and output:

╔═════╦══════════╦════╦════╦
║ FO  ║ Bar      ║  Bar 2  ║
╠═════╬══════════╬═════════╬
║  6  ║     10   ║         ║
║  4  ║     110  ║         ║
║  3  ║     120  ║         ║
║  8  ║     140  ║         ║
║  3  ║     180  ║         ║
║  3  ║     190  ║         ║
╚═════╩══════════╩════╩════╩

Output:

╔═════╦══════════╦════╦════╦
║ FO  ║ Bar      ║  Bar 2  ║
╠═════╬══════════╬═════════╬
║  3  ║     190  ║         ║
║  3  ║     180  ║         ║
║  3  ║     120  ║         ║
║  8  ║     140  ║         ║
║  4  ║     110  ║         ║
║  6  ║     10   ║         ║
╚═════╩══════════╩════╩════╩
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
DaynaJuliana
  • 1,144
  • 1
  • 14
  • 33

3 Answers3

6
SELECT foo, <some calc> AS bar, bar2
FROM   foobar
ORDER  BY max(<some calc>) OVER (PARTITION BY foo) DESC NULLS LAST  -- can't refer to bar
        , bar DESC NULLS LAST  -- but you can here
        , foo DESC NULLS LAST;

bar does not have to be a column, can be any valid expression, even an aggregate function (in combination with GROUP BY) - just not another window function, which can't be nested. Example:

You cannot, however, refer to a column alias (output column name) on the same query level within a window function. You have to spell out the expression again, or move the calculation to a subquery or CTE.
You can refer to output column names in ORDER BY and GROUP BY otherwise (but not in the WHERE or HAVING clause). Explanation:

Since it has not been defined we must expect NULL values. Typically you want NULL values last, so add NULLS LAST in descending order. See:

Assuming you want bigger foo first in case of ties with bar.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • I still get `column bar does not exist` – DaynaJuliana Apr 19 '16 at 01:32
  • 1
    If `bar` refers to an expression, you have to spell out the expression in the window function and cannot refer to a column alias of the same expression in the `SELECT` list. You can refer to the output column name (alias) in `ORDER BY` and `GROUP BY` otherwise (but not in the `WHERE` or `HAVING` clause) . Detailed explanation [here](http://stackoverflow.com/a/8119815/939860) and [here](http://stackoverflow.com/a/19849537/939860). – Erwin Brandstetter Apr 19 '16 at 01:40
  • Thanks, yes bar is an expression on two columns – DaynaJuliana Apr 19 '16 at 01:42
  • @ErwinBrandstetter could you please give me some insights on https://stackoverflow.com/questions/59818667/improve-postgressql-aggregation-query-performance – Ankur Mahajan Jan 31 '20 at 04:29
4

You just want order by. Group by reduces (in general) the number of rows by aggregation.

You can accomplish this using window functions:

SELECT Foo, Bar, Bar2,
From FooBar
ORDER BY MAX(Bar) OVER (PARTITION BY Foo) DESC,
         Foo;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Bar is really `some_calc` as bar via an inner join. This only works with columns i think – DaynaJuliana Apr 18 '16 at 22:06
  • I think this is close. May need to sort by window desc, ***bar desc,*** foo... I say this because the window bar will always be max value for each foo, and thus you also need the sort by bar. and as to @daynaJuliana if there's inner calculations you could set it all up as a CTE or subselect. – xQbert Apr 18 '16 at 23:23
1

Shouldn't this do what you're asking for?

If not, it would help if you could give some sample data in the query and show how you'd want it as the output.

SELECT 
  Foo,
  MAX(some_calc) as Bar,
  MAX(some_other_calc) as Bar2,
From
 FooBar
##some inner joins for the calcs
GROUP BY FOO;
Robins Tharakan
  • 2,209
  • 19
  • 17