@Gordon's answer is almost but not quite correct. There is a subtle difference between max(b.z)
and ORDER BY b.z DESC
if the column can be NULL
. Often overlooked and cause for much headache.
This can bite you, even with a column b.z
defined NOT NULL
. An outer join can introduce NULL
values for the column. Wouldn't be effective in this simple query, but can be in more complex queries.
While aggregate functions like min
, max
or count
ignore NULL
values, those have to be dealt with somehow when the same column is used in the ORDER BY
clause.
In default sort order NULL
values sort last. When the sort order is reversed with DESC
, NULL
values sort first. It must be that way to stay consistent.
Hence, you get a row with baz IS NULL
from DISTINCT ON
if any NULL
value is in the set of peers, where you would get the greatest not-null value (if any) from max(b.z)
. Most probably not what you want. You can fix it with NULLS LAST
:
Plus, while using SELECT *
, you would not want to return the join column baz
twice - guaranteed to be identical. The USING
clause comes in handy, it only returns the column once.
So:
SELECT DISTINCT ON (baz) *
FROM foo f
JOIN bar b USING (baz)
WHERE uid1 = 120 -- where from?
ORDER BY baz, b.z DESC NULLS LAST;
NULLS LAST
won't hurt, even if there are no NULL
values - unless you actually want NULL
values to prevail.
Detailed explanation for DISTINCT ON
:
Faster
Since uid1
is from baz
(as commented), this query is typically faster for big tables:
SELECT *
FROM foo f
JOIN (
SELECT DISTINCT ON (baz) *
FROM bar
WHERE uid1 = 120
ORDER BY baz, z DESC NULLS LAST
) b USING (baz);
Depending on table definition and data distribution, there may be even faster query techniques: