1

I do something like this every single day - it would be one of my most common queries:

select * from 
   (select columns, rownumber() over (partition by something) as number from whatever) x
where x = 1

Which is horrible and seems like there's a lot of extra stuff that shouldn't be needed - a subselect, an alias for the sub select, another select statement - and at best you end up with an extra column in your output - at worst you have to re-list a bunch of columns.

Obviously, what I want to do is not valid SQL because of the order of operations:

select * from whatever where rownumber() over (partition by something) = 1

And while this looks like it could work, it actually gives a very wrong answer:

select * from whatever,
    lateral (select rownumber() over (partition by something) as number)
where number = 1

first_value() seems like it should get you there - but you still end up with lots of rows:

select first_value(key) over (partition by something), columns from whatever

But I keep thinking that there's a way of achieving it and wasting a lot of time trying.

So, the question, is it possible to achieve it without using a subselect?

Essentially, I want only one of every item matching a single condition, eg., the first item of every order each month, the biggest item in each set, a single test example of each of these different cases etc.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Darren Oakey
  • 2,894
  • 3
  • 29
  • 55

1 Answers1

2

A textbook example for DISTINCT ON!

SELECT DISTINCT ON (something) columns FROM whatever;

Without ORDER BY, you get an arbitrary pick from each set with the same something. Add ORDER BY something, something_else to decide which row to pick.

See (with ample details):

And it's faster, too.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228