DISTINCT ON, but get the second row, not the first

Question

I'm using PostgreSQL.
I want to select 1 row for every unique id, I have this query:

SELECT DISTINCT ON (station_id) *
FROM readings
ORDER BY station_id,reading_id DESC;

I have 2 tables:

readings (content isn't important, I hope):

station_id  reading_id    temp   air_pressure
--------------------------------------------
  147          100         10       800
  148          101         20       850
  149          102         30       900
  148          103         40       950
  148          104         50       1000
  147          105         60       1050

stations (content not important I hope):

id   station_name   lat   lng
-----------------------------
147      
148     
149

So readings.station_id and station.id in stations should be the same.

My query works, but now I want to select the second newest one (2nd highest id): ignore the first row (newest reading) and select the second.

Desired result based on above sample:

station_id   reading_id   temp   air_pressure
----------------------------------------------
   148          103        40        950        
   147          100        10        800

I want the row with the second highest reading_id for every station_id (second newest reading).

Would be nice if it won't select the ones where there is only one row, but not necessary.

I'm not sure how to achieve this. I've tried so much with offset, limit etc. also tried some nested select queries ...

I want to use distinct on because I need to select second row from every unique id. But if it's not possible with distinct on, it's alright.

With `DISTINCT ON(station_id)` there will be no second row for `station_id`. To your question add example data showing what you are starting with and what you want to end up with? — Adrian Klaver, Jun 29 '22 at 23:45
There are (very) efficient solutions. Is there a table `station` with one row per relevant `station_id`, too? Also: your version of Postgres, and the exact table definition (`CREATE TABLE` statement)? And roughly how many rows per `station_id`? And what if there is just a single row per `station_id`? — Erwin Brandstetter, Jun 29 '22 at 23:59

Erwin Brandstetter · Accepted Answer · 2022-06-30T23:21:47.707

The best query heavily depends on undisclosed information.

Assuming:

current Postgres 14 (all should work since at least Postgres 9.3)
large table (?)
many rows per station_id (?)
no duplicates on (station_id, reading_id) (?)
a stations table with 1 row per station ✅
reading_idis defined NOT NULL. (?)

SELECT r.*
FROM   station s
CROSS  JOIN LATERAL (
   SELECT *
   FROM   readings r
   WHERE  r.station_id = s.station_id
   ORDER  BY reading_id DESC
   OFFSET 1
   LIMIT  1
   ) r
ORDER  BY r.station_id;  -- optional

db<>fiddle here

You get no row for stations with less than two rows.

This is very efficient for more than a few rows per station and an index on readings(station_id, reading_id DESC).

About the LATERAL join:

What is the difference between LATERAL JOIN and a subquery in PostgreSQL?

If you don't have a separate stations table, see:

How to get the second highest value per group?

If there can be duplicates (doesn't seem to be the case), and/or there are only few rows per group, this might be better:

SELECT (r).*
FROM  (
   SELECT r, dense_rank() OVER (PARTITION BY station_id
                                ORDER BY reading_id DESC) AS rnk
   FROM   readings r
   ) r
WHERE  rnk = 2;

Here, you get no row for stations with less than two distinct reading_id. But possibly more than one per station that tie for second place.

If I didn't guess right, update your question to be more specific. — Erwin Brandstetter, Jun 30 '22 at 00:26
So, i updated my question a bit. Now your code doesn't work for me, probably because im not sure what is this "r" after cross join literal ()? Shouldn't be there just nothing after )?Because there is the select statement? Im not sure, would be nice if you could updated it to my tables now. — Ovelion, Jun 30 '22 at 15:51

DISTINCT ON, but get the second row, not the first

1 Answers1

Linked