I have a timeseries table in Postgres that collects events of various types, with their values and a timestamp. In stylized form it looks like this:
evt_type | evt_val_01 | evt_val_02 | evt_time |
---|---|---|---|
1 | 0.5 | 10 | 1 |
2 | 0.7 | 12 | 1 |
3 | 0.8 | 13 | 1 |
2 | 0.1 | 21 | 2 |
2 | 0.3 | 98 | 3 |
3 | 0.4 | 76 | 3 |
2 | 0.2 | 3 | 4 |
I'd like now to create a SELECT query that returns the latest (by timestamp in evt_time) values per event type (evt_typ), i.e., the query should return:
evt_type | evt_val | evt_val_02 | evt_time |
---|---|---|---|
1 | 0.5 | 10 | 1 |
2 | 0.2 | 3 | 4 |
3 | 0.4 | 76 | 3 |
In QuestDB for instance, this is easy to be achieved by the LATEST ON parameter. However, I am struggling to find an efficient Postgres approach that makes full use of index on evt_time and evt_type. The trivial approach using
select evt_type, max(evt_time) from events group by evt_type
In a sub-query, is way too slow, as the real-life table has double-digit million rows and about a thousand event types, plus about a dozen value columns. I tried to use various forms of last_value()
, and LATERAL
joins, but did not manage to get it right thus far.
Any suggestions how to structure the SELECT query would be greatly appreciated. PS, I'm on Postgres 15.1