I have a table in the following form:
index, ingestion_id, a, b, c, d
0, '2020-04-22-1600', 0a, 0b, 0c, 0d
1, '2020-04-22-1700', 0a, 0b, 0c, 0d
2, '2020-04-22-1600', 1a, 1b, 1c, 1d
3, '2020-04-22-1700', 1a, 1b, 1c, 1d
4, '2020-04-22-1800', 1a, 1b, 1c, 1d
...
I would like extract all the rows and columns where the ingestion_id
is the highest. Thus it should return index 1 and index 4 for all rows and columns.
I found some examples, but they require that we pre-define the columns that we want to select. I don't know the columns in advance, but I do know that the table will have a column named ingestion_id
. Here is an example:
SELECT *
FROM (
SELECT MAX(ingestion_id) as ingestion_id, a, b, c, d
FROM table as t
GROUP BY a, b, c, d
ORDER BY a
)
How can I select all columns where the ingestion_id
is the highest and group by all columns except for the ingestion_id
?
BONUS
Imagine the table now having the form:
index, ingestion_id, a, b, c, d
0, '2020-04-22-1600', 0a, 0b, 0c, 0d
1, '2020-04-22-1700', 0a, 0b, 0c, 0d
2, '2020-04-22-1600', 1a, 1b, 1c, 1d
3, '2020-04-22-1700', 1a, 1b, 1c, 1d
4, '2020-04-26-1800', 2a, 2b, 2c, 2d
5, '2020-04-26-1900', 2a, 2b, 2c, 2d
...
The answer provided by Gordon Linoff (as of 2020/04/26) will in this case only filter out row 5 as its the highest ingestion_id
. We also need however row 1 and row 3 as the values (except for the column ingestion_id
) are unique in the other columns.