I'm working with a table such as this one:
create table example_table (ID Number(8), Year Number(8), Data Varchar(4));
insert into example_table
(ID,Year,Data)
(
select 1,2008,1 from dual union
select 1,2010,2 from dual union
select 2,2014,3 from dual union
select 2,2020,4 from dual union
select 2,2009,5 from dual union
select 3,2003,6 from dual union
select 4,2015,7 from dual union
select 4,2014,8 from dual);
select * from example_table;
ID | Year | Data |
---|---|---|
1 | 2008 | 1 |
1 | 2010 | 2 |
2 | 2014 | 3 |
2 | 2020 | 4 |
2 | 2009 | 5 |
3 | 2003 | 6 |
4 | 2015 | 7 |
4 | 2014 | 8 |
In this case, together the ID and Year columns form a unique identifier for a specific row. I only want to pull rows where the Year is maximized for a given ID, such as
Id | Year | Data |
---|---|---|
1 | 2010 | 2 |
2 | 2020 | 4 |
3 | 2003 | 6 |
4 | 2015 | 7 |
I know I can do this with a correlated subquery such as
select distinct
ID, Year, Data
from example_table a
where Year =
(select max(year)
from example_table b
where a.id = b.id);
Or I can do it with a common table expression:
with tmp(ID,Year)
as (
select distinct
ID,
max(year)
over (partition by ID)
from example_table)
select distinct
ID, Year, Data
from example_table
inner join tmp
on example_table.ID = tmp.ID
and example_table.year = tmp.year;
I could technically also do it by creating another table/view, but I don't have perms on my database to do that. Anyways, this is a common task I'm having to do within my scripts and I want to reduce the number of queries I generate. Is there any way to do this with only one query?
I tried using a HAVING
statement like:
select example_table.ID,
max(example_table.YEAR),
example_table.DATA
from example_table
group by ID, DATA
having example_table.YEAR = max(example_table.YEAR);
but that doesn't seem to work because HAVING
statements only function on groups, and I want to manipulate elements within the group.
Is there a way to do this with only one query, and if not, what's the most clear way to do it with two queries?