I am having a little trouble with a SQL query. I have a table to stores results against a date. I'd like to select all distinct results with the maximum (most recent) date. I can make this work with:
select distinct(r.data), max(c.committed_at) as timestamp
from results r
inner join commits c on (r.commit_id=c.id)
where r.repository_id=65 AND data_type='data_model'
group by data
order by timestamp DESC
This is good and all works.
However, I want to be able to select a third column, which is a hash. This hash is unique to every result row.
But when I add it in to the SELECT clause, I of course have to add it to my GROUP BY too.
select distinct(r.data), max(c.committed_at) as timestamp, c.hash
from results r
inner join commits c on (r.commit_id=c.id)
where r.repository_id=65 AND data_type='data_model'
group by data, c.hash
order by timestamp DESC
And this throws the query out. I no longer get distinct results (of course).
I tried an aggregate function on hash
:
select distinct(r.data), max(c.long_hash), max(c.committed_at) as timestamp
from results r
inner join commits c on (r.commit_id=c.id)
where r.repository_id=65 AND data_type='data_model'
group by data
order by timestamp DESC
But then I don't get the maximum time, I get the maximum hash.
I feel like I just want to be able to say "find the distinct results with the maximum timestamp and then just include the hash in the result", but I can't figure out how...
Cheers, Ben
Edit: my attempt at this with DISTINCT ON
is:
select distinct on (data) data, c.committed_at as committed_at, c.long_hash
from results r
inner join commits c on (r.commit_id=c.id)
where r.repository_id=65 AND data_type='complexity'
order by data, c.committed_at DESC
But this requires an ORDER BY on data
, which again, throws out the ordering of my results...
Edit 2:
Executing this query gives the following result:
'{"status":"supported","per_method":"10.2","total":"815.2","data"<snip>...}','2017-01-04 13:25:51','4a44bccca804c28c6a5e61b36b5ebcb716d4c11f'
'{"status":"supported","per_method":"8.8","total":"649.3","data":<snip>...}','2017-01-02 23:35:11','d747e657a81c5c6da4262a5298c3071082b2af41'
'{"status":"supported","per_method":"10.7","total":"944.3","data":<snip>...}','2017-01-08 17:28:57','ff4be5fa6dc88237e7855ed1b534baee69aa8800'
As you can see, the data
column is in order (according the JSONB column rules), whereas the timestamp
column is not (it goes from Jan 4th, to Jan 2nd, to Jan 8th).