17

I'm looking to select just the latest records of a table based on date, but only one one Distinct listing of each of the urls. The table structure is like this;

ID        URL          DateVisited
1         google.com   01-01-2016
2         yahoo.com    01-02-2016
3         google.com   12-30-2015
4         google.com   02-01-2016

So for my result set I would want

google.com     02-01-2016
yahoo.com      01-02-2016

I will have a couple more conditionals in my actual query, but just want to get the single latest records in a hit log, rather than list of distinct urls and dates, just distinct url's and the latest date.

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
Nick G
  • 1,209
  • 8
  • 34
  • 58
  • Possible duplicate of [Fetch the row which has the Max value for a column](http://stackoverflow.com/questions/121387/fetch-the-row-which-has-the-max-value-for-a-column) – Tab Alleman Jul 21 '16 at 13:32

2 Answers2

31

This is actually pretty easy to do using simple aggregation, like so:

select URL, max(DateVisited)
from <table>
group by URL
Chitharanjan Das
  • 1,283
  • 10
  • 15
  • Worked great, I had done this, but was including the dtVisited in my group by clause giving me unwanted results, I knew it was easier than I was making it out to be! – Nick G Jul 21 '16 at 13:47
  • Worked great! Thanks! – krisDrOid Sep 24 '20 at 21:19
  • How can this be done if you also wanted the PKID for the returned rows? Is there an equally slick method? - or do you have to muck about with row_numbers and partitions as shown in Gordon Linoff's answer? – Geo... Apr 01 '21 at 20:28
  • @Geo... The trouble with returning ids, is that a URL might have multiple ids. Look at the sample data in the question. The URL `google.com` appears with 3 different IDs: 1, 3 and 4. If you're only interested in one of these, say the minimum or the maximum, you could tack on another aggregate function to the query, like `min(id)` or `max(id)`. Some SQL databases even support aggregate functions that will return arrays and comma-separated lists of all ids. Check out PostgreSQL's `array_agg()` and `string_agg()`, for instance, or MySQL's `group_concat()`. Use `count()` to count the number of ids – Chitharanjan Das Apr 06 '21 at 08:41
8

This is usually done using row_number():

select t.*
from (select t.*,
             row_number() over (partition by url order by datevisited desc) as seqnum
      from t
     ) t
where seqnum = 1;

This allows you to get all the columns associated with the latest record.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786