2

I should be able to do this but the brain appears to have melted down

Database is postgres

Table structure is simple, just four columns that matter:

Location, User, Activity, DateTime

I need to sort by location and output just the latest occurrence of each activity that has taken place. I don't know in advance what the activities are

Something like

Location    User    Activity   DateTime
London      Fred    A          08-29-2012
London      Fred    B          08-27-2012
Paris       John    A          08-29-2012
Tokyo       Fred    A          08-17-2012
Tokyo       Jane    D          08-29-2012

Thanks Folks

Eric T
  • 220
  • 1
  • 4
  • 10
  • 2
    Latest occurrence of an activity overall? Per location? Per user? Is that sample starting data, or desired results? – Clockwork-Muse Aug 29 '12 at 23:08
  • What does "sort by location" mean? – Ariel Aug 29 '12 at 23:08
  • What happens if two events happen at the same exact time? i.e. is DateTime unique? – Ariel Aug 29 '12 at 23:09
  • Wow... So many replies in such a short time! – Eric T Aug 29 '12 at 23:19
  • @EricT Most replies come immediately (or not at all). You should always leave the question open so you can reply to questions immediately, otherwise people leave and you never get good replies. – Ariel Aug 29 '12 at 23:20
  • OK Sort by location - just that - order by on the location field. Latest occurrence is of each activity recorded at a given location. Two events at the same time? very unlikely - happy to leave that to LIMIT 1 or equiv – Eric T Aug 29 '12 at 23:22
  • 1
    @EricT Sort by location before or after grouping by activity? LIMIT 1 will not work, it's MUCH MUCH more complicated than that! With all the solutions that were posted here, if you have two identical times you will get both rows. – Ariel Aug 29 '12 at 23:27
  • @Arial OK, I can live with the line duplication - I'll be running this in a python script to convert the result to JSON in response to an ajax call, So I could tidy there easily – Eric T Aug 29 '12 at 23:43

4 Answers4

4

Fastest way in PostgreSQL is probably with DISTINCT ON:

SELECT DISTINCT ON (location, activity)
       location, activity, datetime, usr
FROM   tbl
ORDER  BY location, activity, datetime DESC, usr; -- usr only to break ties

Short form with positional parameters:

SELECT DISTINCT ON (1 ,2)
       location, activity, datetime, usr
FROM   tbl
ORDER  BY 1, 2, 3 DESC, 4;

This combines sorting and reducing to distinct rows in one operation. More details, explanation and benchmark in this related answer.

user is a reserved word. Don't actually use it as column name. I substituted with usr.

If performance should be crucial an index like the following will make the difference:

CREATE INDEX tbl_multi_idx ON tbl (location, activity, datetime DESC, usr);
Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • 1
    That certainly works well, avoids the duplication but is noticeably slower to run than @arial 's approach. Elegant simple solution - As speed is not critical, I reckon that's the one I'll use – Eric T Aug 29 '12 at 23:56
  • @EricT: That's interesting, thanks. Is it still slower after you create the multi-column index I added to my answer? (Should benefit both queries.) I understand performance isn't crucial for you, but I would be very interested. Would you mind leaving a line or two with rough numbers on characteristics of your table and performance of the two queries? With and without the index? Maybe as a separate answer instead of a comment for better readability. Drop a comment here, so I get notified if you do ... – Erwin Brandstetter Aug 30 '12 at 01:04
  • Sorry @Erwin Brandstetter I'd love to help but I'm running this on a limited access appliance managed by a third party - I only have read access so can't add an index. It's working like a charm though and has saved one of our staff hours of work. – Eric T Sep 04 '12 at 05:38
2
SELECT * FROM table JOIN (
  SELECT Activity, Location, Max(DateTime) DateTime FROM table
  GROUP BY Activity, Location
  ) m USING (Activity, Location, DateTime)

Best I can answer without fully understanding your question.

Ariel
  • 25,995
  • 5
  • 59
  • 69
0
SELECT
   Location,
   User
   Activity
   DateTime
From
   myTable a
   INNER JOIN
   (SELECT Location, User, Max(DateTime) as DateTime FROM myTable GROUP BY Location, User) b
   ON a.Location = b.Location AND a.User = b.User AND a.DateTime = b.DateTime
Chains
  • 12,541
  • 8
  • 45
  • 62
0
select t1.location, t1.user, m.activity, m.datetime 
from table1 t1 inner join
(select activity, max(datetime) datetime from table1 group by activity) m
on t1.activity = m.activity and t1.datetime = m.datetime
order by t1.location asc
Nathan
  • 2,705
  • 23
  • 28