0

I have a table of log entries that have user_id and datetime. I'd like to make a query to fetch the most recent of each log entry by user_id. I can't seem to figure out how to do that...

The SQL for the query would be something like this:

SELECT * 
FROM table a 
JOIN (SELECT user_id, max(datetime) maxDate
        FROM table
      GROUP BY user_id) b
ON a.user_id = b.user_id AND a.datetime = b.maxDate

Right now I'm doing it with a raw query, but I'd like to use the ORM's methods.

Tim Tisdall
  • 9,914
  • 3
  • 52
  • 82

3 Answers3

1

I suppose this should do

    Table.objects.order_by('-user_id').distinct('user_id')

See distinct() in this -> https://docs.djangoproject.com/en/1.9/ref/models/querysets/

But this will only work if the latest log entry for that user is the last entry by that user in the table, i.e., the log entries of a particular user as sorted in ascending way in the table.

Ankush Raghuvanshi
  • 1,392
  • 11
  • 17
  • This gives me the solution I'm looking for (provided I'm using Postgres, which I am), but I was looking more for a way to replicate the given SQL via the ORM to better understand how to do that in the ORM. – Tim Tisdall Jul 18 '16 at 19:38
  • @TimTisdall, you can go through this for a better solution. :) See this -> http://stackoverflow.com/questions/629551/how-to-query-as-group-by-in-django – Ankush Raghuvanshi Jul 18 '16 at 19:41
  • I saw that one... Unfortunately the `.annotate` does a `GROUP BY` on all values listed. I'd like to group by only the `user_id` and `max(datetime)` but still get back the whole object. So, the result would be just the inner subquery but not the join. – Tim Tisdall Jul 18 '16 at 19:45
  • 1
    Yea, i tried to modify the solution in the link i specified according to what you wanted but was facing the same issue that you just mentioned. Hence had to go for distinct(). – Ankush Raghuvanshi Jul 18 '16 at 20:23
  • I think you may need `datetime` in the `order_by` to make sure it's the most recent for each user. – Tim Tisdall Jul 19 '16 at 12:48
  • It may be required in case when the log entries for a particular user are not already sorted according to datetime, but as new entries in any db are added to the bottom of the table and not randomly in b/w hence, these are already sorted according to the datetime values of when they were created. Had the datetime value correspond to something else, say the date of purchase of some item or something like that, then definitely the order_by(datetime) would have been required. I saved it as it was not required in your case. :) – Ankush Raghuvanshi Jul 19 '16 at 12:59
0

You may try:

 User.objects.latest().id
rkatkam
  • 2,634
  • 3
  • 18
  • 29
0
Table.objects.order_by('user_id', '-datetime').distinct('user_id')

Add indexes to user_id and datetime (Meta.index_together = ['user_id', 'datetime']).

Vladimir Danilov
  • 2,338
  • 14
  • 15