10

I'm trying to filter users by date, but can't until I can find the first and last date of users in the db. While I can have my script filter out dups later on, I want to do it from the outset using Django's distinct since it significantly reduces. I tried

User.objects.values('install_time').distinct().order_by()

but since install_time is a timestamp, it includes the date AND time (which I don't really care about). As a result, the only ones it filters out are dates where we could retrieve multiple users' install dates but not times.

Any idea how to do this? I'm running this using Django 1.3.1, Postgres 9.0.5, and the latest version of psycopg2.

EDIT: I forgot to add the data type of install_time:

install_time = models.DateTimeField()

EDIT 2: Here's some sample output from the Postgres shell, along with a quick explanation of what I want:

 2011-09-19 00:00:00
 2011-09-11 00:00:00
 2011-09-11 00:00:00 <--filtered out by distinct() (same date and time)
 2011-10-13 06:38:37.576
 2011-10-13 00:00:00 <--NOT filtered out by distinct() (same date but different time)

I am aware of Manager.raw, but would rather user django.db.connection.cursor to write the query directly since Manager.raw returns a RawQuerySet which, IMO, is worse than just writing the SQL query manually and iterating.

Edwin
  • 2,074
  • 1
  • 21
  • 40

2 Answers2

8

When doing reports on larger datasets itertools.group_by might be too slow. In those cases I make postgres handle the grouping:

truncate_date = connection.ops.date_trunc_sql('day','timestamp')
qs = qs.extra({'date':truncate_date})
return qs.values('date').annotate(Sum('amount')).order_by('date')
tback
  • 11,138
  • 7
  • 47
  • 71
0

I've voted to close this since it's a dup of this question, so here's the answer if you don't want to visit the link, courtesy of nosklo.


Create a small function to extract just the date: def extract_date(entity): 'extracts the starting date from an entity' return entity.start_time.date()

Then you can use it with itertools.groupby:

from itertools import groupby

entities = Entity.objects.order_by('start_time')
for start_date, group in groupby(entities, key=extract_date):
    do_something_with(start_date, list(group))
Community
  • 1
  • 1
Edwin
  • 2,074
  • 1
  • 21
  • 40
  • It's still useful to have your question around since you couldn't find the answer with 30 mins searching google. You should accept this answer and let it be :) Next time somebody uses your language to search this problem, this Q will come up on google. – Yuji 'Tomita' Tomita Dec 21 '11 at 23:44
  • @YujiTomita That's why I'm closing, not deleting. I still have closed answers show up on Google. – Edwin Dec 22 '11 at 15:24