Filter models by checking one datetime's month against another

Question

Using Django's ORM, I am trying to find instances of myModel based on two of its datetime variables; specifically, where the months of these two datetimes are not equal. I understand to filter by the value of a modelfield, you can use Django's F( ) expressions, so I thought I'd try something like this:

myModel.objects.filter(fixed_date__month=F('closed_date__month'))

I know this wouldn't find instances where they aren't equal, but I thought it'd be a good first step since I've never used the F expressions before. However, it doesn't work as I thought it should. I expected it to give me a queryset of objects where the value of the fixed_date month was equal to the value of the closed_date month, but instead I get an error:

FieldError: Join on field 'closed_date' not permitted. Did you misspell 'month' for the lookup type?

I'm not sure if what I'm trying to do isn't possible or straightforward with the ORM, or if I'm just making a simple mistake.

Can you provide a small data set and then what your desired outcome of the query would be? — Scott Woodall, Nov 13 '13 at 17:57
I've tried similar queries in the shell, and it looks like django can't convert the 'dateObject__month' string in the F object into an sql query, hence the 'Join' error. Possibly best implemented as a loop with objects? I think the F object is still fairly new, so they may sort it out at some point. — Henry Florence, Nov 13 '13 at 18:48

Henry Florence · Accepted Answer · 2013-11-15T19:26:20.403

2

It doesn't look like django F objects currently support extracting the month inside a DateTimeField, the error message seems to be stating that the F object is trying to convert the '__' inside the string 'closed_date__month' as a Foreignkey between different objects, which are usually stored as joins inside an sql database.

You could carry out the same query by iterating across the objects:

result = []
for obj in myModel.objects.all():
    if obj.fixed_date.month != obj.closed_date.month:
        result.append(obj)

or as a list comprehension:

result = [obj for obj in myModel.objects.all() if obj.fixed_date.month != obj.closed_date.month]

Alternatively, if this is not efficient enough, the months for the two dates could be cached as IntegerFields within the model, something like:

 class myModel(models.Model):
     ....other fields....
     fixed_date = models.DateTimeField()
     closed_date = models.DateTimeField()
     fixed_month = models.IntegerField()
     closed_month = models.IntegerField()

store the two integers when the relevant dates are updated:

myModel.fixed_month = myModel.fixed_date.month
myModel.save()

Then use an F object to compare the two integer fields:

myModel.objects.filter(fixed_month__ne=F('closed_month'))

The ne modifier will do the not equal test.

Edit - using raw sql

If you are using an sql based database, then most efficient method is to use the .raw() method to manually specify the sql:

myModel.objects.raw('SELECT * FROM stuff_mymodel WHERE MONTH(fixed_date) != MONTH(close_date)')

Where 'stuff_mymodel' is the correct name of the table in the database. This uses the SQL MONTH() function to extract the values from the month fields, and compare their values. It will return a collection of objects.

There is some nay-saying about the django query system, for example: http://charlesleifer.com/blog/shortcomings-in-the-django-orm-and-a-look-at-peewee-a-lightweight-alternative/. This example could be taken as demonstrating another inconsistency in it's query api.

edited Nov 15 '13 at 19:26

answered Nov 13 '13 at 20:28

Henry Florence

2,848
19
16

It's better to have the months as a property and not as another integer column in the database – yuvi Nov 15 '13 at 19:02
A property of what? and in what way better? Do you mean more time or space efficient? – Henry Florence Nov 15 '13 at 19:05
A [property of the model](https://docs.djangoproject.com/en/dev/glossary/#term-property), it's a [pythonic way](http://docs.python.org/2.7/library/functions.html#property) to have methods that are managed as if they are a field. It's probably not more efficient than saving it to the db, but it definitly makes a lot more sense – yuvi Nov 15 '13 at 19:31
Also, instead of saving two additional fields for each month, it makes a lot more sense to have **one boolean field** that is auto-incremented according to the two saved fields and represents a difference between the two month (True if there is a difference and False if there isn't). I'm usually against such a solution but it would prove efficient (probably) – yuvi Nov 15 '13 at 19:50
Ah, I think it is very hard to say (probably ;) without seeing more about how the model is used. I offered the object based solution first, as this would be easiest to implement (one line of code), and the raw sql solution last, as the more efficient but most implementation dependent. Having said that, your answer is also very good, IMHO. The question does not ask for an alternative solution, merely why F() expression's don't behave as expected. – Henry Florence Nov 15 '13 at 19:54
To be precise, what resource is being wasted? – Henry Florence Nov 15 '13 at 20:12
You have two integer fields instead of one boolean field. And you're still working hard to compare them (instead of having one True\False field that can be filtered instantly) – yuvi Nov 15 '13 at 20:18
let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/41290/discussion-between-henry-florence-and-yuvi) – Henry Florence Nov 15 '13 at 20:19
ok, so sometimes data integrity (ensuring we don't have a invalid BooleanField that hasn't been updated correctly) comes at the expense of space efficiency (a couple more bytes in the database). It really depends on how the application... – Henry Florence Nov 15 '13 at 20:28

yuvi · Answer 2 · 2013-11-15T22:56:55.820

1

My thinking is this:

class myModel(models.Model):
    fixed_date = models.DateTimeField()
    closed_date = models.DateTimeField()

    def has_diff_months(self):
        if self.fixed_date.month != self.closed_date.month:
            return True
        return False

Then:

[x for x in myModel.objects.all() if x.has_diff_months()]

However, for a truly efficient solution you'd have to use another column. It makes sense to me that it'd be a computed boolean field that is created when you save, like so:

class myModel(models.Model):
    fixed_date = models.DateTimeField()
    closed_date = models.DateTimeField()
    diff_months = models.BooleanField()

    #overriding save method
    def save(self, *args, **kwargs):
        #calculating the value for diff_months
        self.diff_months = (self.fixed_date.month != self.closed_date.month)
        #aaand... saving:              
        super(Blog, self).save(*args, **kwargs)

Then filtering would simply be:

myModel.objects.filter(diff_months=True)

edited Nov 15 '13 at 22:56

answered Nov 15 '13 at 19:43

yuvi

18,155
8
56
93

Might this not clutter the model with unnecessary application code? – Henry Florence Nov 15 '13 at 19:57
That really depends, but for re-usability and ease-of-filtering this is a good approach – yuvi Nov 15 '13 at 20:08
This won't work: http://stackoverflow.com/questions/1205375/filter-by-property. Do you mean to use a list comprehension or generator expression instead? – Henry Florence Nov 15 '13 at 20:48
You're right, that was a misconception on my part. Thank you for that, I will fix or delete my answer shortly – yuvi Nov 15 '13 at 22:00
@HenryFlorence I updated my answer, also edited in the suggestion I gave you for the boolean solution. – yuvi Nov 15 '13 at 22:12
Nice answer, loosing a couple of bytes in storage to allow a simple, expressive query and use of the `save()` method prevents creating invalid objects. You could use: `self.diff_months = (self.fixed_date.month != self.closed_date.month)` in the `save()` method. – Henry Florence Nov 15 '13 at 22:50
Thank you. Though to be frank I'm not a big fan of saving calculated data to the db. It always seems kinda wrong to me. Anyway, added your suggestion in there too – yuvi Nov 15 '13 at 22:54
Yeah, I think using raw sql to make up for defiances in the query interface is possibly the best fall back. – Henry Florence Nov 15 '13 at 23:00

Filter models by checking one datetime's month against another

2 Answers2