2

I have a problem when making a queryset using Q objects. I'm getting different results depending on how i order some Q conditions. I will simplify my models a little so as to describe my problem in a clean way.

class D(models.Model):
    one_attr = models.BooleanField()
    s_attr = models.ManyToManyField(S, through='DRelatedToS')
    d_to_p = models.ForeignKey(P)


class S(models.Model):
    other_attr = models.BooleanField()
    s_to_p = models.ForeignKey(P)


class DRelatedToS(models.Model):
    to_s = models.ForeignKey(S)
    to_d = models.ForeignKey(D)
    date = models.DateField()


class P(models.Model):
    the_last_attr = models.PositiveIntegerField()

Summary of the relations:

D <-- DRelatedToS --> S --> P
|                           ^
|                           |
-------->------->------>----^

With these models and relations, i get two different results depending on how i arrange Q conditions: First query, that gives one result

D.objects.filter(
    Q(one_attr=True, s_attr__s_to_p__the_last_attr=5)
    |
    Q(one_attr=False, d_to_p__the_last_attr=10)
)

Second query, giving another result, different from first query

D.objects.filter(
    Q(one_attr=False, d_to_p__the_last_attr=10)
    |
    Q(one_attr=True, s_attr__s_to_p__the_last_attr=5)
)

My question is: why is this happening? Is there any problem on how i am doing my query?

When i watch the SQL statements derived from these queries, i get two different statements: one that make a LEFT OUTER JOIN and a lot of INNER JOINs and the second that makes all INNER JOINs. The one that actually return what i want is the one that makes a LEFT OUTER JOIN. This make me feel that all my queries can return bad results depending on how i arrange its conditions. Is this a bug or i am doing anything (or everything) wrong?

Tomasz Jakub Rup
  • 10,502
  • 7
  • 48
  • 49
marianobianchi
  • 8,238
  • 1
  • 20
  • 24

3 Answers3

3

Different order should return equal result in your example.

Nevertheless I tested your code (using corrections I made in the question's codes) but can't generate the error you describe. Maybe you had introduced other errors and missed when simplified the code, could you post the sample data you used? (see my data below).

First your sample code is buggy I had edited your question to suggest following corrections to fix problems, simplify and improve it for tests, but I don't see the changes updated so I repeat here:

Correction 1: Model changes in a diff format:

3,4c6,10
<     s_attr = models.ManyToMany(S, through='DRelatedToS')
<     d_to_p = models.ForeignKey(P)
---
>     s_attr = models.ManyToManyField('S', through='DRelatedToS')
>     d_to_p = models.ForeignKey('P')
> 
>     def __unicode__(self):
>         return 'D:(%s,%s,%s)' % (self.id, self.one_attr, self.d_to_p.the_last_attr)
8,9c14
<     other_attr = models.BooleanField()
<     s_to_p = models.ForeignKey(P)
---
>     s_to_p = models.ForeignKey('P')
13d17
<     to_p = models.ForeignKey(P)
15c19
<     date = models.DateField()
---
>     to_s = models.ForeignKey(S)
19a24
> 

So after apply corrections models look like this:

class D(models.Model):
    one_attr = models.BooleanField()
    s_attr = models.ManyToManyField('S', through='DRelatedToS')
    d_to_p = models.ForeignKey('P')

    def __unicode__(self):
        return 'D:(%s,%s,%s)' % (self.id, self.one_attr, self.d_to_p.the_last_attr)


class S(models.Model):
    s_to_p = models.ForeignKey('P')


class DRelatedToS(models.Model):
    to_d = models.ForeignKey(D)
    to_s = models.ForeignKey(S)


class P(models.Model):
    the_last_attr = models.PositiveIntegerField()

Correction 2: Your lookup fields in queries are wrong (Fixed in answer).

Following is what I did:

  1. Create project and app named testso:

    django-admin.py startproject marianobianchi
    cd marianobianchi
    python manage.py startapp testso
    
  2. Add your models & adjust project settings (database settings, add testso to INSTALLED_APPS)

  3. Add sample data:

    mkdir testso/fixtures
    cat > testso/fixtures/initial_data.json
    [
        {"pk": 1, "model": "testso.d", "fields": {"one_attr": true, "d_to_p": 3}},
        {"pk": 2, "model": "testso.d", "fields": {"one_attr": true, "d_to_p": 4}},
        {"pk": 3, "model": "testso.d", "fields": {"one_attr": false, "d_to_p": 5}},
        {"pk": 4, "model": "testso.d", "fields": {"one_attr": false, "d_to_p": 5}},
    
        {"pk": 1, "model": "testso.s", "fields": {"s_to_p": 1}},
        {"pk": 2, "model": "testso.s", "fields": {"s_to_p": 2}},
        {"pk": 3, "model": "testso.s", "fields": {"s_to_p": 3}},
    
        {"pk": 1, "model": "testso.drelatedtos", "fields": {"to_d": 2, "to_s": 1}},
        {"pk": 2, "model": "testso.drelatedtos", "fields": {"to_d": 1, "to_s": 2}},
        {"pk": 3, "model": "testso.drelatedtos", "fields": {"to_d": 1, "to_s": 3}},
    
        {"pk": 1, "model": "testso.p", "fields": {"the_last_attr": 5}},
        {"pk": 2, "model": "testso.p", "fields": {"the_last_attr": 5}},
        {"pk": 3, "model": "testso.p", "fields": {"the_last_attr": 3}},
        {"pk": 4, "model": "testso.p", "fields": {"the_last_attr": 4}},
        {"pk": 5, "model": "testso.p", "fields": {"the_last_attr": 10}}
    ]
    
  4. python manage.py syncdb

  5. python manage.py shell
  6. In the shell:

    >>> from testso.models import *
    >>> from django.db.models import Q
    >>> D.objects.filter(Q(one_attr=True, s_attr__s_to_p__the_last_attr=5) | Q(one_attr=False, d_to_p__the_last_attr=10))
    [<D: D:(1,True,3)>, <D: D:(2,True,4)>, <D: D:(3,False,10)>, <D: D:(4,False,10)>]
    >>> D.objects.filter(Q(one_attr=False, d_to_p__the_last_attr=10) | Q(one_attr=True, s_attr__s_to_p__the_last_attr=5))
    [<D: D:(1,True,3)>, <D: D:(2,True,4)>, <D: D:(3,False,10)>, <D: D:(4,False,10)>]
    

Same result! ...as expected.

juliocesar
  • 5,706
  • 8
  • 44
  • 63
  • 2
    Thanks for your answer! I will try this later to see if it works. About your corrections, i edited my question to fix my bugs, but some of them weren't bugs (like 'P' instead of P)... If you want edit your answer so as to make it shorter. I wanted to accept your edit but i think i don't have the reputation needed... – marianobianchi Nov 16 '13 at 22:56
  • Referencing a Model that you don't has defined before raises an exception, so that's why you need 'P' insted of P ...I will propose corrections again, I think you don't need reputation to accept this proposal ;) – juliocesar Nov 16 '13 at 23:04
  • proposal changes done, after I see your question updated I will delete from my answer – juliocesar Nov 16 '13 at 23:09
  • ok... i understand what you say, but you know that is not the problem... About your proposal, i see a notification about an edition proposed but when i click there, i can see your changes bu don't have any button to accept them.... – marianobianchi Nov 17 '13 at 08:59
  • 1
    You should see buttons to approve/reject the edit suggestion, maybe [this](http://meta.stackexchange.com/questions/184992/why-does-the-community-user-approve-and-reject-edits) and [this](http://blog.stackoverflow.com/2011/02/suggested-edits-and-edit-review/) helps you. You should think it not solve your problem, but helps peoples to run your test example, if it not runs how can any try to solve it?, I fixed it and suggests the change to make easier to find solutions not only by me. – juliocesar Nov 17 '13 at 15:04
  • ok... it says it was rejected 18 hours ago by three other users... :( – marianobianchi Nov 17 '13 at 18:18
  • It seems that reviewers doesn't like changes in the code. I will remove **Correction 2** as I see fixed in your answer. – juliocesar Nov 18 '13 at 01:59
2

I cannot answer your question directly, but there may be another way of doing what you want to do which may yield more consistent results:

subset_a = D.objects.filter(one_attr=False, d_to_p__the_last_attr=10)
subset_b = D.objects.filter(one_attr=True, s_attr__p__the_last_attr=5)
union_set = subset_a | subset_b
union_set = union_set.distinct()

The | operator on two querysets does a union, and the distinct will make sure you do not get dupe rows. I would be interested in hearing whether the arrangement here matters as well.

Krystian Cybulski
  • 10,789
  • 12
  • 67
  • 98
  • Thanks for your answer! But i don't think this is working. While subset_a and subset_b gives me 25 and 0 results, when i do `subset_a | subset_b` or `subset_b | subset_a` i obtain the same result: 0 (zero.... empty set). Beside these results, i dind't find any official comment on django project page about that operator between two querysets... are you sure it works as you said? – marianobianchi Nov 15 '13 at 05:01
  • I have used the | operator on QuerySets multiple times, and it has worked for me. I have not found official python docs on this, but other StackOverflow answers corroborate my suggestion: http://stackoverflow.com/questions/4411049/how-can-i-find-the-union-of-two-django-querysets . Can you see what the result of `str(subset_a.query)`, `str(subset_b.query)`, and `str(union_set.query)` is? – Krystian Cybulski Nov 15 '13 at 11:04
  • Documentation for | https://docs.djangoproject.com/en/dev/topics/db/queries/#complex-lookups-with-q – Martin Krung Apr 20 '17 at 10:07
0

Django seems to have a weak ORM implementation. I would suggest using "filter" or some other way to query the database and see if you get the same inconsistencies.

Or perhaps you should look for alternative ORM implementations, like peewee.

OCarneiro
  • 327
  • 3
  • 6