3

We're encountering a very strange problem regarding the negation of Q objects in Django. Let's just use Football as the example:

class Team(models.Model):
    id = UUIDField(primary_key=True)

class Player(models.Model):
    id = UUIDField(primary_key=True)
    name = models.CharField(max_length=128)
    team = models.ForeignKey(Team)
    touchdowns = models.IntegerField()

There are 10 teams.

There are 100 players, with 10 on each team. Each team has a player named "Joe". There is one "Joe" on one team who has scored 5 touchdowns. All other Joe's have scored 1 touchdown. There are 8 teams where every Player has scored only 1 touchdown.

I want to get the Teams that have a player named Joe that has scored at least 3 Touchdowns.

models.Team.objects.filter(Q(player__name="Joe", player__touchdowns__gte=3)).count()

That returns One, as it should.The negation of that should return 9 (The other 9 teams that don't have a player named Joe that has at least 3 Touchdowns):

models.Team.objects.filter(~Q(player__name="Joe", player__touchdowns__gte=3)).count()

instead returns any team where everyone on that team has less than 3 Touchdowns (8).

Where am I going wrong? Please note that our actual application of this is much more complicated, so we NEED to use Q objects with negation, we cannot use Exclude.

Tomasz Jakub Rup
  • 10,502
  • 7
  • 48
  • 49
MatthewKremer
  • 1,549
  • 1
  • 14
  • 25
  • I'd certainly consider this a bug, since filter(Q()) is expected to return the inverse of both filter(~Q()) and exclude(Q()). There are several related bugs reported as fixed at https://code.djangoproject.com/ticket/16893 https://code.djangoproject.com/ticket/19672 https://code.djangoproject.com/ticket/24705 but they're all crash-related, not "the wrong filter" as in your case. FWIW, I'm getting a similar issue on Django 1.8.7 – Sarah Messer Dec 25 '15 at 20:37

3 Answers3

3

The best way to ferret out why these differences occur is to investigate the queries that are generated: django-debug-toolbar comes with a debugsqlshell command that prints the actual query sent to the database after any use of the Django queryset API. For these tests, I used the User model with a join on Group. I too noticed different counts for selected objects, so on the face it seems like a good correlation to your use-case.

User.objects.filter(~Q(username='jdoe', groups__name='Awesome Group'))

SELECT "auth_user"."id",
       "auth_user"."username",
       "auth_user"."first_name",
       "auth_user"."last_name",
       "auth_user"."email",
       "auth_user"."password",
       "auth_user"."is_staff",
       "auth_user"."is_active",
       "auth_user"."is_superuser",
       "auth_user"."last_login",
       "auth_user"."date_joined"
FROM "auth_user"
WHERE NOT ("auth_user"."username" = 'jdoe'
           AND "auth_user"."id" IN
             (SELECT U1."user_id"
              FROM "auth_user_groups" U1
              INNER JOIN "auth_group" U2 ON (U1."group_id" = U2."id")
              WHERE (U2."name" = 'Awesome Group'
                     AND U1."user_id" IS NOT NULL))) LIMIT 21

User.objects.exclude(Q(username='jdoe', groups__name='Awesome Group'))

SELECT "auth_user"."id",
       "auth_user"."username",
       "auth_user"."first_name",
       "auth_user"."last_name",
       "auth_user"."email",
       "auth_user"."password",
       "auth_user"."is_staff",
       "auth_user"."is_active",
       "auth_user"."is_superuser",
       "auth_user"."last_login",
       "auth_user"."date_joined"
FROM "auth_user"
INNER JOIN "auth_user_groups" ON ("auth_user"."id" = "auth_user_groups"."user_id")
INNER JOIN "auth_group" ON ("auth_user_groups"."group_id" = "auth_group"."id")
WHERE NOT (("auth_user"."username" = 'jdoe'
            AND "auth_group"."name" = 'Awesome Group')) LIMIT 21

The difference here comes in where the INNER JOIN happens. The Q object causes the INNER JOIN in the first example and then the selection with the INNER JOIN is negated because of the ~. The case of exclude, the negation happens in parallel to the INNER JOIN.

Chris Pratt
  • 232,153
  • 36
  • 385
  • 444
  • Upon examining the queries, here is real problem: When using .filter(~Q(child__stuff, child__otherstuff)) we receive SQL along the lines of "preliminary requirements AND NOT ( team.id IN (SELECT U1.player_id FROM data_player U1 WHERE (U1.touchdowns>=3 )) AND team.id IN (SELECT U1.player_id FROM data_player U1 WHERE (U1.name='JOE' ))" As you can see, the issue is the fact that it does two inner queries, one for each requirement. However, if I manualy move the JOE part next to the >=3 part and run in console, it works. Long story short, how do I make ~Q() or something else act like that? – MatthewKremer Aug 03 '12 at 19:52
  • Rewrite `Q`, submit a pull request to Django, and hope it gets merged, basically. This is how `Q` functions, so there's no way to side-step it. You can try to tweak your logic so that it results in what you think it should, but the query that `Q` generates is the query that `Q` generates. – Chris Pratt Aug 03 '12 at 19:57
  • 1
    Alright, now in your honest opinion, is the way Q currently functions the "correct way" or is the way I'm thinking the "correct" way in terms of how the SQL should be built? – MatthewKremer Aug 03 '12 at 19:58
  • That's the thing. I don't think there's necessarily a "correct" and "incorrect" way. `Q` builds a query in a specific way for reasons I can't fully attest to. It works just fine in 99.99% of scenarios. In this scenario, it seems to fail. However, that doesn't mean it's wrong, it just means it doesn't cover every possible usage scenario, which if we're being honestly is a pretty impossible task any way you cut it. – Chris Pratt Aug 03 '12 at 20:21
  • True, I can agree to that. We found a workaround, but I'd still like to look at personally modifying Django. If you would like to write an answer explaining that that is the way Q functions (separate inner queries per option), I'll accept it as the answer. Thanks for your input! – MatthewKremer Aug 03 '12 at 20:24
1

I had a similar case and found that, using boolean logic to put the negations on the leaves of the tree solved the issue. I made a Django snippet here : https://djangosnippets.org/snippets/10866/ Here is a copy of my code :

def put_Q_negations_to_leaves(
    query_filter: Q,
    negate: bool = False,
    first_call: bool = True,
    debug: bool = False,
):
    negate_below = (negate != query_filter.negated)  # XOR
    if debug:
        logger.info(
            f"put_Q_negations_to_leaves() query_filter:{query_filter}"
            f" negate:{negate} negate_below:{negate_below}"
        )
    true_kwargs = {
        "_connector": query_filter.connector,
        "_negated": False,
    }
    new_children = []
    for child in query_filter.children:
        if debug:
            logger.info(child.__repr__())
        if not isinstance(child, Q):
            if negate_below:
                new_child = ~Q(child)
            else:
                new_child = child
        else:
            new_child = put_Q_negations_to_leaves(child, negate=negate_below, first_call=False)
        if debug:
            logger.info(new_child.__repr__())
        new_children.append(new_child)
    if len(new_children) == 1:
        # One child
        if isinstance(new_children[0], Q) or first_call == False:
            # Double negation canceled out if possible
            return new_children[0]
        else:
            true_kwargs["_negated"] = negate_below
    if negate_below:
        if true_kwargs["_connector"] == Q.AND:
            true_kwargs["_connector"] = Q.OR
        else:
            true_kwargs["_connector"] = Q.AND
    return Q(*new_children, **true_kwargs)

To make this snippet works in all cases, it is necessary to change the following lines :

if negate_below:
    new_child = ~Q(child)

You must handle all negation of field lookups : https://docs.djangoproject.com/en/4.0/ref/models/querysets/#field-lookups-1 with string manipulation on the first element of the tuple.

For that, you can look at this answer on StackOverflow : How do I do a not equal in Django queryset filtering? https://stackoverflow.com/a/29227603/5796086

However, for most uses, it will be simpler to use a SubQuery (or Exists).

Use example :

from django.db.models import Q, F

# For simplicity, and avoiding mixing args and kwargs, we only use args since :
# ("some_fk__some_other_fk__some_field", 111) arg
# is equivalent to
# some_fk__some_other_fk__some_field=111 kwarg

unmodified_filter = ~Q(
  ("some_fk__some_other_fk__some_field", 111),
  Q(("some_fk__some_other_fk__some_other_field__lt", 11))
  | ~Q(("some_fk__some_other_fk__some_yet_another_field", F("some_fk__some_yet_another_field")))
)

modified_filter = put_Q_negations_to_leaves(unmodified_filter)
print(unmodified_filter)
print(modified_filter)

This will output something that you can beautify like this:

Before:
(NOT
   (AND:
      ('some_fk__some_other_fk__some_field', 111),
      (OR:
         ('some_fk__some_other_fk__some_other_field__lt', 11),
         (NOT
            (AND: ('some_fk__some_other_fk__some_yet_another_field', F(some_fk__some_yet_another_field)))
         )
      )
   )
)

After:
(OR:
   (NOT
      (AND: ('some_fk__some_other_fk__some_field', 111))
   ),
   (AND:
      (NOT
         (AND: ('some_fk__some_other_fk__some_other_field__lt', 11))
      ), <-- This is where negation of lookups like "lt" -> "gte" should be handled
      ('some_fk__some_other_fk__some_yet_another_field', F(some_fk__some_yet_another_field))  <-- Double negation canceled out
   )
)
Laurent Lyaudet
  • 630
  • 8
  • 16
0

Сheck queries in database with Q and ~Q:

>>> print models.Team.objects.filter(Q(player__name="Joe", player__touchdowns__gte=3)).query
>>> print models.Team.objects.filter(~Q(player__name="Joe", player__touchdowns__gte=3)).query

And test it with two Q objects:

>>> print models.Team.objects.filter(Q(player__name="Joe") & Q(player__touchdowns__gte=3)).query
>>> print models.Team.objects.filter(~Q(player__name="Joe") & ~Q(player__touchdowns__gte=3)).query
Denis SkS
  • 816
  • 2
  • 7
  • 15
  • We're actually build a dynamic query builder, so in my app I have to take each Q(inputtedquery) and &= them together to allow dynamic querying. We've tried many combinations of positioning the ~ and writing out manual queries to test including the above to no avail. Only successful attempt was manually moving the SQL from .query. See my reply to Chris' post. – MatthewKremer Aug 03 '12 at 19:55