0

If I have models like this:

class Person(models.Model):
    first_name = models.CharField(max_length=20)
    last_name = models.CharField(max_length=20)

class PersonSession(models.Model):
    start_time = models.DateTimeField(auto_now_add=True)
    end_time = models.DateTimeField(null=True,
                                    blank=True)
    person = models.ForeignKey(Person, related_name='sessions')

class Billing(models.Model):
    DEBT = 'DE'
    BALANCED = 'BA'
    CREDIT = 'CR'

    session = models.OneToOneField(PersonSession,
                                   blank=False,
                                   null=False,
                                   related_name='billing')
    STATUS = ((BALANCED, 'Balanced'),
              (DEBT, 'Debt'),
              (CREDIT, 'Credit'))

    status = models.CharField(max_length=2,
                              choices=STATUS,
                              blank=False,
                              default=BALANCED
                              )

The two below query generates different results:

Person.objects.filter(Q(sessions__start_time__gte='2000-02-01') & \
                      Q(sessions__start_time__lte='2000-03-01') & \
                      Q(sessions__billing__status=Billing.DEBT))
OR
Person.objects.filter(Q(sessions__start_time__gte='2000-02-01') & \
                      Q(sessions__start_time__lte='2000-03-01')).filter(
                      Q(sessions__billing__status=Billing.DEBT))

The first one generates two person 1,2 and the second one generates all three persons that I have, the data is as below:

id | first_name | last_name | id |        start_time         |         end_time          | person_id | id | status | session_id 
---+------------+-----------+----+---------------------------+---------------------------+-----------+----+--------+------------
0 | person     | 0         |  0 | 2000-01-01 16:32:00+03:30 | 2000-01-01 17:32:00+03:30 |         0 |  0 | DE     |          0
0 | person     | 0         |  1 | 2000-02-01 16:32:00+03:30 | 2000-02-01 17:32:00+03:30 |         0 |  1 | BA     |          1
0 | person     | 0         |  2 | 2000-03-01 16:32:00+03:30 | 2000-03-01 17:32:00+03:30 |         0 |  2 | DE     |          2
1 | person     | 1         |  3 | 2000-01-01 16:32:00+03:30 | 2000-01-01 17:32:00+03:30 |         1 |  3 | BA     |          3
1 | person     | 1         |  4 | 2000-02-01 16:32:00+03:30 | 2000-02-01 17:32:00+03:30 |         1 |  4 | DE     |          4
1 | person     | 1         |  5 | 2000-03-01 16:32:00+03:30 | 2000-03-01 17:32:00+03:30 |         1 |  5 | DE     |          5
2 | person     | 2         |  6 | 2000-01-01 16:32:00+03:30 | 2000-01-01 17:32:00+03:30 |         2 |  6 | DE     |          6
2 | person     | 2         |  7 | 2000-02-01 16:32:00+03:30 | 2000-02-01 17:32:00+03:30 |         2 |  7 | DE     |          7
2 | person     | 2         |  8 | 2000-03-01 16:32:00+03:30 | 2000-03-01 17:32:00+03:30 |         2 |  8 | BA     |          8

I wrote an example that creates class and data for them to show the difference.

Please explain what is difference between these two queries and as a side question if in django-filter package I want to define some fields to work as a first filter how could I do this?

motam
  • 677
  • 1
  • 6
  • 24
  • Based on the example data, is it possible to see the result of executing the first query and the result of executing the second query? I know you have a github project to do that, but it would be helpful if you could post the results of the two queries here. – Will Keeling May 09 '18 at 12:29
  • The first query result: ` `and the second one: `` – motam May 09 '18 at 13:02

1 Answers1

1

The queries generated by the filters are different as follows:

>>> print Person.objects.filter(Q(sessions__start_time__gte='2000-02-01') & \
...                       Q(sessions__start_time__lte='2000-03-01')).filter(
...                       Q(sessions__billing__status=Billing.DEBT)).query
SELECT "test_filter_person"."id", "test_filter_person"."first_name", "test_filter_person"."last_name" FROM "test_filter_person" INNER JOIN "test_filter_personsession" ON ("test_filter_person"."id" = "test_filter_personsession"."person_id") INNER JOIN "test_filter_personsession" T3 ON ("test_filter_person"."id" = T3."person_id") INNER JOIN "test_filter_billing" ON (T3."id" = "test_filter_billing"."session_id") WHERE ("test_filter_personsession"."start_time" >= 2000-02-01 00:00:00+00:00 AND "test_filter_personsession"."start_time" <= 2000-03-01 00:00:00+00:00 AND "test_filter_billing"."status" = DE)
>>> print Person.objects.filter(Q(sessions__start_time__gte='2000-02-01') & \
...                       Q(sessions__start_time__lte='2000-03-01') & \
...                       Q(sessions__billing__status=Billing.DEBT)).query
SELECT "test_filter_person"."id", "test_filter_person"."first_name", "test_filter_person"."last_name" FROM "test_filter_person" INNER JOIN "test_filter_personsession" ON ("test_filter_person"."id" = "test_filter_personsession"."person_id") INNER JOIN "test_filter_billing" ON ("test_filter_personsession"."id" = "test_filter_billing"."session_id") WHERE ("test_filter_personsession"."start_time" >= 2000-02-01 00:00:00+00:00 AND "test_filter_personsession"."start_time" <= 2000-03-01 00:00:00+00:00 AND "test_filter_billing"."status" = DE)

Let's call them Q1 and Q2 respectively.

As you can see, Q1 does the join between Person, PersonSession, and Billing once, and all the conditions are applied together in the WHERE clause.

However, in Q2, since you terminate the first filter with the first two checks, this results in a first query join between Person and PersonSession first (for Person.objects.filter(Q(sessions__start_time__gte='2000-02-01') & Q(sessions__start_time__lte='2000-03-01')), the result this first join is again joined with PersonSession and Billing in order to execute the second part of the filter query .filter(Q(sessions__billing__status=Billing.DEBT)).

You can read more here: difference between filter with multiple arguments and chain filter in django

Also want to draw your attention to the last line of the post:

One table: But if the query doesn't involve joined tables like the example from Yuji and DTing. The result is same.

rtindru
  • 5,107
  • 9
  • 41
  • 59
  • May be this seems an XY problem but I had [another question](https://stackoverflow.com/q/50001616/2335245) which basically was about django-filter and for finding solution for that problem I notice the difference between these two queries. As the end of this question stated, do you think it is possible to do the first query in django-filter because the first one usually is the one anyone expected. – motam May 09 '18 at 14:12
  • I'll try to answer that question on the other thread. If the response here is satisfactory; please close this thread. – rtindru May 10 '18 at 12:19