The Problem
I'm trying to use the Django ORM to do the equivalent of a SQL NOT IN clause, providing a list of IDs in a subselect to bring back a set of records from the logging table. I can't figure out if this is possible.
The Model
from django.db import models


class JobLog(models.Model):
    job_number = models.BigIntegerField(blank=True, null=True)
    name = models.TextField(blank=True, null=True)
    username = models.TextField(blank=True, null=True)
    event = models.TextField(blank=True, null=True)
    time = models.DateTimeField(blank=True, null=True)
What I've Tried
My first attempt was to use exclude, but this emits a NOT that negates the entire Subquery condition, rather than the desired NOT IN:
query = (
    JobLog.objects.values(
        "username", "job_number", "name", "time",
    )
    .filter(time__gte=start, time__lte=end, event="delivered")
    .exclude(
        job_number__in=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )
    )
)
Unfortunately, this yields the following SQL:
SELECT "view_job_log"."username", "view_job_log"."group", "view_job_log"."job_number", "view_job_log"."name", "view_job_log"."time"
FROM "view_job_log"
WHERE (
    "view_job_log"."event" = 'delivered'
    AND "view_job_log"."time" >= '2020-03-12T11:22:28.300590+00:00'::timestamptz
    AND "view_job_log"."time" <= '2020-03-13T11:22:28.300600+00:00'::timestamptz
    AND NOT (
        "view_job_log"."job_number" IN (
            SELECT U0."job_number"
            FROM "view_job_log" U0
            WHERE (
                U0."event" = 'finished'
                AND U0."time" >= '2020-03-12T11:22:28.300590+00:00'::timestamptz
                AND U0."time" <= '2020-03-13T11:22:28.300600+00:00'::timestamptz
            )
        )
        AND "view_job_log"."job_number" IS NOT NULL
    )
)
What I need is for the third AND clause to be AND "view_job_log"."job_number" NOT IN instead of the AND NOT (.
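To make the difference between the two forms concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database. The job_log table and its rows are hypothetical, not my actual data. As far as I can tell, the only rows the two predicates disagree on are those where job_number is NULL: a plain NOT IN drops them, while the NOT (... IN ... AND ... IS NOT NULL) form keeps them.

```python
import sqlite3

# Hypothetical in-memory table standing in for the real logging table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_log (job_number INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO job_log VALUES (?, ?)",
    [
        (1, "delivered"),
        (1, "finished"),     # job 1 finished, so it should be filtered out
        (2, "delivered"),    # job 2 never finished
        (None, "delivered"), # row with a NULL job_number
    ],
)

FINISHED = "SELECT job_number FROM job_log WHERE event = 'finished'"

# The form I am asking for: a plain NOT IN.
not_in_rows = conn.execute(
    "SELECT job_number FROM job_log "
    f"WHERE event = 'delivered' AND job_number NOT IN ({FINISHED})"
).fetchall()

# The shape of the predicate the ORM generates for exclude() / ~Q().
negated_rows = conn.execute(
    "SELECT job_number FROM job_log "
    "WHERE event = 'delivered' "
    f"AND NOT (job_number IN ({FINISHED}) AND job_number IS NOT NULL)"
).fetchall()

print("NOT IN:      ", not_in_rows)
print("NOT (IN ...):", negated_rows)
```

In this sketch the NULL row is the only one that differs between the two results, so the distinction matters for my data only if NULL job numbers are present.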
I've also tried running the sub-select as its own query first and passing the result to exclude, as suggested here:
Django equivalent of SQL not in
However, this yields the same problematic result. Then I tried a Q object, which yields a similar query:
query = (
    JobLog.objects.values(
        "username", "subscriber_code", "job_number", "name", "time",
    )
    .filter(
        ~models.Q(job_number__in=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )),
        time__gte=start,
        time__lte=end,
        event="delivered",
    )
)
This attempt with the Q object yields the following SQL, again without the NOT IN:
SELECT "view_job_log"."username", "view_job_log"."group", "view_job_log"."job_number", "view_job_log"."name", "view_job_log"."time"
FROM "view_job_log"
WHERE (
    NOT (
        "view_job_log"."job_number" IN (
            SELECT U0."job_number"
            FROM "view_job_log" U0
            WHERE (
                U0."event" = 'finished'
                AND U0."time" >= '2020-03-12T11:33:28.098653+00:00'::timestamptz
                AND U0."time" <= '2020-03-13T11:33:28.098678+00:00'::timestamptz
            )
        )
        AND "view_job_log"."job_number" IS NOT NULL
    )
    AND "view_job_log"."event" = 'delivered'
    AND "view_job_log"."time" >= '2020-03-12T11:33:28.098653+00:00'::timestamptz
    AND "view_job_log"."time" <= '2020-03-13T11:33:28.098678+00:00'::timestamptz
)
Is there any way to get Django's ORM to do something equivalent to AND job_number NOT IN (12345, 12346, 12347)? Or am I going to have to drop to raw SQL to accomplish this?
Thanks in advance for reading this entire wall-of-text question. Explicit is better than implicit. :)