I am not sure if what I want to do can be done.
I want to filter or exclude elements from one QuerySet object, using the annotated values of the QuerySet of another model, related to the first via FK.
models
Below an overview of the models I'm talking about:
the two models sit in my main.models.py
- a
Member
class and its Manager
from django.db import models
from django.contrib.auth.models import UserManager
class MemberManager(UserManager):
def get_queryset(self):
return MemberQuerySet(self.model, using=self._db)
def annotate_something(self):
return self.get_queryset().annotate_something_important()
class Member(User):
objects = MemberManager()
field_1 = models.BooleanField(default = False)
field_2 = models.CharField(max_length = 1, blank = True, null = True)
...
- an
Experience
class and its Manager
class ExperienceManager(models.Manager):
def get_queryset(self):
return ExperienceQuerySet(self.model, using=self._db)
def annotate_duracion(self):
return self.get_queryset().annotate_duracion()
class Experience(models.Model):
objects = ExperienceManager()
member = models.ForeignKey(Member, related_name='experiences', on_delete = models.CASCADE)
start_date = models.DateField(blank = True, null = True)
end_date = models.DateField(blank = True, null = True)
note the FK in Experience
to Member
both MemberQuerySet
and ExperienceQuerySet
classes are defined in a separate main.queryset.py
file
class MemberQuerySet(models.QuerySet):
def annotate_something_important(self):
return self.something_important
...
class ExperienceQuerySet(models.QuerySet):
def annotate_duration(self):
current_date = datetime.now().date()
return self.annotate(
end_date_obj = Case(
When(end_date = None, then = Value(current_date)),
default = F('end_date'),
output_field = DateField(),
)
).annotate(
duration = ExpressionWrapper(
F('end_date_obj') - F('start_date'),
output_field = DurationField()
)
).order_by("-end_date_obj")
the query
the starting point is a MemberQueryset
object, obtained from the following query, which can comprise some ~400k elements.
members = Member.objects.filter(
field_1 = True
).annotate_something()
The main objective of the function is to progressively filter this QuerySet, based on criteria read from views.py
the hurdle
Now comes the meaty part:
I want to exclude all those members whose last 3 experiences have duration shorter than 1 year
To achieve this, I envisioned an annotate
on members (a MemberQueryset
instance) using the related_name='experiences'
. however, I should do something like calling the annotate_duration
method in the lookup chain, something which doesn't work.
tentative solution
I resorted to the following
bad_ids = [
m.id for m in members if not all(
m.experiences.all().annotate_duration()[:3].annotate(
long_experience=Case(
When(
duration__gte=year,
then=Value(True, output_field=BooleanField())
),
default=Value(False, output_field=BooleanField())
)
).values_list('long_experience', flat=True)
)
]
members = members.exclude(id__in = bad_ids)
The problem with this approach is the for loop over members, which dramatically increases the computation time of the script, often leading to the task being killed for excessive time consumption (I believe, based on this error "Worker exited prematurely: signal 9 (SIGKILL)."
).
Does anyone know how to achieve this, avoiding the cumbersome for
loop?
resources
below a list of resources I've read
How to annotate queryset with another queryset
django model method to return queryset with annotate
Django - Query : Annotate Queryset with Related Model fields
I feel the last one might contain the elements to achieve my target, but I don't know how to use them.