0

Models.py:

class ScoringModel(models.Model):
    title = models.CharField(max_length=64)


class PredictedScore(models.Model):
    job = models.ForeignKey('Job')
    candidate = models.ForeignKey('Candidate')
    model_used = models.ForeignKey('ScoringModel')
    score = models.FloatField()
    created_at = models.DateField(auto_now_add=True)
    modified_at = models.DateTimeField(auto_now=True)

serializers.py:

class MatchingJobsSerializer(serializers.ModelSerializer):
    job_title = serializers.CharField(source='job.title', read_only=True)
    class Meta:
        model = PredictedScore
        fields = ('job', 'job_title', 'score', 'model_used', 'candidate')

To fetch the top 3 jobs, I tried the following code:

queryset = PredictedScore.objects.filter(candidate=candidate)
jobs_serializer = MatchingJobsSerializer(queryset, many=True)
jobs = jobs_serializer.data
top_3_jobs = heapq.nlargest(3, jobs, key=lambda item: item['score'])

Its giving me top 3 jobs for the whole set which contains all the models. I want to fetch the jobs with top 3 scores for a given candidate for each model used. So, it should return the top 3 matching jobs with each ML model for the given candidate.

I followed this answer https://stackoverflow.com/a/2076665/2256258 . Its giving the latest entry of cake for each bakery, but I need the top 3. I read about annotations in django ORM but couldn't get much about this issue. I want to use DRF serializers for this operations. This is a read only operation.

I am using Postgres as database.

What should be the Django ORM query to perform this operation?

amulya349
  • 1,210
  • 2
  • 17
  • 26
  • You're getting downvoted because you haven't shown us what you've tried. Per the guidance on [how to write a good question](https://stackoverflow.com/help/how-to-ask), what you've given isn't a problem, it's just asking others to do the work for you. If you present a problem that you can't make work, that meets the MVCE requirements mentioned there, then we'll be able to help. – Withnail Nov 02 '17 at 11:54
  • @Withnail Thanks for letting me know why I got downvote. Now I have added what I have tried so far. – amulya349 Nov 02 '17 at 13:48

1 Answers1

1

Make the database do the work. You don't need annotations either as you want the objects, not the values or manipulated values.

To get a set of all scores for a candidate (not split by model_used) you would do:

queryset = candidate.property_set.filter(candidate=candidate).order_by('-score)[:2]
jobs_serializer = MatchingJobsSerializer(queryset, many=True)
jobs = jobs_serializer.data

What you're proposing isn't particularly well suited in the Django ORM, annoyingly - I think you may need to make separate queries for each model_used. A nicer solution (untested for this example) is to hook Q queries together, as per this answer.

Example is there is tags, but I think holds -

#lets get a distinct list of the models_used - 
all_models_used = PredictedScore.objects.values('models_used').distinct()
q_objects = Q() # Create an empty Q object to start with

for m in all_models_used:
    q_objects |= Q(model_used=m)[:3] # 'or' the Q objects together

queryset = PredictedScore.objects.filter(q_objects) 
Withnail
  • 3,128
  • 2
  • 30
  • 47