I am using Django to display a series of test results. I created a view to show the most recent instance of each test result.

1: results = resultTable.objects.order_by('id')
2: latestResults = {}
3: for res in results: latestResults[res.testcase.name] = res.id
4: latestResultQuerySet = resultTable.objects.filter(id__in=latestResults.values())
5: del latestResults
6: return HttpResponse()

Line 1 creates a QuerySet of about 20,000 entries in about 1 second.

Line 3 calculates the id of the latest results of each testcase in about 12 seconds.

Line 4 creates a new QuerySet of about 800 entries using those ids in about 1 second.

Line 5 deletes the dictionary in essentially no time.

From this I would think the (blank) page would load in about 15 seconds. In practice it takes about 3 minutes to load.

Removing line 3 allows the page to load instantly.

Any ideas what is causing the extra 2+ minutes of delay? Since I am not passing anything to the HttpResponse, I assumed it had something to do with the garbage collection of the dictionary, thus line 5 was born.

Ken D
  • Check the logic. You are just doing the exact same thing - Get all the ids ordered by id in the 2 queries. – karthikr Jan 30 '14 at 17:30
  • karthikr - The queries aren't the same. The second query is just getting the distinct subset of the first. – Ken D Jan 30 '14 at 18:35

2 Answers

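My guess is that you are hitting an N+1 query problem: each access to res.testcase.name in the loop fetches the related testcase with its own query.

If so, you are running about 20,000 extra queries as you loop through the result set.

You can use select_related to avoid this: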

results = resultTable.objects.select_related('testcase').order_by('id')
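
To make the difference concrete, here is a minimal sketch that mimics both access patterns with the standard-library sqlite3 module instead of Django. The `testcase` and `result` tables and the function names are hypothetical stand-ins for the models in the question, and the query counter is kept by hand, so treat it as an illustration of the pattern rather than a measurement of the real application.

```python
import sqlite3


def build_db(n_cases=3, n_results=9):
    """Create a tiny in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testcase (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, testcase_id INTEGER)")
    conn.executemany(
        "INSERT INTO testcase VALUES (?, ?)",
        [(i, f"case{i}") for i in range(1, n_cases + 1)],
    )
    # Results are assigned to test cases round-robin.
    conn.executemany(
        "INSERT INTO result VALUES (?, ?)",
        [(i, (i - 1) % n_cases + 1) for i in range(1, n_results + 1)],
    )
    return conn


def latest_n_plus_one(conn):
    """One query for the results, then one more per row for the name."""
    queries = 1
    latest = {}
    rows = conn.execute("SELECT id, testcase_id FROM result ORDER BY id").fetchall()
    for res_id, case_id in rows:
        name = conn.execute(
            "SELECT name FROM testcase WHERE id = ?", (case_id,)
        ).fetchone()[0]
        queries += 1
        latest[name] = res_id  # later (higher) ids overwrite earlier ones
    return latest, queries


def latest_joined(conn):
    """A single JOIN, roughly what select_related asks the database to do."""
    rows = conn.execute(
        "SELECT r.id, t.name FROM result r "
        "JOIN testcase t ON t.id = r.testcase_id ORDER BY r.id"
    )
    latest = {name: res_id for res_id, name in rows}
    return latest, 1


conn = build_db()
slow, slow_queries = latest_n_plus_one(conn)
fast, fast_queries = latest_joined(conn)
print(slow_queries, fast_queries)
```

Both functions build the same dictionary; the first issues one query per result row plus one, while the second issues a single query regardless of table size.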
WayneC
  • Good but not great. This reduced the action of line 3 to 1.8 seconds but the page load is still over a minute. – Ken D Jan 30 '14 at 18:32

If I am reading your code correctly, you are trying to find the latest id for each distinct value of testcase.name. One way to improve performance is to avoid building a huge dictionary in memory from a queryset and let the database do the work instead. So here is an attempt to turn it into a single query:

latestResultQuerySet = resultTable.objects.order_by('testcase__name', '-id').distinct('testcase__name')
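
Note that passing field names to distinct() is supported by Django only on PostgreSQL. On other backends, the same "latest id per test case name" result can be obtained with a GROUP BY and MAX(id), which in Django would look roughly like values('testcase__name').annotate(latest_id=Max('id')). The sketch below shows the underlying SQL with the standard-library sqlite3 module; the `testcase` and `result` tables, the sample rows, and the function name are hypothetical and only stand in for the models in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE testcase (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE result (
        id INTEGER PRIMARY KEY,
        testcase_id INTEGER REFERENCES testcase(id)
    );
    INSERT INTO testcase VALUES (1, 'login'), (2, 'logout');
    INSERT INTO result VALUES (1, 1), (2, 2), (3, 1), (4, 2), (5, 1);
    """
)


def latest_result_ids(conn):
    """Return {testcase name: highest result id}, computed by the database."""
    rows = conn.execute(
        """
        SELECT t.name, MAX(r.id)
        FROM result AS r
        JOIN testcase AS t ON t.id = r.testcase_id
        GROUP BY t.name
        ORDER BY t.name
        """
    )
    return dict(rows.fetchall())


print(latest_result_ids(conn))
```

The database returns one row per test case name, so the Python side never has to iterate over the full result table.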
arocks