1

I have tried different ways of querying my data, but still results in a large amount of pings to the DB.

I have tried using select_related.

Here are my models:

class Order(models.Model):
    num = models.CharField(max_length=50, unique=True)

class OrderInfo(models.Model):
    info = models.CharField(max_length=100, unique=True)
    datetime = models.DateTimeField(auto_now_add=True, blank=True)
    order_fk = models.ForeignKey(Order)

What I am trying to achieve:

OrderInfo has a bunch of information pertaining to the Order.

What I want is to be able to get the most recent OrderInfo from the DB, but I want it unique to Order. The unique part is where I am struggling to minimize my query amount.

ois = OrderInfo.objects.order_by('-datetime').select_related('order_fk')

When I try to filter it is doing a query on each Order to check for uniqueness the queries ramp up.

For instances:
_ = [oi.order_fk for oi in ois] # queries reach to 20k, takes too long.

Also, then I just need to limit how many responses I get, but I need to know how many unique Orders there are first in order to limit it.

Anyone, know a proper approach to minimize these queries or possibly I may need to restructure my models.

Notes:

  • Django 1.7
  • Python 2.7
  • SQLite
DKJ
  • 29
  • 5
  • Could you tack on `.query` to the end of your query (ois) and provide the result? You can post the results on a site like http://www.poorsql.com to avoid character limit here and provide formatting. In general, this will post out the query as Django is creating it and we can leverage that to try and see what exactly is causing it to query so much. Also, how many orders are there in your database? How many info objects are there in your database? – aredzko Sep 20 '16 at 20:43
  • @tredzko ois when printed is 1 query. `goo.gl/czqYru`. When I do the for loop it is doing that query every time. There are about 4k `Orders` and about 35k `OrderInfos`. PS: Won't me let paste short link and original link is extremely long. – DKJ Sep 20 '16 at 20:54
  • Alright, and to be clear, you're looking for `OrderInfo` objects, the most recent, regardless of `Order`, but only once per `Order`, is that correct? I'm not sure I'll be able to help with reducing this with the way the models are set up. – aredzko Sep 20 '16 at 21:05
  • @tredzko Correct. I want in order of most recent and only allowed 1 Order. Do you have a suggestion on how to properly model this? – DKJ Sep 20 '16 at 21:11
  • I replied with an answer, please reply to it to let me know if it works or not, or if it helps you come up with an answer. – aredzko Sep 20 '16 at 21:34
  • I don't understand why `[oi.order_fk for oi in ois]` causes to many queries. Are you sure you're passing the right lookup to `select_related`? (Note that django <1.8 does not validate arguments to `select_related`.) Does the code error if you replace `select_related` with `prefetch_related`? – Lucas Wiman Sep 21 '16 at 05:19

2 Answers2

1

How many OrderInfo objects are there per order? If it's small, it is probably easiest to use prefetch_related and just do the filtering in python:

class Order(models.Model):
    num = models.CharField(max_length=50, unique=True)
    @property
    def latest_order_info(self):
        return max(self.orderinfo_set.all(), key=attrgetter('datetime')

Then in your application code, you can do:

orders = Order.objects.filter(...).prefetch_related('orderinfo_set')

It's a little wasteful, but in my experience is usually not a bottleneck unless the parent model has a very large number of children.

Lucas Wiman
  • 10,021
  • 2
  • 37
  • 41
  • 1
    `orders = Order.objects.filter(...).prefetch_related('orderinfo_set')` this right here actually did it for me. Along with limiting number of available help cut down on the number of queries it had to do. – DKJ Sep 21 '16 at 19:11
0

Based on this stack overflow answer to a similar question, I think you'd just need to split this up into two queries to hopefully reduce how much you have to peg the database.

Since you're using SQLite, you do not have access to DISTINCT ON, so perhaps try the following:

from django.db.models import Q, Max
import operator

ois = OrderInfo.objects.values('order_id').annotate(max_datetime=Max('datetime'))
filters = reduce(operator.or_, [(Q(order_id=oi['order_id']) &
    Q(datetime=oi['max_datetime'])) for oi in ois])
filtered_ois = OrderInfo.objects.filter(filters)

To explain what this is doing (again, this is based off of the linked stack overflow answer):

  • Get the OrderInfo objects grouped by their order_id and annotate those with the maximum datetime among all of the OrderInfo objects that have that order_id.
  • Use these values to create a new filter on the OrderInfo objects.
  • Use the filter on OrderInfo objects and return them.
Community
  • 1
  • 1
aredzko
  • 1,690
  • 14
  • 14
  • I changed `order_id` to `order_fk_id` to make it work. However, when I try to print out `filtered_ois` I get the following `OperationalError: Expression tree is too large (maximum depth 1000) `. I am currently trying to play with it to get this working. – DKJ Sep 20 '16 at 21:50
  • @DKJ I wish you luck. Let me know what you find! – aredzko Sep 20 '16 at 22:00
  • The sql it is trying to execute. It is too large for it to be processed. Not sure how to shorten it. I have already tried cutting down on ois and the returned value of `OrderInfo.objects.filter(filters)`. – DKJ Sep 20 '16 at 22:58