10

i've got some weird query, so i have to execute raw SQL. The thing is that this query is getting bigger and bigger and with lots of optional filters (ordering, column criteria, etc.).

So, given the this query:

SELECT DISTINCT Camera.* FROM Camera c
     INNER JOIN cameras_features fc1 ON c.id = fc1.camera_id AND fc1.feature_id = 1
     INNER JOIN cameras_features fc2 ON c.id = fc2.camera_id AND fc2.feature_id = 2

This is roughly the Python code:

def get_cameras(features):
  query = "SELECT DISTINCT Camera.* FROM Camera c"
  i = 1
  for f in features:
    alias_name = "fc%s" % i
    query += "INNER JOIN cameras_features %s ON c.id = %s.camera_id AND %s.feature_id = " % (alias_name,alias_name,alias_name)
    query += " %s "
    i += 1
  return Camera.objects.raw(query, tuple(features))

This is working great, but i need to add more filters and ordering, for example suppose i need to filter by color and order by price, it starts to grow:

#extra_filters is a list of tuples like:
# [('price', '=', '12'), ('color' = 'blue'), ('brand', 'like', 'lum%']
def get_cameras_big(features,extra_filters=None,order=None):
  query = "SELECT DISTINCT Camera.* FROM Camera c"
  i = 1
  for f in features:
    alias_name = "fc%s" % i
    query += "INNER JOIN cameras_features %s ON c.id = %s.camera_id AND %s.feature_id = " % (alias_name,alias_name,alias_name)
    query += " %s "
    i += 1
  if extra_filters:
    query += " WHERE "
    for ef in extra_filters:
      query += "%s %s %s" % ef #not very safe, refactoring needed
  if order:
    query += "order by %s" % order

  return Camera.objects.raw(query, tuple(features))

So, i don't like how it started to grow, i know Model.objects.raw() returns a RawQuerySet, so i'd like to do something like this:

queryset = get_cameras( ... )
queryset.filter(...)
queryset.order_by(...)

But this doesn't work. Of course i could just perform the raw query and after that get the an actual QuerySet with the data, but i will perform two querys. Like:

raw_query_set = get_cameras( ... )
camera.objects.filter(id__in(raw_query_set.ids)) #don't know if it works, but you get the idea

I'm thinking that something with the QuerySet init or the cache may do the trick, but haven't been able to do it.

Community
  • 1
  • 1
santiagobasulto
  • 11,320
  • 11
  • 64
  • 88
  • What is "weird" about that query that you need to use raw sql? You can gradually construct a QuerySet as well. – Kekoa Feb 03 '12 at 21:34
  • I was able to [do this](https://stackoverflow.com/q/48288076/96588) easily using `ModelName.objects.filter(id__in=RawSQL("""100 line query"""))`. – l0b0 Jan 07 '19 at 00:05

4 Answers4

22

.raw() is an end-point. Django can't do anything with the queryset because that would require being able to somehow parse your SQL back into the DBAPI it uses to create SQL in the first place. If you use .raw() it is entirely on you to construct the exact SQL you need.

If you can somehow reduce your query into something that could be handled by .extra() instead. You could construct whatever query you like with Django's API and then tack on the additional SQL with .extra(), but that's going to be your only way around.

Chris Pratt
  • 232,153
  • 36
  • 385
  • 444
  • I think extra() may do the trick. Didn't know about it, thank you! – santiagobasulto Feb 03 '12 at 21:34
  • What about security with extra()? If you take a look at the query i need to pass a parameter to the JOIN. An int() comprobation just works? – santiagobasulto Feb 03 '12 at 21:40
  • 2
    Read the section in documentation of `extra` about `params`. If you use `params` Django takes care of injection just like it would with a normal queryset, but with you already using `raw` you should've been taking care of that already for any passed in parameters. – Chris Pratt Feb 03 '12 at 21:50
  • Thank you @Chris The extra thing didn't work, i'll keep doing thinks like I was. Thanks for your help! – santiagobasulto Feb 03 '12 at 21:54
2

There's another option: turn the RawQuerySet into a list, then you can do your sorting like this...

results_list.sort(key=lambda item:item.some_numeric_field, reverse=True)

and your filtering like this...

filtered_results = [i for i in results_list if i.some_field == 'something'])

...all programatically. I've been doing this a ton to minimize db requests. Works great!

Dan Ancona
  • 162
  • 7
  • 1
    I have the feeling you shouldn't iterate over the whole database. Wonder a case where n > 2^32, it would take forever. – Patrick Bassut Mar 18 '14 at 00:19
  • Depends on the size of the results, of course. In practice I've found that DB calls are relatively expensive and CPU is cheap. If the OP is indexing more than 2^32 cameras or other products... I stand entirely corrected! – Dan Ancona May 11 '15 at 21:27
  • @DanAncona I think it's exactly the opposite, db calls are cheap and CPU expensive. A DB system is designed to sort results very fast, while python, being my favorite language, can be also pretty slow. You should always avoid manipulate data in python if you can do it on data base side. DB calls are cheap, you should just avoid repetitive and duplicated calls. – Guillaume Lebreton Feb 02 '22 at 15:48
2

I implemented Django raw queryset which supports filter(), order_by(), values() and values_list(). It will not work for any RAW query but for typical SELECT with some INNER JOIN or a LEFT JOIN it should work.

The FilteredRawQuerySet is implemented as a combination of Django model QuerySet and RawQuerySet, where the base (left part) of the SQL query is generated via RawQuerySet, while WHERE and ORDER BY directives are generared by QuerySet:

https://github.com/Dmitri-Sintsov/django-jinja-knockout/blob/master/django_jinja_knockout/query.py

It works with Django 1.8 .. 1.11.

It also has a ListQuerySet implementation for Prefetch object result lists of model instances as well, so these can be processed the same way as ordinary querysets.

Here is the example of usage:

https://github.com/Dmitri-Sintsov/djk-sample/search?l=Python&q=filteredrawqueryset&type=&utf8=%E2%9C%93

Dmitriy Sintsov
  • 3,821
  • 32
  • 20
0

Another thing you can do is that if you are unable to convert it to a regular QuerySet is to create a View in your database backend. It basically executes the query in the View when you access it. In Django, you would then create an unmanaged model to attach to the View. With that model, you can apply filter as if it were a regular model. With your foreign keys, you would set the on_delete arg to models.DO_NOTHING.

More information about unmanaged models: https://docs.djangoproject.com/en/2.0/ref/models/options/#managed

Bobort
  • 3,085
  • 32
  • 43