1

You can query Django's JSONField, either by direct lookup, or by using annotations. Now I realize if you annotate a field, you can all sorts of complex queries, but for the very basic query, which one is actually the preferred method?

Example: Lets say I have model like so

class Document(models.Model):
    data = JSONField()

And then I store an object using the following command:

>>> Document.objects.create(data={'name': 'Foo', 'age': 24})

Now, the query I want is the most basic: Find all documents where data__name is 'Foo'. I can do this 2 ways, one using annotation, and one without, like so:

>>> from django.db.models.expressions import RawSQL
>>> Document.objects.filter(data__name='Foo')
>>> Document.objects.annotate(name = RawSQL("(data->>'name')::text", [])).filter(name='Foo')

So what exactly is the difference? And if I can make basic queries, why do I need to annotate? Provided of course I am not going to make complex queries.

darkhorse
  • 8,192
  • 21
  • 72
  • 148

2 Answers2

2

There is no reason whatsoever to use raw SQL for queries where you can use ORM syntax. For someone who is conversant in SQL but less experienced with Django's ORM, RawSQL might provide an easier path to a certain result than the ORM, which has its own learning curve.

There might be more complex queries where the ORM runs into problems or where it might not give you the exact SQL query that you need. It is in these cases that RawSQL comes in handy – although the ORM is getting more feature-complete with every iteration, with

  • Cast (since 1.10),
  • Window functions (since 2.0),
  • a constantly growing array of wrappers for database functions
  • the ability to define custom wrappers for database functions with Func expressions (since 1.8) etc.
Endre Both
  • 5,540
  • 1
  • 26
  • 31
  • 1
    This does not answer the question, and there are absolutely reasons to use `RawSQL` in many cases. – darkhorse Mar 17 '19 at 20:12
  • I tried to answer in the first sentence, sorry if I wasn't clear: There is *no reason whatsoever to use RawSQL in your example*. As to the necessity of using RawSQL in more complex cases, I did acknowledge this in my second paragraph. – Endre Both Mar 17 '19 at 20:15
  • Fair enough I guess, however I was hoping for some lower level differences between the 2 query methods. – darkhorse Mar 17 '19 at 20:16
  • You can easily inspect the 'lower level' by [asking Django for the SQL of a queryset](https://stackoverflow.com/questions/54780414/django-1-11-20-get-count-of-rows-per-iso-week-and-year#answer-54781620) or by [displaying the last executed SQL query](https://stackoverflow.com/questions/51362648/print-sql-queries-in-jupyter-notebook-with-django-extensions-plugin/54632855#answer-54632855). In your example, there are only minor syntactical differences between the two queries. – Endre Both Mar 17 '19 at 20:25
1

They are interchangable so it's matter of taste. I think Document.objects.filter(data__name='Foo') is better because:

  • It's easier to read
  • In the future, MariaDB or MySql can support JSON fields and your code will be able to run on both PostgreSQL and MariaDB.
  • Don't use RawSQL as a general rule. You can create security holes in your app.
jozo
  • 4,232
  • 1
  • 27
  • 29