47

I want to update all rows in queryset by using annotated value.

I have a simple models:

class Relation(models.Model):
    rating = models.IntegerField(default=0)

class SignRelation(models.Model):
    relation = models.ForeignKey(Relation, related_name='sign_relations')
    rating = models.IntegerField(default=0)

And I want to awoid this code:

for relation in Relation.objects.annotate(total_rating=Sum('sign_relations__rating')):
    relation.rating = relation.total_rating or 0
    relation.save()

And do update in one SQL-request by using something like this:

Relation.objects.update(rating=Sum('sign_relations__rating'))

Doesn't work:

TypeError: int() argument must be a string or a number, not 'Sum'

or

Relation.objects.annotate(total_rating=Sum('sign_relations__rating')).update(rating=F('total_rating'))

Also doesn't work:

DatabaseError: missing FROM-clause entry for table "relations_signrelation"
LINE 1: UPDATE "relations_relation" SET "rating" = SUM("relations_si...

Is it possible to use Django's ORM for this purpose? There is no info about using update() and annotate() together in docs.

Tomasz Jakub Rup
  • 10,502
  • 7
  • 48
  • 49
ramusus
  • 7,789
  • 5
  • 38
  • 45
  • 1
    I'm not sure this even possible in pure SQL? When doing an UPDATE in SQL, I don't think it's possible to join other tables. – gerdemb Dec 08 '10 at 13:56
  • 1
    Yes, it's possible - via subqueries, ex. `UPDATE t1 SET a = (SELECT SUM(c) from t2 where t2.b=t1.b)`; – Ivan Klass Feb 20 '15 at 15:03
  • What are you really trying to do? Are you trying to sum up all the values of column 'rating' on the 'SigningRelation' table and then save it as a new row on the 'Relation' table? – phourxx Dec 09 '15 at 20:19

6 Answers6

96

For Django 1.11+ you can use Subquery:

from django.db.models import OuterRef, Subquery, Sum

Relation.objects.update(
    rating=Subquery(
        Relation.objects.filter(
            id=OuterRef('id')
        ).annotate(
            total_rating=Sum('sign_relations__rating')
        ).values('total_rating')[:1]
    )
)

This code produce the same SQL code proposed by Tomasz Jakub Rup but with no use of RawSQL expression. The Django documentation warns against the use of RawSQL due to the possibility of SQL injection).

Update

I published an article based on this answer with more in-depth explanations: Updating a Django queryset with annotation and subquery on paulox.net

Kye
  • 4,279
  • 3
  • 21
  • 49
Paolo Melchiorre
  • 5,716
  • 1
  • 33
  • 52
  • 5
    Wow, `update` with a `Subquery` is super cool, thanks for pointing out this pattern! – RishiG Jul 31 '18 at 20:19
  • I'm happy you've found my answer useful – Paolo Melchiorre Aug 01 '18 at 06:16
  • 4
    I'd suggest using `.values_list('total_rating', flat=True)` in order to extract the value directly instead of using the `[:1]` which I didn't understand at first. – FMCorz Oct 02 '18 at 14:31
  • Thanks for your feedback. I used slicing as suggested in the official Django documentation: https://docs.djangoproject.com/en/2.1/ref/models/expressions/#limiting-the-subquery-to-a-single-row – Paolo Melchiorre Oct 02 '18 at 14:35
  • 1
    This is a wonderful snippet but I passed it through the explain and the query plan seemed slow on a pretty big table. Another option is to use the ORM to generate the subqery SQL, create a temp table with the data, and then write a manual update query. In my case it took a multi-minute update down to 1 minute. – Vinay Anantharaman Oct 24 '18 at 05:19
  • 1
    I noted this on your blog as well, there is a new package https://pypi.org/project/django-sql-utils/ that lets you do the subquery easier `Relation.objects.update(rating=SubqueryAvg('sign_relations__rating'))` – Brad Martsberger May 24 '19 at 21:01
  • How would you do it if you instead wanted to perform some kind of arithmetic on the value of rating before setting it? For example, if you wanted to set rating to (total_rating * 2) + 10 – archer Mar 30 '21 at 13:14
  • @Paolo Melchiorre No need to make subquery a join: ` Relation.objects.update( rating=SignRelation.objects.filter(relation=OuterRef('id')).annotate(total_rating=Sum('rating')).values('total_rating')[:1]) ` – Alex K Jul 06 '22 at 07:05
  • @AlexK it's true. I tested the update without the Subquery expression in Django 3.2 and 4.0 . Do you know if this is true also for older version of Django ? – Paolo Melchiorre Jul 06 '22 at 11:00
  • @PaoloMelchiorre First of all thanks for the share of knowledge. I've tested without the "subquery" in Django 1.11.29 and get an error: `ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.` By the way, do you advise any books on this topic? – mseromenho Apr 19 '23 at 12:28
7

UPDATE statement doesn't support GROUP BY. See e.g. PostgreSQL Docs, SQLite Docs.

You need someting like this:

UPDATE relation
SET rating = (SELECT SUM(rating)
              FROM sign_relation
              WHERE relation_id = relation.id)

Equivalent in DjangoORM:

from django.db.models.expressions import RawSQL

Relation.objects.all(). \
    update(rating=RawSQL('SELECT SUM(rating) FROM signrelation WHERE relation_id = relation.id', []))

or:

from django.db.models import F, Sum
from django.db.models.expressions import RawSQL

Relation.objects.all(). \
    update(rating=RawSQL(SignRelation.objects. \
                         extra(where=['relation_id = relation.id']). \
                         values('relation'). \
                         annotate(sum_rating=Sum('rating')). \
                         values('sum_rating').query, []))
Tomasz Jakub Rup
  • 10,502
  • 7
  • 48
  • 49
  • Were you able to run your last recommendation? Doesn't `.filter(relation=F('relation__pk'))` equate to `WHERE (relation.id = relation.id)` and therefore not join anything? – jnns Aug 12 '16 at 22:27
  • Thanks for updating. That `.extra()` clause is gold! – jnns Aug 13 '16 at 21:36
-1

You can define your own custom objects manager:

class RelationManager(models.Manager):
    def annotated(self,*args,*kwargs):
         queryset = super(RelationManager,self).get_queryset()
         for obj in queryset:
               obj.rating = ... do something ...
         return queryset

class Relations(models.Model):
    rating = models.IntegerField(default=0)
    rating_objects = RelationManager()

Then in your code:

q = Realation.rating_objects.annotated()

Add args/kwargs to customise what this manager returns.

Aviah Laor
  • 3,620
  • 2
  • 22
  • 27
-1

Workaround for postgres:

with connection.cursor() as cursor:
    sql, params = qs.query.sql_with_params()
    cursor.execute("""
        WITH qs AS ({})
        UPDATE foo SET bar = qs.bar
        FROM qs WHERE qs.id = foo.id
    """.format(sql), params)
Khorne
  • 9
-4

If you want to avoid many calls to the database, you should use transaction.atomic.

Read more on Django documentation: https://docs.djangoproject.com/en/1.9/topics/db/transactions/#controlling-transactions-explicitly

  • 2
    `transaction.atomic` doesn't reduce the number of calls, it merely enforces that the whole transaction either succeeds completely or fails without modifying anything. – F.X. Aug 30 '16 at 09:09
-5

You really can't do this. Take a look at the code for update and follow it through for some fine reading.

Honestly, what's wrong with placing something like this in a Manager definition? Put those 3 lines you don't want to put in your view into a manager, call that manager as necessary. Additionally, you're doing much less "magic" and when the next developer looks at your code, they won't have to resort to a few WTF's .. :)

Also, I was curious and it looks like you can use SQL Join with UPDATE statements but it's some classic SQL hackery .. So if you're so inclined, you can use Djangos raw SQL functionality for that ;)

Bartek
  • 15,269
  • 2
  • 58
  • 65
  • 4
    "Honestly, what's wrong with placing something like this in a Manager definition?" That could be a *lot* of requests sent to the DB. He specifically requested "one SQL-request". Also, I think the examples he tried are easy to read and not like magic at all. – Conley Owens May 04 '11 at 01:49
  • 1
    Recently I had to write a SQL query manually because properly written it took 200ms to execute, while when using an update look I had to wait for 30 minutes. That's the reason why. – Xowap Feb 08 '15 at 02:02