12

The documentation for Django 2.2, which I'm using, gives the following example usage for select_for_update:

from django.db import transaction

entries = Entry.objects.select_for_update().filter(author=request.user)
with transaction.atomic():
    for entry in entries:
        ...

Using this approach, one would presumably mutate the model instances assigned to entry and call save on these.

There are cases where I'd prefer the alternative approach below, but I'm unsure whether it would work (or even make sense) with select_for_update.

with transaction.atomic():
    Entry.objects.select_for_update().filter(author=request.user).update(foo="bar", wobble="wibble")

The documentation states that the lock is created when the queryset is evaluated, so I doubt the update method would work. As far as I'm aware update just performs an UPDATE ... WHERE query, with no SELECT before it. However, I would appreciate it if someone more experienced with this aspect of the Django ORM could confirm this.

A secondary question is whether a lock even adds any protection against race conditions if one makes a single UPDATE query against the locked rows. (I've entered this train of thought because I'm refactoring code that uses a lock when updating the values of two columns of a single row.)

Super Kai - Kazuya Ito
  • 22,221
  • 10
  • 124
  • 129
Simon Crowe
  • 301
  • 3
  • 14
  • Where you'd prefer a different approach, have you considered [bulk_update](https://docs.djangoproject.com/en/3.2/ref/models/querysets/#bulk-update)? – markwalker_ Jul 27 '21 at 13:35
  • @markwalker_ thanks, that's a good suggestion., I imagine `bulk_update` would probably be more efficient as it would only use one UPDATE query rather than one per row. – Simon Crowe Jul 27 '21 at 14:10
  • The use case I was thinking of ways to update a single row and lock it using select_for_update. At the time I wrote the question I didn't realize that `qs.get(foo="bar")` would cause a Queryset based on`qs` to be evaluated, meaning that at least I wouldn't have to iterate through a QuerySet containing a single result. – Simon Crowe Jul 27 '21 at 14:13

2 Answers2

11

As far as I'm aware update just performs an UPDATE ... WHERE query, with no SELECT before it

Yes, that's correct. You could confirm this by looking at the actual queries made. Using the canonical django tutorial "polls" app as an example:

with transaction.atomic():
    qs = polls.models.Question.objects.select_for_update().all()
    qs.update(question_text='test')

print(connection.queries)
# {'sql': 'UPDATE "polls_question" SET "question_text" = \'test\'', 'time': '0.008'}

So, as you expect, there is no SELECT.

Though, ensuring the lock is acquired would be as simple as doing anything to cause the queryset to be evaluated.

with transaction.atomic():
    qs = polls.models.Question.objects.select_for_update().all()
    list(qs) # cause evaluation, locking the selected rows
    qs.update(question_text='test')

print(connection.queries)
#[...
# {'sql': 'SELECT "polls_question"."id", "polls_question"."question_text", "polls_question"."pub_date" FROM "polls_question" FOR UPDATE', 'time': '0.003'},
# {'sql': 'UPDATE "polls_question" SET "question_text" = \'test\'', 'time': '0.001'}
#]

A secondary question is whether a lock even adds any protection against race conditions if one makes a single UPDATE query against the locked rows

In general, yes. Whether it is necessary in a particular situation depends what kind of race condition you're worried about. The lock will prevent race conditions where another transaction may try to update the same row, for example.

Race conditions can be avoided without locks, too, depending on the nature of the update/race condition. Sometimes a transaction is sufficient, sometimes it's not. You may also use expressions which are evaluated server-side on the db to prevent race conditions (e.g. using Django's F() expressions).

There are also other considerations, like your db dialect, isolation levels, and more.

Additional reference on race condition thoughts: PostgreSQL anti-patterns: read-modify-write cycles (archive)

sytech
  • 29,298
  • 3
  • 45
  • 86
  • 2
    Thanks for the detailed answer. For some reason, it didn't occur to me to look at the SQL emitted by Django. – Simon Crowe Jul 27 '21 at 08:35
  • But with a `.filter` there may be a sub-select that does have `for update`. – Andrew Jun 09 '22 at 00:29
  • @Andrew I'm not sure that's quite true. I'm also not sure how that impacts this question/answer. Maybe you could elaborate or address this in a question of your own. – sytech Jun 09 '22 at 06:35
  • I think if the `.filter` involves related models the query will have a sub-select (with a join) and that sub-select will get `for update` – Andrew Jun 09 '22 at 13:28
1

To run SELECT FOR UPDATE, you need to use select_for_update() and update() separately then put print(qs) between them as shown below. *You can also put bool(qs), len(qs) or list(qs) instead of print(qs):

qs = Entry.objects.select_for_update().filter(author=request.user)
print(qs) # bool(qs), len(qs) or list(qs) is also fine
qs.update(foo="bar", wobble="wibble")

You can see my question and answer explaining more about select_for_update() in Django:

Super Kai - Kazuya Ito
  • 22,221
  • 10
  • 124
  • 129