2

Recently I've found that Django ORM's Model.save() executes an SQL to update 'ALL' columns by default, even if nothing has been modified. This really concerns me, because any changes I made has a chance to be set back to the original value by some other Model.save() process.

For example, I have a model Order. And there are two concurrent processes (P1, P2) working on it. First of all, P1 selects a row:

# P1
order = Order.objects.get(pk=10000)

Then, P2 selects the same row and updates the status column: (the following statements can be wrapped in a transaction, or even a SERIALIZABLE one, but this can not solve the issue.)

# P2
order = Order.objects.get(pk=10000)
if order.status == UNPAID:
    order.status = PAID    # update the `status` column
    order.save()

After that, P1 updates some other trivial column:

# P1
order.xxx = xxx    # update some other trivial column
order.save()    # This would set the `status` back to UNPAID !!!

order.status would be set back to UNPAID, and that is NOT what I want.

I know I can use save(update_fields=...), select_for_update(), filter(...).update(...), SERIALIZABLE transaction, or explicit locking on P1 to prevent this issue. But the thing is: It's ridiculous to use them on all Model.save() statements in the whole project. Furthermore, even if I do in my project code, there's some other code doing this (Django Admin, ModelForm...).

Should I rewrite Model.save() to update only those modified fields?

(It seems like a severe problem to me. Or have I taken something wrong?)

prajnamort
  • 21
  • 1
  • It happens because `P1` has no idea that you changed `P2`'s `status` attribute. Why do you need 2 different references to the same `order`? – DeepSpace Jul 31 '16 at 12:24
  • @DeepSpace I would think `P1` and `P2` are different processes. – thebjorn Jul 31 '16 at 13:00
  • @DeepSpace Yes, they are different processes. For example, a staff is using Django Admin to modify that order (`P1`), and a customer is making a payment (`P2`). – prajnamort Jul 31 '16 at 13:07
  • 1
    So, this is a race condition, but you've already stated the various approaches for fixing this. – Daniel Roseman Jul 31 '16 at 13:16

2 Answers2

0

As other folks said in comments, you have classic type of race condition. There are some ways of dealing this in Django. You've already mention some of this, but they can't be considered safety in my experience. I highly suggest you to watch this video called Immutable Django and start digging in this direction.

Basically you can add some kind of version column in your model and increment it in every save and compare to existing one. If new version is same or less then already saved one - you trying to save unactual entry.

Community
  • 1
  • 1
valignatev
  • 6,020
  • 8
  • 37
  • 61
0

You only need one column for a similar race condition to happen:

# Process 1
account = Account.objects.get(pk=1)
account.balance += 1000

# Process 2
account = Account.objects.get(pk=1)
account.balance -= 10
account.save()

# Process 1
account.save()

Now the balance deduction from process 2 is lost. You must use select_for_update to ensure your writes are consistent. Even if you only save the "dirty" fields, and make that the default, this will still happen.

Koterpillar
  • 7,883
  • 2
  • 25
  • 41