4

The Django docs for select_for_update say

Using select_for_update() on backends which do not support SELECT ... FOR UPDATE (such as SQLite) will have no effect. SELECT ... FOR UPDATE will not be added to the query, and an error isn’t raised if select_for_update() is used in autocommit mode.

This strikes me as an odd and potentially dangerous decision, especially since select_for_update is used to lock rows. If I write code that uses select_for_update, I would rely on it actually being honored! If the DB backend doesn't support it, I would expect Django to either fall back to a safe-but-less-efficient alternative or, if one doesn't exist, to throw an exception of some kind.

In this case it seems like Django could suddenly and silently reintroduce race conditions by just ignoring select_for_update on DBs where it's not supported (such as SQLite). My intuition says Django wouldn't do that and there must be some reason why it's not needed if not supported (perhaps engines where it's not supported use complete database locking?) but I can't seem to find anything concrete in the docs to back up that theory. It doesn't seem like this issue is necessarily specific to Django, either.

This is making me very leery of using select_for_update even though it would solve some current problems nicely.

Xcelled
  • 2,084
  • 3
  • 22
  • 40
  • 1
    No on here is likely to be able to answer design decisions made by the Django team. You might be better off raising it is a bug (if you consider it one) or trying to talk to them directly. – Shadow Oct 26 '17 at 04:21
  • @Shadow My hope with this question is that if no one can answer the Django part, maybe they can at least answer the SQL DB part. Based on that I could follow up with either a bug or a documentation request to Django. – Xcelled Oct 26 '17 at 04:26

2 Answers2

4

With database engines that allow to reduce transaction isolation in order to improve the speed of concurrent accesses (e.g., PostgreSQL, Oracle, and MySQL), SELECT FOR UPDATE is used to tell the database that rows that are read now will likely be written to later. This can avoid inconsistent data showing up in concurrent transactions, or even prevent deadlocks in some situations.

In SQLite, all transactions are serializable, i.e., it behaves as if the entire database is locked around each transaction. (And in autocommit mode, each statement is wrapped in an implicit transaction.)

So SELECT FOR UPDATE, even if it were implemented, would not actually add any more locking than is already there. Ignoring it is the right thing to do for SQLite.

CL.
  • 173,858
  • 17
  • 217
  • 259
  • 1
    That's what I thought might be happening. If, in a more general sense, all databases that don't support SELECT FOR UPDATE automatically apply coarser locks, then it makes sense for Django to ignore it if not supported, as the transaction is locked at a higher level. It'd be nice of they docs said that, though... – Xcelled Oct 26 '17 at 14:39
  • 1
    Pursuing that line of thinking led me to [this answer](https://stackoverflow.com/a/16425603/1163742) which gives a breakdown of how different databases deal with locks. Of note is the poster mentions explicitly that postgres, oracle, and mySQL are engines that employ a locking strategy that SELECT FOR UPDATE is applicable to. This list exactly matches the Django doc's list of backends that support `select_for_update`, so this is probably the reason. If you agree with my interpretation and update your answer to be a little more general, I'll accept it. – Xcelled Oct 26 '17 at 14:49
  • This page specifies that in SQLite the database is not locked around each transaction: https://www.sqlite.org/lockingv3.html#transaction_control. An exclusive lock is only achieved after doing a write in the transaction. – mhvis Feb 01 '19 at 16:46
  • @MaartenVisscher For serializable transactions, exclusive locks are needed only for writes. But reads still require a shared lock. – CL. Feb 01 '19 at 17:16
  • What I still don't see is how this mechanism protects against the example given in [this question](https://stackoverflow.com/q/10935850/2373688). – mhvis Feb 03 '19 at 16:53
  • 1
    @mhvis An exclusive lock is exclusive; it cannot be taken when a shared lock exists. – CL. Feb 03 '19 at 17:03
0

Think of it from this perspective.

You've just installed a library from github and want to try it out quickly using the default mysqlite settings - but you can't use it because they happen to use select_for_update in one spot. Would you rather the package worked (but with the possibility of a race condition) or explode in your face?

The Django team makes many compromises in order to get their product working in all of the different configurations it supports. Ideally, Django applications are supposed to be able to be dropped into any of them. If race-conditions become an issue for a developer, then it won't take a lot of research until they realise that they are using an unsuitable backend, and will have to switch to one that is more appropriate for the task at hand.

In other words, Django favours compatibility over functionality in this instance.

Django could suddenly and silently reintroduce by just ignoring select_for_update

It won't introduce them suddenly - changing database backends is not something one does accidentally - and in production environments it's often a non-trivial task. We're not rolling dice here. If a developer chooses to change databases then it is a good idea for them to research the pros and cons of that database - and test it.

Shadow
  • 8,749
  • 4
  • 47
  • 57
  • Flip it around, though. Yes the library works in your limited proof of concept, but when you use it in production with many users hitting your app, occasionally a race occurs and your data gets corrupted _without warning_. You didn't know to ensure your DB supports `select_for_update` because the library developer forgot and didn't put it in the readme. Now your **production** database is hosed... and you don't even know it! A Django app is nothing without its data, so doing the safe-but-inconvenient thing of _always_ blowing up is much preferred to occasionally and quietly corrupting data. – Xcelled Oct 26 '17 at 14:32
  • If anyone uses mysqlite in production - they kinda deserve it. And as I said - it's a trade off, they had to choose between reliability or compatibility. The safe-but-inconvenient thing may be your preference - but it isn't everyone's preference. I'm not going to debate with you about which is 'right', but I presented this answer because it seemed you hadn't considered both perspectives. – Shadow Oct 26 '17 at 22:04