2

I have the following minimal example showing my problem:

I have a Person table, which has an (auto-increment) id as primary key and some information about the Person.

I have a Widget table, which likewise has an (auto-increment) id is primary key and some non-unique other information.

I have a Person_Widget table that maps Persons to Widget. This is supposed to be a 1-to-N relation ship where N must be max-limited to 5. A person should have at most 5 widgets (and can also have zero).

How can I enforce this in a concurrency-safe way? If 2 transactions which intend to add Widgets to a Person both do a SELECT COUNT(...)... first to determine how many it might still insert, both can read 3 and determine "I can still insert two more widgets for this person", they might both insert two more, leading to a total of 7. This is a classic race condition.

Determining the current COUNT per person with a SELECT ... FOR UPDATE will not help because as determined in this thread, How do I lock on an InnoDB row that doesn't exist yet? , it is not reliable for locking non-existent rows, which will be a problem at least in the case where the Person has zero widgets before the two concurrent transactions start.

My solution: I add a num_widgets column to the person table. Before modifying Person_Widget entries for a person, a transaction must do SELECT num_widgets FROM Person WHERE Person.id=X FOR UPDATE;. The transaction will also keep this num_widges updated within the transaction so COUNT() isn't necessary anymore.

I.e. I am using the Person table as what I think is referred to as "semaphore table". Is this a bad practice? Coming from a non-database concurrency programming background, this seems a straightforward and intuitive solution, but I've hardly seen anyone talk about it in my very limited MYSQL experience.

Before determining this question to be a duplicate, please consider that the other similar questions with the problem of limiting a table to X rows per key do not consider concurrency and locking at all. It's a whole different matter when concurrecny is considered.

JMC
  • 1,723
  • 1
  • 11
  • 20
  • 1
    It would probably seem counter intuitive, and not normalised, but the *easiest* way to enforce this is to simply have 5 columns against the `Person`, for `Widget1`, `Widget2` etc. If the number of widgets allowed is going to vary frequently, or from person to person, then this may not be ideal, but if it is going to be largely static then I'd personally take the multiple column approach. – GarethD Jan 03 '23 at 10:40
  • 1
    @GarethD that sounds like a good idea. Unfortunately, in the real problem I have, N is much larger than 5. – JMC Jan 03 '23 at 11:18
  • 1
    This is NOT a duplicate, the other question doesn't even mention concurrency, and the solutions there are not concurrency-safe. The accepted answer there will not detect the inserts from the other transaction in a consistent-read environment because it's reading a snapshot from before that insert, and it will not rollback. – JMC Jan 03 '23 at 13:14
  • 1
    I am not hugely familiar with the locking mechanisms in MySQL, nor the nuances with triggers, but the problem piqued my interest, so for what it is worth [this is how I would approach the problem in SQL Server](https://dbfiddle.uk/gIk2_AG_) (which I am familiar with). I am letting a unique constraint, and a check constraint handle the data integrity, and giving each widget a sequential number (filling gaps left by deletions first). When this sequential numbers reaches the maximum allowed, any subsequent insert fails. Not sure if this translates to MySQL but it might be doable. – GarethD Jan 03 '23 at 14:39
  • 1
    There is of course an alternative viewpoint here, which is that if the maximum number of widgets per person is a business rule then the responsibility for enforcing that rule should lie within the business layer and not the data access layer, however to do this concurrently probably increases complexity and decreases efficiency for the sake of drawing an arbitrary line between business logic and data which already has a very fuzzy line between anyway. For that reason I personally don't buy into that viewpoint, but that doesn't make it any less valid, just means I don't buy into. – GarethD Jan 03 '23 at 14:54
  • 1
    @GarethD Thanks for the test code. I believe in a production setting I would agree with the alternative viewpoint of handling it in business logic. However, I just realised that in my OP solution, storing the num_widgets is completely independent from the problem and unneccessary. By just doing a `select for update` on the person row before counting and editing this Person's PersonWidgets entries, I am using the lowest number of locks (1 row), at no database storage overhead and only one single additional select query, that seems pretty optimal for MySQL+InnoDB. – JMC Jan 03 '23 at 15:52

0 Answers0