
Let's say we have an inventory system that tracks the available number of products in a shop (quantity). So we can have something similar to this:

Id   Name     Quantity
1    Laptop   10

We need to think about two things here:

  1. Be sure that Quantity is never negative
  2. If we have simultaneous requests for a product we must ensure valid Quantity.

In other words, we can have:

  • request1 for 5 laptops (this request will be processed on thread1)
  • request2 for 1 laptop (this request will be processed on thread2)

When both requests are processed, the database should contain

Id   Name     Quantity
1    Laptop   4

However, that might not be the case, depending on how we write our code. If on our server we have something similar to this:

var product = _database.GetProduct();           // read the current quantity
if (product.Quantity - requestedQuantity >= 0)  // check the precondition in memory
{
   product.Quantity -= requestedQuantity;       // update the in-memory copy
   _database.Save();                            // write the new quantity back
}

With this code, it's possible that both requests (that are executed on separate threads) would hit the first line of the code at the exact same time.

  1. thread1: _database.GetProduct(); // Quantity is 10
  2. thread2: _database.GetProduct(); // Quantity is 10
  3. thread1: product.Quantity = 10 - 5 = 5
  4. thread2: product.Quantity = 10 - 1 = 9
  5. thread1: _database.Save(); // Quantity is 5
  6. thread2: _database.Save(); // Quantity is 9

What has just happened? We have sold 6 laptops, but we reduced the inventory by just one.

How to approach this problem?

To ensure that the quantity never goes negative, we can use a database constraint such as CHECK (to imitate an unsigned int).
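
For illustration, with EF Core and PostgreSQL such a constraint can be declared in the model configuration. This is only a minimal sketch: the Product and ShopContext types and the constraint name are made up for the example, and the exact HasCheckConstraint call site differs between EF Core versions.

using Microsoft.EntityFrameworkCore;

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public int Quantity { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Product> Products => Set<Product>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // EF Core 7+ style; older versions expose HasCheckConstraint directly on the entity builder.
        // The CHECK constraint makes the database itself reject any write that would
        // drive Quantity below zero, no matter which server instance issued it.
        modelBuilder.Entity<Product>()
            .ToTable(t => t.HasCheckConstraint("CK_Product_Quantity_NonNegative", "\"Quantity\" >= 0"));
    }
}

A violated constraint surfaces as a DbUpdateException that still has to be handled, but the invariant itself can no longer be broken. Note that this only rejects the bad write; it does not make the read-check-write sequence atomic.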

To deal with the race condition we usually use a lock or similar techniques. Depending on the case, that might work if we have a single server instance... But what should we do when we have multiple instances of the server, each running in a multithreaded environment?

It seems to me that the moment you have more than one web server, your only reasonable option for locking is the database. Why do I say 'reasonable'? Because there is also Mutex.

A lock allows only one thread to enter the part that's locked and the lock is not shared with any other processes.

A mutex is the same as a lock but it can be system-wide (shared by multiple processes).
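
A minimal sketch of that difference in C# (the mutex name here is an arbitrary assumption); note that even a named Mutex is only shared between processes on the same machine:

using System.Threading;

public class InventoryGuard
{
    // In-process lock: only threads inside this one process compete for it.
    private static readonly object _sync = new object();

    public void ReserveWithLock()
    {
        lock (_sync)
        {
            // critical section, invisible to every other process
        }
    }

    // Named mutex: shared by all processes on this machine that open the same name,
    // but it still cannot coordinate servers running on different machines.
    public void ReserveWithMutex()
    {
        using var mutex = new Mutex(initiallyOwned: false, name: "Global\\inventory-product-1");
        mutex.WaitOne();
        try
        {
            // critical section, shared across processes on this machine
        }
        finally
        {
            mutex.ReleaseMutex();
        }
    }
}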

Now... this is my personal opinion, but I expect that managing a Mutex between several processes in a microservice-oriented world, where a new server instance can spin up or an existing one can die at any second, is tricky and messy (is there a GitHub example of this?).

How to solve the problem then?

  1. Stored procedure - offload the responsibility to the database. Write a new stored procedure and wrap the whole logic in a transaction. Each server will call this SP and we don't need to worry about anything. But this might be slow?
  2. SELECT ... FOR UPDATE - I saw this while investigating the problem. With this approach we still solve the problem at the database level (see the sketch after this list).
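
As a rough illustration of option 2, a pessimistic reservation can be sketched with EF Core and the Npgsql provider along these lines. The ReservationService and TryReserveAsync names, as well as the Products schema, are assumptions for the example, not a drop-in implementation:

using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class ReservationService
{
    private readonly ShopContext _db;
    public ReservationService(ShopContext db) => _db = db;

    public async Task<bool> TryReserveAsync(int productId, int requestedQuantity)
    {
        // FOR UPDATE locks the selected row until the transaction ends, so a concurrent
        // request for the same product blocks here instead of reading a stale quantity.
        await using var tx = await _db.Database.BeginTransactionAsync();

        var product = (await _db.Products
            .FromSqlInterpolated($@"SELECT * FROM ""Products"" WHERE ""Id"" = {productId} FOR UPDATE")
            .ToListAsync())
            .Single();

        if (product.Quantity < requestedQuantity)
        {
            return false;                       // not enough stock; rollback on dispose
        }

        product.Quantity -= requestedQuantity;
        await _db.SaveChangesAsync();
        await tx.CommitAsync();
        return true;
    }
}

The row lock is released when the transaction commits or rolls back, so from the point of view of other requests the check and the write behave as one atomic step.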

Taking into account all of the above, what should be the best approach to solve this problem? Is there any other solution I am missing? What would you suggest?

I am working in .NET and using EF Core with PostgreSQL, but I think that this is really a language-agnostic question and that the principle for solving the issue is similar in all environments (and for many relational databases).

gzsun
  • I used SELECT FOR UPDATE and it worked like magic earlier. I put it in a transaction but I did not have to. – Assil Nov 14 '21 at 12:26
  • Related question: https://stackoverflow.com/q/10935850/217324 – Nathan Hughes Nov 14 '21 at 12:56
  • I am tempted to vote to close this question just because you spent more than half of your wall of text explaining what "race condition" means. Anybody who is qualified to help you already knows what "race condition" means. Not closing, because I'm not a database expert. But I have a "dumb question" of my own. The database may be distributed, but your code example is written as if the application has a single connection to it. So why can't you just wrap the whole example in a transaction and then re-try until either (A) the transaction succeeds, or (B) someone else snarfs up all the laptops? – Solomon Slow Nov 14 '21 at 13:10
  • P.S., While you're waiting for an answer. You might try Googling for "Distributed system" and "consistency." – Solomon Slow Nov 14 '21 at 13:12
  • @NathanHughes I know that even without defining a transaction explicitly, the database is going to use an implicit transaction to wrap an arbitrary statement. I guess that your comment is related to the suggestion about stored procedure, but the idea there was to direct all requests to one central place (not to say that sometimes we can avoid the transaction). – gzsun Nov 14 '21 at 14:30
  • @SolomonSlow I explained what a race condition is in one sentence, but I agree that it wasn't necessary. The rest of the text walks through the example so that the use case is easier to understand; otherwise, just the title would be sufficient. About your question: I can wrap the whole example in a transaction, but that just means that when the scenario described above completes (request1 for 5 laptops and request2 for 1 laptop), according to the DB we will have 9 available laptops (not 4 as it should be). That is not valid, and a transaction alone can't detect such incorrect behavior. – gzsun Nov 14 '21 at 14:40
  • Sorry to be unclear, I was referring to the first comment, not to anything in the post. Also, the scope of the question could be very broad; is it limited to applications using relational databases? – Nathan Hughes Nov 14 '21 at 14:58
  • @NathanHughes Oh, I didn't initially get that you were referring to the first comment. Yes, we can limit the question to relational databases. – gzsun Nov 14 '21 at 17:26
  • I'm not a database guy, but what would be the point of transactions if they did not prevent _exactly_ what you are trying to prevent? I thought that opening a transaction makes a _snapshot_ of the database at a single instant in time. When you interact with the "database," you actually interact with the snapshot. Then when you try to commit, either all your changes effectively happen in a single instant, or the commit fails because somebody else committed before you, and their commit affected records (including records that you only _fetched_) that your commit depended on. Am I wrong? – Solomon Slow Nov 14 '21 at 17:38
  • Does this answer your question? [How do ACID and database transactions work?](https://stackoverflow.com/questions/3740280/how-do-acid-and-database-transactions-work) – pilcrow Nov 14 '21 at 18:44
  • @SolomonSlow A transaction doesn't fail just because somebody else committed before you. A transaction will fail if, for example, some of your changes within it don't satisfy the constraints of the database. In the example from the original question we can have 2 transactions (one for request1 and one for request2), but both of them will successfully commit changes to the DB (and those changes will lead to an invalid state due to the race condition). – gzsun Nov 14 '21 at 20:43
  • @pilcrow Not really, because transactions do not inherently prevent race conditions. But while I was investigating something from your link, I think I figured out a third possible solution to the problem. We can set the isolation level of a transaction to SERIALIZABLE. This level emulates serial transaction execution for all committed transactions, as if transactions had been executed one after another, serially, rather than concurrently. However, applications using this level must be prepared to retry transactions due to serialization failures, and I guess that can be tricky. – gzsun Nov 15 '21 at 05:44
  • Here is a really nice demo about SERIALIZABLE isolation level: https://pgdash.io/blog/postgres-transactions.html – gzsun Nov 15 '21 at 05:45
  • 1
    @SolomonSlow What you described in your last comment can actually be achieved with SERIALIZABLE isolation level. However, no database (that I know) uses this isolation level by default and even if you enable it manually it comes with some consequences. I would still like to hear opinions from people that have had a chance to use something like this in real-world applications. It seems that now I have 3 different ways to solve the problem, but still not sure is there any "smarter" or "easier" way to deal with this. – gzsun Nov 15 '21 at 05:56

1 Answer


After reading the majority of the comments let's assume that you need a solution for a relational database.

The main thing that you need to guarantee is that the write operation at the end of your code only happens if the precondition is still valid (e.g. product.Quantity - requestedQuantity >= 0).

This precondition is evaluated on the application side, in memory. But the application only sees a snapshot of the data from the moment the database read happened: _database.GetProduct(). That snapshot can become obsolete as soon as someone else updates the same data. If you want to avoid SERIALIZABLE as a transaction isolation level (which has performance implications anyway), the application should detect at the moment of writing whether the precondition is still valid; or, said differently, whether the data is still unchanged from when it was read.

This can be done by using offline concurrency patterns: Either an optimistic offline lock or a pessimistic offline lock. Many ORM frameworks support these features by default.
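
With EF Core, for example, the optimistic variant boils down to marking a column as a concurrency token and retrying on conflict. The sketch below is illustrative only: it reuses the hypothetical ShopContext/Product model sketched earlier and assumes Quantity has been configured with .IsConcurrencyToken() (with the Npgsql provider, the xmin system column can serve the same purpose).

using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public class InventoryService
{
    private readonly ShopContext _db;
    public InventoryService(ShopContext db) => _db = db;

    public async Task<bool> TryReserveAsync(int productId, int requestedQuantity)
    {
        // Because Quantity is a concurrency token, the UPDATE generated by SaveChanges
        // adds "AND Quantity = <originally read value>" to its WHERE clause. If another
        // request changed the row in the meantime, zero rows match, EF Core throws
        // DbUpdateConcurrencyException, and we re-read and try again.
        for (var attempt = 0; attempt < 3; attempt++)
        {
            var product = await _db.Products.SingleAsync(p => p.Id == productId);
            if (product.Quantity < requestedQuantity)
            {
                return false;                   // not enough stock
            }

            product.Quantity -= requestedQuantity;
            try
            {
                await _db.SaveChangesAsync();
                return true;
            }
            catch (DbUpdateConcurrencyException)
            {
                _db.ChangeTracker.Clear();      // drop the stale entity before retrying
            }
        }
        return false;                           // gave up after repeated conflicts
    }
}

The pessimistic variant is essentially the SELECT ... FOR UPDATE approach already mentioned in the question: take the row lock up front and hold it until the transaction commits.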

Oresztesz
  • I really needed to become familiar with these terms and I have definitely learned something new. Thank you. – gzsun Nov 16 '21 at 21:56
  • Great to hear! You're welcome. In the book "Patterns of Enterprise Application Architecture", Martin Fowler even explains the difference between implementing a pessimistic offline lock and using SELECT FOR UPDATE. The book is a bit old, but many things in it are still valid. – Oresztesz Nov 17 '21 at 06:23