Let's say we have an inventory system that tracks the available number of products in a shop (quantity). So we can have something similar to this:
Id | Name | Quantity |
---|---|---|
1 | Laptop | 10 |
We need to think about two things here:
- Be sure that `Quantity` is never negative.
- If we have simultaneous requests for a product, we must ensure that `Quantity` stays valid.
In other words, we can have:
- `request1` for 5 laptops (this request will be processed on `thread1`)
- `request2` for 1 laptop (this request will be processed on `thread2`)

When both requests are processed, the database should contain:
Id | Name | Quantity |
---|---|---|
1 | Laptop | 4 |
However, that might not be the case, depending on how we write our code. If on our server we have something similar to this:
```csharp
var product = _database.GetProduct();
// Check-then-act: another thread can run between this read and the save below.
if (product.Quantity - requestedQuantity >= 0)
{
    product.Quantity -= requestedQuantity;
    _database.Save();
}
```
With this code, it's possible that both requests (that are executed on separate threads) would hit the first line of the code at the exact same time.
- `thread1`: `_database.GetProduct(); // Quantity is 10`
- `thread2`: `_database.GetProduct(); // Quantity is 10`
- `thread1`: `product.Quantity = 10 - 5 = 5`
- `thread2`: `product.Quantity = 10 - 1 = 9`
- `thread1`: `_database.Save(); // Quantity is 5`
- `thread2`: `_database.Save(); // Quantity is 9`
What has just happened? We have sold 6 laptops, but the inventory went down by only one: the second save overwrote the first.
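To make the interleaving above easier to follow, here is a minimal single-threaded sketch that replays the same steps in the same order. `FakeDatabase`, `ProductRow` and `LostUpdateDemo` are hypothetical names made up for illustration; the fake simply hands out a private copy on every read, which is my assumption about how each request sees its own snapshot.

```csharp
using System;

public sealed class ProductRow
{
    public int Quantity { get; set; }
}

// Hypothetical stand-in for the database: every read returns a private copy,
// so each "thread" works on its own snapshot of the row.
public sealed class FakeDatabase
{
    private int _storedQuantity = 10;

    public int StoredQuantity => _storedQuantity;

    public ProductRow GetProduct() => new ProductRow { Quantity = _storedQuantity };

    // Blindly overwrites whatever is stored with the caller's snapshot.
    public void Save(ProductRow product) => _storedQuantity = product.Quantity;
}

public static class LostUpdateDemo
{
    // Replays the interleaving step by step and returns the final stored quantity.
    public static int Run()
    {
        var database = new FakeDatabase();

        var seenByThread1 = database.GetProduct(); // thread1 reads 10
        var seenByThread2 = database.GetProduct(); // thread2 reads 10

        seenByThread1.Quantity -= 5;               // thread1 computes 5
        seenByThread2.Quantity -= 1;               // thread2 computes 9

        database.Save(seenByThread1);              // stored quantity becomes 5
        database.Save(seenByThread2);              // stored quantity becomes 9, thread1's update is lost

        return database.StoredQuantity;
    }
}
```

Calling `LostUpdateDemo.Run()` returns 9 instead of the expected 4, because both snapshots were taken before either save happened.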
How to approach this problem?
To keep the quantity non-negative, we can use a DB constraint such as a `CHECK` constraint (to imitate an unsigned int).
To deal with the race condition, we usually use `lock` and similar techniques.
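For the single-process case, this is roughly what I mean by using `lock`. It is only a sketch: `LockedInventory` and `LockDemo` are hypothetical names, and the quantity lives in memory here instead of in the database, so it only shows how the check and the decrement become one atomic step inside one process.

```csharp
using System;
using System.Threading.Tasks;

public sealed class LockedInventory
{
    private readonly object _gate = new object();
    private int _quantity;

    public LockedInventory(int initialQuantity) => _quantity = initialQuantity;

    public int Quantity
    {
        get { lock (_gate) { return _quantity; } }
    }

    // The check and the decrement happen under the same lock,
    // so no other thread can slip in between them.
    public bool TryReserve(int requestedQuantity)
    {
        lock (_gate)
        {
            if (_quantity - requestedQuantity < 0)
            {
                return false; // not enough stock, nothing changes
            }

            _quantity -= requestedQuantity;
            return true;
        }
    }
}

public static class LockDemo
{
    // Runs the two requests from the example in parallel and returns what is left.
    public static int Run()
    {
        var inventory = new LockedInventory(10);
        Parallel.Invoke(
            () => inventory.TryReserve(5),  // request1
            () => inventory.TryReserve(1)); // request2
        return inventory.Quantity;
    }
}
```

Whichever request gets the lock first, `LockDemo.Run()` ends with 4 laptops left. The lock object lives inside one process, though, so a second server instance would have its own lock and the race would come back.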
Depending on the case, that might work if we have a single instance of the server. But what should we do when we have multiple instances of the server, each running in a multithreaded environment?
It seems to me that the moment you have more than one web server, your only reasonable option for locking is the database. Why do I say reasonable? Because we also have `Mutex`.

A `lock` allows only one thread at a time to enter the locked section, and the lock is not shared with any other process.

A `Mutex` is the same as a lock, but it can be system-wide (shared by multiple processes on the same machine).
Now, this is my personal opinion, but I expect that managing a `Mutex` between a few processes is tricky and messy in a microservice-oriented world, where a new instance of the server can spin up or an existing one can die at any second (is there an example on GitHub?).
How to solve the problem then?
- Stored procedure: offload the responsibility to the database. Write a new stored procedure and wrap the whole logic in a transaction. Each server calls this stored procedure, and we don't need to worry about anything else. But might this be slow?
- `SELECT ... FOR UPDATE`: I came across this while investigating the problem. With this approach, we still solve the problem at the database level.
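As far as I understand the `SELECT ... FOR UPDATE` option, it would look something like the sketch below. I wrote it against the provider-neutral `System.Data.Common` types, so it is not tied to any particular driver. The table name `"Products"`, the column names, the `@name` parameter style and the `InventorySql` class are all my assumptions, and the connection is assumed to be already open.

```csharp
using System;
using System.Data.Common;
using System.Threading.Tasks;

public static class InventorySql
{
    // Table and column names are assumptions; adjust them to the real schema.
    public const string LockRow =
        "SELECT \"Quantity\" FROM \"Products\" WHERE \"Id\" = @id FOR UPDATE";

    public const string Decrement =
        "UPDATE \"Products\" SET \"Quantity\" = \"Quantity\" - @requested WHERE \"Id\" = @id";

    // Returns true if the stock was reserved, false if the product is missing
    // or there is not enough of it.
    public static async Task<bool> TryReserveAsync(
        DbConnection openConnection, int productId, int requested)
    {
        await using DbTransaction tx = await openConnection.BeginTransactionAsync();

        // FOR UPDATE locks the selected row until this transaction ends, so a
        // concurrent transaction running the same statement has to wait here.
        await using DbCommand select = openConnection.CreateCommand();
        select.Transaction = tx;
        select.CommandText = LockRow;
        AddParameter(select, "id", productId);

        var current = await select.ExecuteScalarAsync();
        if (current is null || Convert.ToInt32(current) < requested)
        {
            await tx.RollbackAsync();
            return false;
        }

        await using DbCommand update = openConnection.CreateCommand();
        update.Transaction = tx;
        update.CommandText = Decrement;
        AddParameter(update, "id", productId);
        AddParameter(update, "requested", requested);
        await update.ExecuteNonQueryAsync();

        await tx.CommitAsync(); // releases the row lock
        return true;
    }

    private static void AddParameter(DbCommand command, string name, object value)
    {
        DbParameter parameter = command.CreateParameter();
        parameter.ParameterName = name;
        parameter.Value = value;
        command.Parameters.Add(parameter);
    }
}
```

Because the lock lives in the database, it should not matter which server instance or thread calls `TryReserveAsync`: the second caller only reads `Quantity` after the first one has committed or rolled back.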
Taking all of the above into account, what would be the best approach to solve this problem? Is there any other solution I am missing? What would you suggest?
I am working in .NET and using EF Core with PostgreSQL, but I think this is really a language-agnostic question and that the principle for solving the issue is similar in all environments (and similar for many relational databases).