39

What's the common way to deal with concurrent updates in an SQL database?

Consider a simple SQL schema (constraints and defaults not shown) like

create table credits (
  id int,
  creds int,
  user_id int
);

The intent is to store some kind of credits for a user, e.g. something like stackoverflow's reputation.

How to deal with concurrent updates to that table? A few options:

  • update credits set creds = 150 where userid = 1;

    In this case the application retrieved the current value, calculated the new value (150) and performed an update. Which spells disaster if someone else does the same at the same time. I'm guessing wrapping the retrieval of the current value and the update in a transaction would solve that, e.g. BEGIN; SELECT creds FROM credits WHERE userid = 1; do application logic to calculate the new value; UPDATE credits SET creds = 160 WHERE userid = 1; END; In this case you could check if the new credit would be < 0 and just truncate it to 0 if negative credits make no sense.

  • update credits set creds = creds - 150 where userid=1;

    In this case you wouldn't need to worry about concurrent updates, as the DB takes care of the consistency problem, but it has the flaw that creds would happily go negative, which might not make sense for some applications.
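The lost update in the first option can be reproduced deterministically by interleaving the reads by hand. A minimal sketch (using Python's sqlite3 in place of a real client/server DB; table layout and values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credits (id INTEGER, creds INTEGER, userid INTEGER)")
conn.execute("INSERT INTO credits VALUES (1, 100, 1)")
conn.commit()

def read_creds():
    return conn.execute("SELECT creds FROM credits WHERE userid = 1").fetchone()[0]

# Both "clients" read 100 before either one writes.
a = read_creds()  # client A sees 100
b = read_creds()  # client B sees 100

conn.execute("UPDATE credits SET creds = ? WHERE userid = 1", (a + 50,))  # A writes 150
conn.execute("UPDATE credits SET creds = ? WHERE userid = 1", (b + 50,))  # B writes 150

print(read_creds())  # 150, not the expected 200 -- B's write clobbered A's
```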

So simply, what's the accepted method to deal with the (quite simple) problem outlined above? And what if the DB throws an error?

Madara's Ghost
Leeeroy
  • If you are concerned about violating constraints on your columns, define CONSTRAINTS in the database. – jva Jul 29 '09 at 10:36

9 Answers

32

Use transactions:

BEGIN WORK;
SELECT creds FROM credits WHERE userid = 1;
-- do your work
UPDATE credits SET creds = 150 WHERE userid = 1;
COMMIT;

Some important notes:

  • Not all database engines support transactions. In particular, MySQL's old default storage engine (the default before version 5.5.5), MyISAM, doesn't. Use InnoDB (the new default) if you're on MySQL.
  • Transactions can abort due to reasons beyond your control. If this happens, your application must be prepared to start all over again, from the BEGIN WORK.
  • You'll need to set the isolation level to SERIALIZABLE, otherwise the first SELECT can read data that other transactions have not committed yet (transactions aren't like mutexes in programming languages). Some databases will throw an error if there are concurrent ongoing SERIALIZABLE transactions, and you'll have to restart the transaction.
  • Some DBMSs provide SELECT ... FOR UPDATE, which will lock the rows retrieved by the SELECT until the transaction ends.

Combining transactions with SQL stored procedures can make the latter part easier to deal with; the application would just call a single stored procedure in a transaction, and re-call it if the transaction aborts.
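The restart-from-BEGIN pattern can be sketched like this (a sketch using Python's sqlite3 as a stand-in; BEGIN IMMEDIATE, the retry count, and the function name are illustrative assumptions, not from the answer):

```python
import sqlite3

def add_creds(db_path, userid, amount, max_retries=5):
    """Read-modify-write inside a transaction, restarting if the DB aborts it."""
    conn = sqlite3.connect(db_path, timeout=1.0, isolation_level=None)
    try:
        for _ in range(max_retries):
            try:
                conn.execute("BEGIN IMMEDIATE")  # BEGIN WORK
                (creds,) = conn.execute(
                    "SELECT creds FROM credits WHERE userid = ?",
                    (userid,)).fetchone()
                new_creds = max(creds + amount, 0)  # application logic; clamp at 0
                conn.execute(
                    "UPDATE credits SET creds = ? WHERE userid = ?",
                    (new_creds, userid))
                conn.execute("COMMIT")
                return new_creds
            except sqlite3.OperationalError:
                # Aborted or busy: roll back (if anything started) and retry.
                try:
                    conn.execute("ROLLBACK")
                except sqlite3.OperationalError:
                    pass
        raise RuntimeError("transaction kept aborting; giving up")
    finally:
        conn.close()
```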

piojo
bdonlan
    Wouldn't you need a "select for update" in this case, or at least a SERIALIZABLE isolation level? – nos Jul 31 '09 at 23:49
    @nos, it depends on the database. A database with true transaction support should provide a consistent snapshot just with a transaction, although possibly not by default. In the case of innodb, a snapshot of the database state is made as soon as you do a select on any innodb table. – bdonlan Aug 01 '09 at 03:13
  • You may indeed need to set the isolation level though: http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html – bdonlan Aug 01 '09 at 03:14
  • This can cause deadlock if two transactions running in parallel try to update the same user. Both can get into a race where each has acquired a read lock on the same set of rows and is now waiting for the other to finish, because they want to convert it to an update lock. – Ankush May 09 '12 at 07:50
    @Ankush, most SQL servers will identify deadlocks and rollback one of the txns involved to resolve the deadlock. Depending on your SQL server you can also use "SELECT ... FOR UPDATE" or similar to grab a write lock from the start. – bdonlan May 10 '12 at 05:23
    @bdonlan True, SQL will identify the deadlock and pick a victim, but this hurts your overall product performance, especially if the above code lies on a critical path. – Ankush May 10 '12 at 10:43
26

For MySQL InnoDB tables, this really depends on the isolation level you set.

If you are using the default level 3 (REPEATABLE READ), then you would need to lock any row that affects subsequent writes, even if you are in a transaction. In your example you will need to:

SELECT creds FROM credits WHERE userid = 1 FOR UPDATE;
-- calculate --
UPDATE credits SET creds = 150 WHERE userid = 1;

If you are using level 4 (SERIALIZABLE), then a simple SELECT followed by update is sufficient. Level 4 in InnoDB is implemented by read-locking every row that you read.

SELECT creds FROM credits WHERE userid = 1;
-- calculate --
UPDATE credits SET creds = 150 WHERE userid = 1;

However, in this specific example, since the computation (adjusting credits) is simple enough to be done in SQL, a simple:

UPDATE credits SET creds = creds - 150 WHERE userid = 1;

will be equivalent to a SELECT FOR UPDATE followed by UPDATE.

Savant Degrees
17

Wrapping the code inside a transaction is not enough in some cases, regardless of the isolation level you define (e.g. imagine you have deployed your code onto 2 different servers in production).

Let's say you have these steps and 2 concurrent threads:

1) open a transaction
2) fetch the data (SELECT creds FROM credits WHERE userid = 1;)
3) do your work (credits + amount)
4) update the data (UPDATE credits SET creds = ? WHERE userid = 1;)
5) commit

And this time line:

Time =  0; creds = 100
Time =  1; ThreadA executes (1) and creates Txn1
Time =  2; ThreadB executes (1) and creates Txn2
Time =  3; ThreadA executes (2) and fetches 100
Time =  4; ThreadB executes (2) and fetches 100
Time =  5; ThreadA executes (3) and adds 100 + 50
Time =  6; ThreadB executes (3) and adds 100 + 50
Time =  7; ThreadA executes (4) and updates creds to 150
Time =  8; ThreadB tries to execute (4) but in the best scenario the transaction
          (depending on isolation level) won't allow it and you get an error

The transaction prevents you from overwriting the creds value with a stale one, but it's not enough for me, because I don't want any request to fail with an error.

I prefer instead a slower process that never fails, and I solved the problem with a "database row lock" at the moment I fetch the data (step 2), which prevents other threads from reading the same row until I'm done with it.

There are a few ways to do this in SQL Server and this is one of them:

SELECT creds FROM credits WITH (UPDLOCK) WHERE userid = 1;

If I recreate the previous timeline with this improvement, you get something like this:

Time =  0; creds = 100
Time =  1; ThreadA executes (1) and creates Txn1
Time =  2; ThreadB executes (1) and creates Txn2
Time =  3; ThreadA executes (2) with lock and fetches 100
Time =  4; ThreadB tries to execute (2) but the row is locked and
                   it has to wait...

Time =  5; ThreadA executes (3) and adds 100 + 50
Time =  6; ThreadA executes (4) and updates creds to 150
Time =  7; ThreadA executes (5) and commits the Txn1

Time =  8; ThreadB was waiting up to this point and now is able to execute (2) 
                   with lock and fetches 150
Time =  9; ThreadB executes (3) and adds 150 + 50
Time = 10; ThreadB executes (4) and updates creds to 200
Time = 11; ThreadB executes (5) and commits the Txn2
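The lock-then-read timeline above can be sketched as follows (Python's sqlite3 as a stand-in; SQLite has no UPDLOCK hint, so BEGIN IMMEDIATE is used here as an assumed analogue that takes the write lock before the read):

```python
import os
import sqlite3
import tempfile
import threading

path = os.path.join(tempfile.mkdtemp(), "credits.db")
init = sqlite3.connect(path)
init.execute("CREATE TABLE credits (id INTEGER, creds INTEGER, userid INTEGER)")
init.execute("INSERT INTO credits VALUES (1, 100, 1)")
init.commit()
init.close()

def add_50():
    c = sqlite3.connect(path, timeout=30, isolation_level=None)
    c.execute("BEGIN IMMEDIATE")  # steps 1-2: take the write lock BEFORE reading
    (creds,) = c.execute("SELECT creds FROM credits WHERE userid = 1").fetchone()
    c.execute("UPDATE credits SET creds = ? WHERE userid = 1", (creds + 50,))  # step 4
    c.execute("COMMIT")  # step 5: the other thread was blocked until here
    c.close()

a = threading.Thread(target=add_50)
b = threading.Thread(target=add_50)
a.start(); b.start()
a.join(); b.join()

final = sqlite3.connect(path).execute(
    "SELECT creds FROM credits WHERE userid = 1").fetchone()[0]
print(final)  # 200 -- no lost update
```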
Abel ANEIROS
6

Optimistic locking using a new timestamp column can solve this concurrency issue.

UPDATE credits SET creds = 150 WHERE userid = 1 AND modified_date = old_modified_date;
Youcef LAIDANI
Ovidiu Lupas
4

The table can be modified as below, introducing a new field version to handle optimistic locking. This is a more cost-effective and efficient way to achieve better performance than using locks at the database level:

create table credits (
  id int,
  creds int,
  user_id int,
  version int
);

select creds, user_id, version from credits where user_id=1;

Assume this returns creds = 100 and version = 1.

update credits set creds = creds*10, version=version+1 where user_id=1 and version=1;

This always ensures that only whoever holds the latest version number can update the record, so dirty writes will not be allowed.

Java/Hibernate supports this out of the box with the @Version annotation on the desired field/column of a table.
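A minimal sketch of the version-column check outside Hibernate (Python's sqlite3 here; the multiply_creds helper is illustrative, not part of the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE credits (id INTEGER, creds INTEGER, user_id INTEGER, version INTEGER)")
conn.execute("INSERT INTO credits VALUES (1, 100, 1, 1)")

def multiply_creds(factor):
    """Update only if the version we read is still the version in the table."""
    creds, version = conn.execute(
        "SELECT creds, version FROM credits WHERE user_id = 1").fetchone()
    cur = conn.execute(
        "UPDATE credits SET creds = ?, version = version + 1 "
        "WHERE user_id = 1 AND version = ?",
        (creds * factor, version))
    return cur.rowcount == 1  # False: someone else bumped the version first

ok = multiply_creds(10)  # succeeds: version 1 matched, row is now at version 2
# A stale writer that still believes the row is at version 1 is rejected:
stale = conn.execute(
    "UPDATE credits SET creds = 999, version = 2 WHERE user_id = 1 AND version = 1")
print(ok, stale.rowcount)  # True 0
```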

amar
3

For the first scenario you could add another condition in the WHERE clause to make sure you won't overwrite changes made by a concurrent user. E.g.

update credits set creds= 150 where userid = 1 AND creds = 0;
Ola Herrdahl
  • The missing part of this answer is that the number of records affected by the update statement needs to be checked. If 0, then in a loop: read the record again, attempt the update, and check the number of records affected. Once we get 1 record affected we exit the loop. – Dave Moten Aug 27 '23 at 01:42
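The read/attempt/check loop described in the comment might look like this (a sketch in Python's sqlite3; set_creds and the retry limit are illustrative assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credits (id INTEGER, creds INTEGER, userid INTEGER)")
conn.execute("INSERT INTO credits VALUES (1, 0, 1)")

def set_creds(compute_new, max_retries=10):
    """Read, compute, then write only if nobody changed the row in between."""
    for _ in range(max_retries):
        (old,) = conn.execute(
            "SELECT creds FROM credits WHERE userid = 1").fetchone()
        cur = conn.execute(
            "UPDATE credits SET creds = ? WHERE userid = 1 AND creds = ?",
            (compute_new(old), old))
        if cur.rowcount == 1:  # our read was still current: update applied
            return True
    return False  # gave up after max_retries collisions

print(set_creds(lambda old: old + 150))  # True
```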
2

You could set up a queueing mechanism where additions to or subtractions from a rank-type value would get queued up for periodic FIFO processing by some job. If real-time info on a rank's "balance" is required this wouldn't fit, because the balance wouldn't compute until the outstanding queue entries are reconciled, but if it's something that doesn't require immediate reconciliation it might serve.

This seems to reflect, at least on the outside looking in, how games like the old Panzer General series handle individual moves. One player's turn comes up, and they declare their moves. Each move in turn is processed in sequence, and there are no conflicts because each move has its place in the queue.

Darth Continent
    It is always a good idea to store the individual records. It is also useful if there are bugs in your code and you need to recreate the balance from the source records. It is also useful because you now have a single writer back to your summary table, which reduces contention, blocking and complexity, and makes the user experience more seamless. – Doug Jul 28 '09 at 18:56
1

There is one critical point in your case: you decrease the user's current credit field by a requested amount and, if it decreased successfully, you do other operations. The problem is that in theory there can be many parallel requests for the decrease operation. For example, when a user has 1 credit on his balance and sends 5 parallel 1-credit charge requests at exactly the same time, he can purchase 5 things and you end up with -4 credits on the user's balance.

To avoid this you should decrease the current credits value by the requested amount (in our example 1 credit) and also check in the WHERE clause that the current value minus the requested amount is greater than or equal to zero:

UPDATE credits SET creds = creds - 1 WHERE creds - 1 >= 0 AND userid = 1;

This will guarantee that the user can never purchase many things with too few credits, even if he tries to DoS your system.

After this query you should check ROW_COUNT(), which tells you whether the current user's credit met the criteria and the row was updated:

UPDATE credits SET creds = creds - 1 WHERE creds - 1 >= 0 AND userid = 1;
IF (ROW_COUNT() > 0) THEN
   -- if we get here, the user surely had enough credits to purchase things
END IF;

A similar thing in PHP can be done like:

mysqli_query($conn, "UPDATE credits SET creds = creds-$amount WHERE creds-$amount >= 0 AND userid = $user");
if (mysqli_affected_rows($conn))
{
   // do good things here
}

Here we used neither SELECT ... FOR UPDATE nor a TRANSACTION, but if you put this code inside a transaction, just make sure the isolation level always provides the most recent data from the row (including rows other transactions have already committed). You can also use ROLLBACK if ROW_COUNT() = 0.

Downsides of WHERE creds-$amount >= 0 without row locking:

After the update you surely know one thing: the user had enough on his credit balance, even if he tried to hack credits with many requests. But you don't know other things, like what the credit was before the charge (update) and what it was after the charge (update).

Caution:

Do not use this strategy at a transaction isolation level that does not provide the most recent row data.

Do not use this strategy if you want to know what the value was before and after the update.

Just rely on the fact that the credit was successfully charged without going below zero.
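The same guarded decrement plus affected-rows check, sketched outside PHP (Python's sqlite3 here; the charge helper is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credits (id INTEGER, creds INTEGER, userid INTEGER)")
conn.execute("INSERT INTO credits VALUES (1, 3, 1)")

def charge(amount):
    """Atomically charge `amount`; False means it would have gone negative."""
    cur = conn.execute(
        "UPDATE credits SET creds = creds - ? WHERE creds - ? >= 0 AND userid = 1",
        (amount, amount))
    return cur.rowcount == 1

results = [charge(1) for _ in range(5)]  # 3 credits, 5 purchase attempts
print(results)  # [True, True, True, False, False]
final = conn.execute("SELECT creds FROM credits WHERE userid = 1").fetchone()[0]
print(final)  # 0 -- the balance never goes negative
```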

BIOHAZARD
0

If you store a last update timestamp with the record, when you read the value, read the timestamp as well. When you go to update the record, check to make sure the timestamp matches. If someone came in behind you and updated before you, the timestamps would not match.

Mark Sherretta
    You get in trouble if they happen to be within the same millisecond (or whatever resolution your DB has for timestamps). Which of course never happens, until one day it suddenly does and leaves your poor user with credits (or money) lost in thin air. – nos Jul 29 '09 at 00:44