
I've got a distributed task queue with a task that looks like this:

# creates a uniquely-named file
new_path = do_work()

# look up the previously recorded path and remove that file
old_path = database.query('select old path')
unlink(old_path)

# record the new path so a future run can clean it up
database.query('insert new path')

There's a race condition here: if the task queue software fires off two of these tasks at exactly the same time, they'll both get the same old_path out of the database and the race loser's unlink call will fail (orphaning the loser's new path from any future unlinking).

Is there a way I can structure this to get around this race? I can throw out just about anything from this current design if needed. Specifically I'm using PostgreSQL, Python and Celery. I understand that I can probably use table-level locking or change psycopg2's transaction isolation level to SERIALIZABLE, but I'm not sure that avoids this race condition. Table-level locking also means I'd have to bring in a new table for each additional task (so they don't block each other), which doesn't sound too appealing.
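
For concreteness, here's a rough sketch of how a task like this is wired up with Celery and psycopg2; the broker URL, connection string, table name and SQL below are stand-ins, not my real schema:

import os
import psycopg2
from celery import Celery

app = Celery('tasks', broker='amqp://localhost')    # stand-in broker URL

@app.task
def rotate_path():
    new_path = do_work()                            # creates a uniquely-named file

    conn = psycopg2.connect('dbname=mydb')          # stand-in connection string
    cur = conn.cursor()

    # 'select old path' - two racing tasks can both get the same row here
    cur.execute("SELECT path FROM paths LIMIT 1")
    (old_path,) = cur.fetchone()
    os.unlink(old_path)                             # the race loser fails here

    # 'insert new path'
    cur.execute("INSERT INTO paths (path) VALUES (%s)", (new_path,))
    conn.commit()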

ldrg
  • Near as I can tell old_path / new_path is a global variable you need synchronised access to. I don't think you can do much better than serializing access to it, although there might be better ways to achieve this than via a database. – millimoose Sep 17 '12 at 20:14
  • They're not global variables - all the code here is within a single function scope (the task). – ldrg Sep 17 '12 at 22:24
  • The extremely shortened version of my answer: Use PGQ. Queueing is hard. – Craig Ringer Sep 17 '12 at 23:58

3 Answers


I strongly recommend that you look into tools that have already solved this problem, like PGQ. Queuing is way harder than you expect. This is not a wheel you want to reinvent.

Concurrency is hard

Mihai's answer looks superficially fine, but falls down somewhat in concurrent operation.

Two concurrent UPDATEs can pick the same row (which in his example has used_flag = FALSE). One of them will get the lock and proceed. The other will wait until the 1st runs and commits. When the commit occurs the second update will then get the lock, re-check its condition, find no rows match any more, and do nothing. Thus, it's possible - in fact highly likely - for all but one of a set of concurrent updates to return an empty set.

In READ COMMITTED mode you can still get decent results, roughly equivalent to a single session looping over the UPDATE continuously. In SERIALIZABLE mode it'll fail hopelessly. Try it; here's the setup:

CREATE TABLE paths (
    used_flag boolean not null default 'f',
    when_entered timestamptz not null default current_timestamp,
    data text not null
);

INSERT INTO paths (data) VALUES
('aa'),('bb'),('cc'),('dd');

and here's the demo. Try with three concurrent sessions, following along step by step. Do it once in READ COMMITTED then again with all sessions SERIALIZABLE by using BEGIN ISOLATION LEVEL SERIALIZABLE instead of plain BEGIN. Compare the results.

SESSION 1             SESSION2         SESSION 3

BEGIN;
                                       BEGIN;

UPDATE      paths
    SET     used_flag = TRUE
    WHERE   used_flag = FALSE
    RETURNING data;

                      BEGIN;

                      INSERT INTO
                      paths(data)
                      VALUES
                      ('ee'),('ff');      

                      COMMIT;               
                                       UPDATE      paths
                                           SET     used_flag = TRUE
                                           WHERE   used_flag = FALSE
                                           RETURNING data;


                      BEGIN;

                      INSERT INTO
                      paths(data)
                      VALUES
                      ('gg'),('hh');      

                      COMMIT;        

COMMIT;

In READ COMMITTED the first UPDATE succeeds and produces four rows. The second produces the remaining two ee and ff that were inserted and committed after the first update ran. gg and hh are not returned by the second update even though it actually executes after they're committed, because it's already selected its rows and is waiting on a lock by the time they're inserted.

In SERIALIZABLE isolation the 1st UPDATE succeeds and produces four rows. The second fails with ERROR: could not serialize access due to concurrent update. In this case SERIALIZABLE isolation won't help you out, it'll just change the nature of the failure.

Without the explicit transactions the same thing will happen when UPDATEs run concurrently. It's just easier to demo without fiddling with timing if you use explicit transactions.
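
If you'd rather watch it from Python, here's a rough sketch that fires the claiming UPDATE from two threads at once with psycopg2. It assumes the demo paths table above and a throwaway local database called test; adjust the connection details to taste.

import threading
import psycopg2

CLAIM = """
    UPDATE      paths
        SET     used_flag = TRUE
        WHERE   used_flag = FALSE
        RETURNING data
"""

def worker(name):
    # one connection per thread; autocommit makes each statement its own transaction
    conn = psycopg2.connect(dbname='test')
    conn.autocommit = True
    cur = conn.cursor()
    cur.execute(CLAIM)
    print(name, 'claimed:', [row[0] for row in cur.fetchall()])
    conn.close()

threads = [threading.Thread(target=worker, args=('worker-%d' % n,)) for n in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Typically one worker claims every unused row and the other prints an empty
# list: the loser's UPDATE blocks on the row locks, re-checks its condition
# once the winner commits, and finds nothing left to update.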

What about picking just one row?

As above, the system works OK-ish, but what if you want to get only the oldest row? Because UPDATE picks the row(s) it's going to operate on before blocking on a lock, you'll find that in any given set of concurrent transactions only one UPDATE will return a result.

You'll be thinking of tricks like:

UPDATE      paths
    SET     used_flag = TRUE
    WHERE entry_id = (
        SELECT entry_id
        FROM paths 
        WHERE used_flag = FALSE
        ORDER BY when_entered
        LIMIT 1
    )
    AND used_flag = FALSE
    RETURNING data;

or

UPDATE      paths
    SET     used_flag = TRUE
    WHERE entry_id = (
        SELECT min(entry_id)
        FROM paths 
        WHERE used_flag = FALSE
    )
    AND used_flag = FALSE
    RETURNING data;

but these won't work how you expect; when run concurrently both will pick the same target row. One will proceed, one will block on the lock till the first commits, then proceed and return an empty result. Without the second AND used_flag = FALSE I think they can even return duplicates! Try it after adding an entry_id SERIAL PRIMARY KEY column to the demo paths table above. To get them to race, just LOCK TABLE paths in a 3rd session; see the examples I give in the following links.

I wrote about these issues in another answer and in my answer to can multiple threads cause duplicate updates on a constrained set.
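
If you want to stage that race from Python rather than juggling psql windows, here's a rough sketch. It assumes the demo paths table plus the entry_id SERIAL PRIMARY KEY column mentioned above, and a throwaway database called test:

import threading
import time
import psycopg2

CLAIM_OLDEST = """
    UPDATE      paths
        SET     used_flag = TRUE
        WHERE   entry_id = (
            SELECT entry_id
            FROM paths
            WHERE used_flag = FALSE
            ORDER BY when_entered
            LIMIT 1
        )
        AND used_flag = FALSE
        RETURNING data
"""

def worker(name):
    conn = psycopg2.connect(dbname='test')
    conn.autocommit = True
    cur = conn.cursor()
    cur.execute(CLAIM_OLDEST)               # blocks until the table lock below is released
    print(name, 'got:', cur.fetchall())
    conn.close()

# The "3rd session": hold an exclusive lock so both workers pile up behind it.
locker = psycopg2.connect(dbname='test')
locker.cursor().execute('LOCK TABLE paths IN ACCESS EXCLUSIVE MODE')

threads = [threading.Thread(target=worker, args=('worker-%d' % n,)) for n in (1, 2)]
for t in threads:
    t.start()

time.sleep(1)            # give both UPDATEs time to queue up behind the lock
locker.commit()          # release the lock; the two UPDATEs now race
for t in threads:
    t.join()

# Usually both UPDATEs pick the same "oldest" row: one wins and returns it,
# the other re-checks used_flag = FALSE once the winner commits and returns
# an empty set.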

Seriously, go check out PGQ. It's solved this for you already.

Craig Ringer
  • 1. thanks for the PGQ link - that will probably solve many of my problems! 2. do you think the Original Poster could do a lock table in "exclusive access" mode? – Jonathan Vanasco Sep 18 '12 at 00:08
  • @JonathanVanasco You mean `ACCESS EXCLUSIVE`, the default `LOCK TABLE` mode? Sure, but they don't want to force the transactions to execute serially, which that'd do. – Craig Ringer Sep 18 '12 at 00:11
  • yes, and i misunderstood the issue. i didn't realize all the task/data was in that table. i use `ACCESS EXCLUSIVE` to block parallel transactions within a different setup. – Jonathan Vanasco Sep 18 '12 at 00:19

Instead of selecting the old path do something like this:

old_path = database.query('
    UPDATE      paths
        SET     used_flag = TRUE
        WHERE   used_flag = FALSE
        RETURNING data');

The RETURNING clause allows you to "select" values from the rows you just updated (or deleted, or inserted).

The used_flag specifies if that row has already been used by another Python instance. Using the WHERE used_flag = FALSE bit will make sure you're not selecting something that has already been used.
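
From the Python side that might look roughly like this - a sketch assuming a psycopg2 connection and a paths table whose used_flag and data columns hold the claim flag and the old paths:

import os
import psycopg2

conn = psycopg2.connect('dbname=mydb')      # stand-in connection string
cur = conn.cursor()

cur.execute("""
    UPDATE      paths
        SET     used_flag = TRUE
        WHERE   used_flag = FALSE
        RETURNING data""")
rows = cur.fetchall()        # empty if another worker already claimed the rows
conn.commit()

for (old_path,) in rows:
    os.unlink(old_path)      # only the worker that claimed these rows removes them

If another task got there first, rows comes back empty and there's simply nothing to unlink.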

Mihai Stancu
  • My intent was to achieve "lock & release". That way the script is notified that there are no values it can process. Since the question wasn't explicit about how he is selecting the "old path" (from a queue, or by PK), I didn't want to presume too much about the app. – Mihai Stancu Sep 18 '12 at 07:26
  • I also exemplified `UPDATE` because it is atomic standalone, so for example when incrementing/decrementing a counter (ex.: reserved parking spaces), you are locking the same row for update and want it to fail if no more parking spaces are available. – Mihai Stancu Sep 18 '12 at 07:41
  • `UPDATE` *isn't* atomic, though. It proceeds in distinct phases. The filter is executed to select the rows the `UPDATE` will affect, then those rows are locked and finally they're updated. Races are possible between `UPDATE` statements. – Craig Ringer Sep 18 '12 at 07:47
  • AFAIK in MySQL UPDATE is guaranteed to be atomic in the following respects: it guarantees commit/fail of its entire payload (no partial updates can happen); it guarantees that if you operate on the rows dataset such as `SET x = x+1, y=x/2, z=x+y` you will not end up with any garbage data, no other partial commits can taint the values of x and y before they are recalculated and committed. It is true that in the respects you are referring to about what it selects and what it ends up updating it can fail, but failure is a part of the atomicity guarantee. – Mihai Stancu Sep 18 '12 at 07:55
  • Having a parking lot counter in a specific row somewhere and using `UPDATE parking_lot SET count=count-1 WHERE rowid = 1 AND count > 0 RETURNING TRUE` will guarantee that you will always get a TRUE returned if there still are parking slots available or nothing if there aren't. – Mihai Stancu Sep 18 '12 at 07:59
  • Sure, you have a guarantee of apparent atomicity of commit between transactions, and isolation to prevent dirty reads. These are general transactional properties, not properties of `UPDATE`. And yes, *if you specify a primary key explicitly* then concurrent updates will behave as expected. With more complex `WHERE` clauses unexpected concurrent behaviour can come into play because of the ordering of row selection, locking, changes, and transaction commit. See http://stackoverflow.com/questions/11914804/can-multiple-threads-cause-duplicate-updates-on-constrained-set – Craig Ringer Sep 18 '12 at 08:13

If the task queue software is able to provide a unique identifier with the request, perhaps you can store the old_path for each request in a different row. If not, perhaps you can generate a key for each request and store the path with it.
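
For example, Celery already hands each task a unique id, so something along these lines might work; the request_paths table and the connection details are made up for illustration:

import psycopg2
from celery import Celery

app = Celery('tasks', broker='amqp://localhost')   # stand-in broker URL

@app.task(bind=True)
def record_path(self):
    new_path = do_work()                           # creates a uniquely-named file, as in the question

    conn = psycopg2.connect('dbname=mydb')         # stand-in connection string
    cur = conn.cursor()
    # self.request.id is unique per task invocation, so each request's path
    # lives in its own row and no two tasks fight over the same old_path.
    cur.execute(
        'INSERT INTO request_paths (request_id, path) VALUES (%s, %s)',
        (self.request.id, new_path),
    )
    conn.commit()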

garyKeorkunian