
I have multiple processes doing some work on accounts in a single table. I have implemented a query that each process uses to lock 100 random unlocked records by setting a value in the lock column, and it returns the locked IDs.

UPDATE accounts SET locked = now() WHERE account_id IN
    (SELECT account_id FROM accounts
     WHERE last_account_info_fetched IS NULL AND locked IS NULL
     LIMIT 100
     FOR UPDATE)
RETURNING account_id

And the process that got a bunch of IDs back from the first query then, in another transaction, does an update and unlocks each record.

UPDATE accounts SET last_account_info_fetched = ?, locked = NULL WHERE account_id = ?

The problem is that when multiple processes run the first query, it causes a deadlock. How can I fix this so that each process gets a different batch of IDs without deadlocks?

ERROR: deadlock detected
  Detail: Process 3428 waits for AccessExclusiveLock on tuple (16865,68) of relation 10409452 of database 10221183; blocked by process 8628.
  Process 8628 waits for ShareLock on transaction 27789140; blocked by process 5340.
  Process 5340 waits for ShareLock on transaction 27789126; blocked by process 3428.
user2288258
  • Have you tried using `OFFSET` ? – Kamil Gosciminski Oct 02 '14 at 18:56
  • No, but how would I use it in this case, do you have an example? The processes are running in a cluster, and I don't know up front how many such worker processes will be there, so I can't split up the work up front. – user2288258 Oct 02 '14 at 19:01
  • Why are you doing it this way? Why 100 at a time, and why not do all the processing in one transaction? There are various mitigations you can try, but which ones depend on why it is designed the way it is. – jjanes Oct 03 '14 at 16:06

3 Answers


How about you add a range limit to each thread so that they don't overlap:

UPDATE accounts SET locked = now() WHERE account_id IN
    (SELECT account_id FROM accounts
     WHERE last_account_info_fetched IS NULL AND locked IS NULL
       AND account_id >= 0
       AND account_id < 1000
     LIMIT 100
     FOR UPDATE)
RETURNING account_id

If you can figure out a way for each thread to have its own range, you shouldn't have this problem. You could have each thread pick a random number and only lock rows in that range, but you would still get overlap once in a while in that scenario.
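
One way to give each worker its own range without coordinating the ranges up front is to partition on the id itself. This is only a sketch of that idea, not from the original answer; it assumes each worker is launched with hypothetical :worker_count and :worker_index parameters:

UPDATE accounts SET locked = now() WHERE account_id IN
    (SELECT account_id FROM accounts
     WHERE last_account_info_fetched IS NULL AND locked IS NULL
       -- hypothetical partitioning: each worker only ever sees its own slice of ids,
       -- so no two workers try to lock the same row
       AND account_id % :worker_count = :worker_index
     LIMIT 100
     FOR UPDATE)
RETURNING account_id

The asker noted the number of workers isn't known up front, so :worker_count would have to come from whatever launches the cluster, or be a fixed over-estimate (leaving some slices served by an idle worker).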

Zeki
  • Yep, this is pretty much the only way to do it and retain concurrency without `SKIP LOCKED` or a controlling task distribution process. – Craig Ringer Oct 03 '14 at 00:55

You seem to be trying to implement a work queue / task queue or messaging system.

This is harder than you'd expect, as you are discovering.

I suggest using an existing, well-established one; look into ActiveMQ, ZeroMQ, Celery, RQ, Resque, ... many other message and task queues.

There's work going on that should make this easier in PostgreSQL 9.5, though: the SKIP LOCKED patch, which lets you select the first n records not already locked by some other process.
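
For reference, assuming the patch lands as proposed, the fix would just be a change to the locking clause of the original query (a sketch only; it requires PostgreSQL 9.5 or later):

UPDATE accounts SET locked = now() WHERE account_id IN
    (SELECT account_id FROM accounts
     WHERE last_account_info_fetched IS NULL AND locked IS NULL
     LIMIT 100
     FOR UPDATE SKIP LOCKED) -- skip rows another worker has already locked instead of waiting on them
RETURNING account_id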

Craig Ringer

Here's a good answer: Avoiding PostgreSQL deadlocks when performing bulk update and delete operations. The key is to ORDER BY the id inside the locking subquery, so every process tries to acquire the row locks in the same order and the circular waits that cause the deadlock can't form.

Here's how your query would look with the fix:

UPDATE accounts SET locked = now() WHERE account_id IN
    (SELECT account_id FROM accounts
     WHERE last_account_info_fetched IS NULL AND locked IS NULL
       AND account_id >= 0
       AND account_id < 1000
     ORDER BY account_id -- that's it
     LIMIT 100
     FOR UPDATE)
RETURNING account_id
Community