
I noticed that concurrent execution of simple and identical queries similar to

BEGIN;  
SELECT files.data FROM files WHERE files.file_id = 123 LIMIT 1 FOR UPDATE;
UPDATE files SET ... WHERE files.file_id = 123;
COMMIT;

leads to a deadlock, which surprises me since such queries should not be able to deadlock. Also, it usually takes only milliseconds to complete such a request. While the deadlock is in progress, if I run:

 SELECT blockeda.pid   AS blocked_pid,
        blockeda.query AS blocked_query,
        blockinga.pid  AS blocking_pid,
        blockinga.query AS blocking_query
 FROM   pg_catalog.pg_locks blockedl
 JOIN   pg_stat_activity blockeda ON blockedl.pid = blockeda.pid
 JOIN   pg_catalog.pg_locks blockingl ON blockingl.transactionid = blockedl.transactionid
                                     AND blockedl.pid != blockingl.pid
 JOIN   pg_stat_activity blockinga ON blockingl.pid = blockinga.pid
 WHERE  NOT blockedl.granted;

I see both of my identical SELECT statements listed as blocked_query and blocking_query for the whole duration of the deadlock.

So my question is: Is it normal and expected for queries that try to select the same row FOR UPDATE to cause a deadlock? And if so, what is the best strategy to avoid deadlocks in this scenario?


1 Answer

Your commands are contradictory.

If files.file_id is defined UNIQUE (or PRIMARY KEY), you don't need LIMIT 1, and you don't need explicit locking at all. Just run the UPDATE: since only a single row is affected in the whole transaction, there cannot be a deadlock. (Unless there are side effects from triggers, rules, or involved functions.)
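In that case the whole transaction collapses to a single statement, a minimal sketch assuming file_id is the key and a hypothetical data column:

```sql
-- Sketch, assuming files.file_id is UNIQUE / the primary key
-- and a hypothetical column "data". Under auto-commit this single
-- statement runs in its own transaction; no explicit lock needed.
UPDATE files
SET    data = 'new contents'
WHERE  file_id = 123;
```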

If files.file_id is not UNIQUE (as it seems), then the UPDATE can affect multiple rows in arbitrary order while only one of them is locked: a recipe for deadlocks. The more immediate problem would then be that the query does not do what you seem to want in the first place.
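To illustrate the deadlock mechanics (a hypothetical interleaving, not from the original question): if file_id = 123 matches two rows, two concurrent sessions can each lock one row first and then wait for the other:

```sql
-- Hypothetical schedule; two rows match file_id = 123.
-- Session A                          -- Session B
BEGIN;                                BEGIN;
UPDATE files SET ...                  UPDATE files SET ...
WHERE  file_id = 123;                 WHERE  file_id = 123;
-- A locks row 1 first, then          -- B locks row 2 first, then
-- blocks waiting on row 2            -- blocks waiting on row 1
-- PostgreSQL detects the cycle and aborts one session
-- with "ERROR: deadlock detected".
```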

The best solution depends on missing information. This would work:

UPDATE files
SET    ...
WHERE  primary_key_column = (
         SELECT primary_key_column
         FROM   files 
         WHERE  file_id = 123
         LIMIT  1
     --  FOR    UPDATE SKIP LOCKED
         );

No BEGIN; and COMMIT; are needed for the single command, as long as default auto-commit is enabled.

You might want to add FOR UPDATE SKIP LOCKED (or FOR UPDATE NOWAIT) to either skip or report an error if the row is already locked.
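For comparison, the NOWAIT variant (same placeholder identifiers as in the query above) errors out immediately instead of queuing behind the lock:

```sql
-- FOR UPDATE NOWAIT raises an error at once if the row is
-- already locked, rather than waiting (or skipping the row):
UPDATE files
SET    ...
WHERE  primary_key_column = (
         SELECT primary_key_column
         FROM   files
         WHERE  file_id = 123
         LIMIT  1
         FOR    UPDATE NOWAIT
       );
```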

And you probably want to add a WHERE clause that avoids processing the same row repeatedly.
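For instance, with a hypothetical boolean processed flag (not part of the original schema) the subquery only ever hands out rows that still need work:

```sql
-- Hypothetical "processed" flag marks rows already handled,
-- so concurrent workers never grab the same row twice:
UPDATE files
SET    ..., processed = true
WHERE  primary_key_column = (
         SELECT primary_key_column
         FROM   files
         WHERE  file_id = 123
         AND    NOT processed
         LIMIT  1
         FOR    UPDATE SKIP LOCKED
       );
```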


  • Thank you for your answer @Erwin Brandstetter! The `files` table does indeed have multiple `UNIQUE` constraints, and the `SELECT` uses all of them to uniquely identify a single row. However, the `UPDATE` statement _does not_ change any of the fields in those unique constraints, so I thought that would make the whole operation atomic. – Yatima Jan 19 '19 at 23:39
  • Also: thank you for the advice on using `UPDATE` with `WHERE`. I thought about that, but in my case the update logic is considerably more complicated and involves arbitrary function calls to calculate the new file data, so I cannot put it all into a single SQL statement. I will try to refactor my code to use the `files` primary key and see if that solves the issue. – Yatima Jan 19 '19 at 23:42
  • Hmm... I refactored my code to use the primary key for selecting the row (and removed `LIMIT 1` as you suggested), but that did not solve the problem: I am still getting deadlocks. Any idea what else could be at play here? Thanks. – Yatima Jan 20 '19 at 00:41
  • @Yatima: `my case ... involved arbitrary functions calls to calculate new file-data` — the deadlock can happen there, too. Hard to guess without knowing the complete situation. – Erwin Brandstetter Jan 20 '19 at 02:07
  • I was able to simplify this and create a much smaller test case. For clarity I moved it into a separate question, please see: https://stackoverflow.com/questions/54317754/deadlock-when-using-select-for-update – Yatima Jan 22 '19 at 23:08