2

Good day. I was playing with READ COMMITTED isolation level in postgres and found strange behavior which doesn't follow official documentation. Let's say I have a table account(id int,name text,amount int) and two rows.

test> select * from account;                       
-[ RECORD 1 ]-------------------------
id     | 1
name   | Bob
amount | 800
-[ RECORD 2 ]-------------------------
id     | 2
name   | Bob
amount | 200

Now I start two READ COMMITTED transactions . First one executes the following query

UPDATE account set amount = 100 where id = 2; -- 1

And then second one executes this query

 UPDATE account set amount = amount+50 --2
  where name in 
      (select DISTINCT name from account group by
      name having sum(amount)>=1000); 

Now it is locked because first transaction is not committed yet. So the second transaction wants to add 50 to each account whose total amount is bigger or equals to 1000. As Bob has two accounts (800+200) then it should add 50 to each account. However , now first transaction was committed COMMIT; --1 and now Bob has 900 at total and according to Documentation Read committed transaction will

The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row

As far as I understand , second transaction will re-evaluate where condition and skip Bob's accounts. However when second transaction was committed the final rows look like this

id     | 1                                                                                            │
name   | Bob                                                                                          │
amount | 850                                                                                          │
-[ RECORD 3 ]-------------------------                                                                │
id     | 2                                                                                            │
name   | Bob                                                                                          │
amount | 150 

which means that the second transaction didn't re-evaluate search condition and applied update to rows even if they are not matched by condition. Why does it happen . Did I miss something in the documentation ?

Almas Abdrazak
  • 3,209
  • 5
  • 36
  • 80
  • HAVING clause is not WHERE clause. – pifor May 13 '20 at 18:11
  • @pifor I talked about a part when second transactions calls (where name in (...)) – Almas Abdrazak May 13 '20 at 18:15
  • The WHERE clause is a subquery for which following rule is maybe applied: _When a transaction uses this isolation level, a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transaction_. – pifor May 13 '20 at 18:30

1 Answers1

1

The UPDATE in the first transaction blocks the UPDATE in the second query, but not the subselect in that query. The subselect is already done, the sum has been determined to be 1000, so the UPDATE is executed and that is blocked. The subquery is not re-evaluated when the lock is gone.

The passage you quote from the documentation is about SELECT ... FOR UPDATE (or FOR SHARE), which you don't use. It could not be used in your example, because it doesn't make sense in a query that uses aggregate functions or grouping.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • Thank you for response. I understand that subselect is not re-evaluated. However ,about the second paragraph of yours. The quote that I posted is not only about FOR UPDATE/SHARE. See , the paragraph in the postgres site starts with "UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE" and only then it says "In the case of SELECT FOR UPDATE and SELECT FOR SHARE this means it is the updated ..." so re-evaluation is applied to UPDATE as well, doesn't it ? – Almas Abdrazak May 13 '20 at 19:49
  • The blocked `UPDATE` re-reads the row after the lock is gone, yes. That is why you end up with a value of 150. But in `UPDATE account SET ... WHERE amount = 100`, the `WHERE` clause would *not* be re-evaluated. That is different from `SELECT ... FOR UPDATE`. – Laurenz Albe May 13 '20 at 20:12
  • @LaurenzAlbe, you say that "The passage you quote from the documentation is about SELECT ... FOR UPDATE", but it's don't. This passage is really about ALL update\delete\select_for_update_or_share cases: **UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands... ..condition of the command (the WHERE clause) is re-evaluated**. This is relevant to 9.5 and also to 13 version. It's still looks like a bug in postgres or (which is more likely) mistake in documentation. Can you please re-read that paragraph again and say what you think? – Alexey Ivanov Aug 17 '21 at 21:37
  • I create an issue and got clear answer from pgsql-docs. They say: "Maybe the nuance was lost and the docs could be improved, but it clearly says "...to see if the updated version of the rows STILL MATCHES the search condition" - i.e., it will happily skip a row it thought, before it started waiting, that it was going to have to update but it will not go looking for new rows that now may match the criteria. It also won't handle any inserts by the same reasoning. This is reinforced by the leading sentence: "...they will only find target rows that were committed as of the command start time."" – Alexey Ivanov Aug 17 '21 at 23:30
  • @alexeyivanov Yes, the passage is also about `UPDATE`. But the case in question is a (sub)select without `FOR UPDATE`. The `SELECT` just doesn't block, so nothing is re-evaluated. – Laurenz Albe Aug 18 '21 at 03:37
  • @LaurenzAlbe, thanks Laurenz! Now everything is clear. For me (as an oracle developer) the postgres behaviour of update was very surprising. Oracle re-evaluate all changed records, not only initially queried by update command, but postgres doesn't: https://www.postgresql.org/message-id/CAKFQuwYyZr0X8dPbknf5CDFW4-YVvH19QEXbUAGuxzRNMKnOAg%40mail.gmail.com – Alexey Ivanov Aug 18 '21 at 07:53
  • @alexeyivanov Are you saying that Oracle starts scanning the table from the start after it has waited for a lock during the scan? That would indeed be surprising. Or does it block on a locked row, even if the row does not satisfy the `WHERE` condition? That would be even more surprising. – Laurenz Albe Aug 18 '21 at 08:02
  • @LaurenzAlbe that is. oracle rescan update condition for all records after our blocked session acquire lock. this applies to both updated and added rows. here is example: https://gist.github.com/highload-russia/b6db2c3b1131f1def2c094017b21ff29 – Alexey Ivanov Aug 18 '21 at 11:33
  • @LaurenzAlbe updated example link after update a github username: https://gist.github.com/alexey-ivanov-official/b6db2c3b1131f1def2c094017b21ff29 – Alexey Ivanov Aug 20 '21 at 20:41