2

I'm wondering about the behaviour of updates and if they are reconciled between transactions:

for example, if Alice has a starting balance of 500, and two transactions have executed simultaneously (on two different connections):

UPDATE Account SET balance = balance - 100 WHERE name = 'Alice';

... long time ...

UPDATE Account SET balance = balance + 100 WHERE name = 'Bob';

and

UPDATE Account SET balance = balance - 50 WHERE name = 'Charlie';

... long time ...

UPDATE Account SET balance = balance + 50 WHERE name = 'Alice';

Will Postgres ensure that Alice's balance will always be 450? or will it sometimes be 550 or 400 depending on how the transactions have executed?


For example:

This is a big assumption but I'm assuming that for ACID, when a transaction starts, it will freeze the state of the database for that given transaction:

  1. Transaction B starts, when the balance is 500
  2. Transaction A starts before B finishes and when the balance is 500. It subtracts 100 -> Balance in A: 400.
  3. Transaction B adds 50 (in parallel to A) to the balance it sees had in (1.) -> Balance in B: 550
  4. Transaction B finishes and commits.
  5. Transaction A finishes and commits.

When the two transactions complete - A has the balance being 400 and Transaction B has the balance being 550.

So, the database needs to know that two updates have been made to the same place and reconcile that.

zcaudate
  • 13,998
  • 7
  • 64
  • 124
  • The transaction is a unit of execution. Meaning it will execute and update the data based on the value the table has at the moment the transaction started. So in your above scenario it will be always 450 for Alice row. Now think of two people accessing the exact same information at the same time and saving it. Lets say that at the moment your system receive the request both transactions are equal 500, first person withdraw 50 the other adds 50 whatever the database executes lastly will be the final value. There are techniques to prevent that of course :) – Jorge Campos Feb 12 '21 at 02:30
  • I've added a scenario to explain this occurrence. – zcaudate Feb 12 '21 at 02:50
  • Yeap, you described the concurrency scenario. Without any technique the only valid one is the last one in your case Transaction A last value so 400. Transaction A and B are agnostic of each other. They only know about the 500 and the operation it is executing at their time. Both are working with the 500 values so whatever finishes last will be the final value. For a very brief period of time the DB will have the 550 from B (though A doesn't know about it) and then change to the 400 from A. And a transaction doesn't freeze the database only the tables involved in a row level depending on the db – Jorge Campos Feb 12 '21 at 02:55
  • One of the techniques is row versioning, here is a good read: https://www.red-gate.com/simple-talk/sql/t-sql-programming/row-versioning-concurrency-in-sql-server/ it is sql server but applies to any relational db really – Jorge Campos Feb 12 '21 at 02:59
  • happy to accept that answer (and any suggestions of how to how to implement row versioning in pg). I think that I need to lock the columns in order to get it consistent. – zcaudate Feb 12 '21 at 03:03
  • Here is a suggestion on how to implement it in postgres https://www.postgresql.org/docs/9.4/explicit-locking.html if that's ok I can add all that in an answer with this link as a suggestion. the important part is to use one of the row level locks: "Row-level Lock Modes" Just understand your use case and the consequences – Jorge Campos Feb 12 '21 at 03:12
  • Here is also a good read about locks: https://stackoverflow.com/a/129397/460557 – Jorge Campos Feb 12 '21 at 03:13
  • Please make sure to read https://www.postgresql.org/docs/current/transaction-iso.html, then update your question if it's still unclear – Bergi Feb 12 '21 at 03:17

1 Answers1

1

The result will always be 450. UPDATE statements are atomic in PostgreSQL, and there can be no lost updates.

The reason is that PostgreSQL takes an exclusive lock on a row it updates, and if it has to wait to get the lock, it will re-read the row after acquiring the lock. With the default READ COMMITTED isolation level, that guarantees that the transaction will see the most recent value, and that there cannot be a race condition that loses updates. (With REPEATABLE READ, you would get a serialization error.)

The documentation describes this in no uncertain terms.

Laurenz Albe
  • 209,280
  • 17
  • 206
  • 263
  • would you know anything about how it resolves the updates? does it pick a transaction to retry or does it do some sort of branching mechanism like git? – zcaudate Feb 12 '21 at 08:02
  • okay I just reread the answer and I think I understand. How would the database resolve deadlocks? – zcaudate Feb 12 '21 at 08:03
  • I have tried to make my answer clearer. Deadlocks are resolved by a separate database component. But that's a different thing (and a different question). – Laurenz Albe Feb 12 '21 at 08:07