"Read skew" vs "Non-repeatable read" (Transaction)

Question

I read A beginner’s guide to Read and Write Skew phenomena and A beginner’s guide to Non-Repeatable Read anomaly below to know what read skew and non-repeatable read are.

Read skew:

Non-repeatable read:

But, I cannot differentiate between read skew and non-repeatable read and basically, it seems like both can be prevented with REPEATABLE READ or SERIALIZABLE isolation level.

My questions:

What is the difference between read skew and non-repeatable read?
Can read skew and non-repeatable read be both prevented by REPEATABLE READ or SERIALIZABLE?

Super Kai - Kazuya Ito · Answer 1 · 2022-12-01T14:30:47.877

Non-repeatable read(fuzzy read) is that a transaction reads the same row at least twice but the same row's data is different between the 1st and 2nd reads because other transactions update the same row's data and commit at the same time(concurrently). *I explain more about non-repeatable read in What is the difference between Non-Repeatable Read and Phantom Read?
Read skew is that with two different queries, a transaction reads inconsistent data because between the 1st and 2nd queries, other transactions insert, update or delete data and commit. Finally, an inconsistent result is produced by the inconsistent data.

There are the examples of read skew below which you can do with MySQL and 2 command prompts.

For these examples of read skew below, you need to set READ COMMITTED isolation level to occur read skew:

SET GLOBAL TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

And, there is bank_account table with id, name and balance as shown below.

`bank_account` table:

id	name	balance
1	John	600
2	Tom	400

These steps below shows read skew. *300 is transferred from John's balance to Tom's balance. Then from Tom's balance, T1 reads 100(Inconsistent data) instead of 400. Finally, 600 + 100 = 700(Inconsistent result) in the total of John's and Tom's balances:

Flow	Transaction 1 (T1)	Transaction 2 (T2)	Explanation
Step 1	`BEGIN;`		T1 starts.
Step 2		`BEGIN;`	T2 starts.
Step 3	`SELECT balance FROM bank_account WHERE id = 1;` 600		T1 reads `600`.
Step 4		`UPDATE bank_account set balance = 900 WHERE id = 1;`	T2 updates `600` to `900` because `300` is transferred from Tom's balance.
Step 5		`UPDATE bank_account set balance = 100 WHERE id = 2;`	T2 updates `400` to `100` because `300` is transferred to John's balance.
Step 6		`COMMIT;`	T2 commits.
Step 7	`SELECT balance FROM bank_account WHERE id = 2;` 100		T1 reads `100(Inconsistent data)` instead of `400` after T2 commits.
Step 8	600 + 100 = 700		Finally, T1 gets `700(Inconsistent result)` instead of `1000`. *Read skew occurs.
Step 9	`COMMIT;`		T1 commits.

These steps below also shows read skew. *300 is withdrawn from Tom's balance. Then, T1 reads 100(Inconsistent data) instead of 400. Finally, 600 + 100 = 700(Inconsistent result) in the total of John's and Tom's balances:

Flow	Transaction 1 (T1)	Transaction 2 (T2)	Explanation
Step 1	`BEGIN;`		T1 starts.
Step 2		`BEGIN;`	T2 starts.
Step 3	`SELECT balance FROM bank_account WHERE id = 1;` 600		T1 reads `600`.
Step 4		`UPDATE bank_account set balance = 100 WHERE id = 2;`	T2 updates `400` to `100` because `300` is withdrawn from Tom's balance.
Step 5		`COMMIT;`	T2 commits.
Step 6	`SELECT balance FROM bank_account WHERE id = 2;` 100		T1 reads `100(Inconsistent data)` instead of `400` after T2 commits.
Step 7	600 + 100 = 700		Finally, T1 gets `700(Inconsistent result)` instead of `1000`. *Read skew occurs.
Step 8	`COMMIT;`		T1 commits.

In addition, this is also the example of read skew. There are teacher and student tables with id, name as shown below.

`teacher` table:

id	name
1	John
2	Tom

`student` table:

id	name
1	Anna
2	Sarah
3	David
4	Mark
5	Kai

These steps below shows read skew. *Lisa, Peter and Roy are inserted to "student" table. Then, T1 reads 8(Inconsistent data) instead of 5. Finally, 2 + 8 = 9(Inconsistent result) in the total of teachers and students:

Flow	Transaction 1 (T1)	Transaction 2 (T2)	Explanation
Step 1	`BEGIN;`		T1 starts.
Step 2		`BEGIN;`	T2 starts.
Step 3	*`SELECT count() FROM teacher;`** 2		T1 reads `2`.
Step 4		`INSERT INTO student values (6, 'Lisa'), (7, 'Peter'), (8, 'Roy');`	T2 inserts `Lisa`, `Peter` and `Roy`.
Step 5		`COMMIT;`	T2 commits.
Step 6	*`SELECT count() FROM student;`** 8		T1 reads `8(Inconsistent data)` instead of `5` after T2 commits.
Step 7	2 + 8 = 10		Finally, T1 gets `10(Inconsistent result)` instead of `7`. *Read skew occurs!!
Step 8	`COMMIT;`		T1 commits.

These tables below are my experiment results of read skew with each isolation level in MySQL and PostgreSQL. *Yes means Occurs, No means Doesn't occur.

MySQL:

Isolation level	Read skew
READ UNCOMMITTED	Yes
READ COMMITTED	Yes
REPEATABLE READ	No
SERIALIZABLE	No

PostgreSQL:

Isolation level	Read skew
READ UNCOMMITTED	Yes
READ COMMITTED	Yes
REPEATABLE READ	No
SERIALIZABLE	No

egasimov · Answer 2 · 2022-10-03T18:41:37.627

1. What is the difference between read skew and non-repeatable read?

We have two data - let x and y, and there is a relation between them.(e.g parent/child)

Transaction T1 reads x, and then a second transaction T2 updates x and y to new values and commits. If now T1 reads y, it may see an inconsistent state, and therefore produce an inconsistent state as output.

Acceptable consistent states:

x and y

*x and *y

Note: * denotes the updated value of the variable

When x and y are the same data, meaning to read them, need to execute the same query. I guess, it leads to the problem of non-repeatable.

IMHO, even if we may call read skew is a generalization form of a non-repeatable problem.

2. Can read skew and non-repeatable read be both prevented by REPEATABLE READ or SERIALIZABLE?

Serializable isolation level permits transactions to run concurrently, it creates the effect that transactions are running in serial order: read skew/non-repeatable prevented

Repeatable read isolation level guarantees that each transaction will return the same row regardless of how many times executed. From the definition, it seems read-skew may not be prevented. Without knowing how it is implemented, it is hard to claim anything.

Today, DBMS engines use different approaches-concurrency control strategies to implement REPEATABLE Isolation level.

e.g Postgres use database snapshot(consistent view) to implement REPEATABLE READ Isolation level: it will prevent read skew

Other engines may use lock-based concurrency control mechanisms to implement it. - may not prevent read skew.

"Read skew" vs "Non-repeatable read" (Transaction)

2 Answers2

bank_account table:

*Read skew occurs.

*Read skew occurs.

teacher table:

student table:

*Read skew occurs!!

`bank_account` table:

`teacher` table:

`student` table: