4

I'm starting to develop an application that has to deal with data-access concurrency issues, and I'm having trouble understanding how to use transaction isolation levels properly.

I have the following table, called Folders, which contains a tree-like folder structure:

+----------+----------------+--------------------+----------------+
| Id (int) | Name (varchar) | FullPath (varchar) | ParentId (int) |
|----------+----------------+--------------------+----------------|
| 1        | 'root1'        | '/root1/'          | NULL           |
| 2        | 'c1'           | '/root1/c1/'       | 1              |
| 3        | 'c2'           | '/root1/c1/c2/'    | 2              |
| 4        | 'root2'        | '/root2/'          | NULL           |
+----------+----------------+--------------------+----------------+

And I'm trying to implement the "Move folder" workflow like this (say, I want to move the folder with ID=2 to a new parent with ID=4):

  1. Begin transaction
  2. Read folder with ID=2 (call it folder2): SELECT * FROM Folders WHERE Id=2
  3. Read folder with ID=4 (call it folder4): SELECT * FROM Folders WHERE Id=4
  4. Update ParentId and FullPath of folder2: UPDATE Folders SET ParentId=folder4.Id, FullPath=folder4.FullPath+folder2.Name+'/' WHERE Id = folder2.Id
  5. Read all subfolders of folder2 (call them subfoldersOfFolder2): SELECT * FROM Folders WHERE FullPath LIKE folder2.FullPath + '%'
  6. For each subfolder in subfoldersOfFolder2 update FullPath column (query omitted)
  7. Commit transaction
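In T-SQL, the whole workflow would look roughly like this (a sketch only; the @-variables and the hard-coded IDs are just for illustration):

BEGIN TRANSACTION;

DECLARE @folder2Name varchar(255), @folder2OldPath varchar(255),
        @folder4Path varchar(255), @folder2NewPath varchar(255);

-- Steps 2 and 3: read the folder being moved and its new parent
SELECT @folder2Name = Name, @folder2OldPath = FullPath FROM Folders WHERE Id = 2;
SELECT @folder4Path = FullPath FROM Folders WHERE Id = 4;

SET @folder2NewPath = @folder4Path + @folder2Name + '/';

-- Step 4: re-parent folder2 and rewrite its own path
UPDATE Folders SET ParentId = 4, FullPath = @folder2NewPath WHERE Id = 2;

-- Steps 5 and 6: rewrite every descendant's path in one statement.
-- (folder2's own row was already updated above, so it no longer matches the LIKE.)
UPDATE Folders
SET FullPath = @folder2NewPath + SUBSTRING(FullPath, LEN(@folder2OldPath) + 1, 8000)
WHERE FullPath LIKE @folder2OldPath + '%';

COMMIT TRANSACTION;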

Obviously, I do not want any other transactions to write (or even read) folder2 and subfoldersOfFolder2 until my transaction completes.

After reading this article on SQL Server transactions, I got the notion that setting the isolation level to Serializable at step #1 would help me achieve this. But for some reason this doesn't seem to happen. I tried leaving the transaction open (stopping before step #7), opening another instance of SSMS, and running SELECT * FROM Folders; the query completes successfully, and I can still see the data that was read by the 1st transaction.

Why is this happening? How can I prevent anyone else from reading/writing folder2 and subfoldersOfFolder2? I feel like I'm missing something important about how transactions actually lock data.

marc_s
Nazz
  • `Serializable` specifically means no concurrency, globally. Is that in fact what you want? If it isn't, you should probably have the amending code first select all relevant parent nodes with `updlock, holdlock`. However, that will only set the amending code straight; it will not prevent people from reading the paths of child folders that you are going to amend but have not yet. Given that you will only know the IDs of these child folders after you've locked and read the parents (race condition), it would appear you want to put a `tablock, xlock, holdlock` on the entire table each time you edit paths, as sketched below. – GSerg May 12 '17 at 16:16
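In query form, the hints from this comment would look something like the following (a sketch against the question's table, not code from the original thread):

-- Take and hold update locks on the parents you are about to amend
-- (UPDLOCK = update lock, HOLDLOCK = keep it until the transaction ends)
SELECT *
FROM dbo.Folders WITH (UPDLOCK, HOLDLOCK)
WHERE Id IN (2, 4);

-- The heavier option: exclusively lock the whole table while paths are edited
SELECT *
FROM dbo.Folders WITH (TABLOCK, XLOCK, HOLDLOCK);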

3 Answers

2

When you use Serializable, the shared locks (from the SELECT) on the rows you've read are kept in place until the transaction completes. But a shared lock on a row does not prevent another transaction from reading that same row; it just stops another transaction from getting an exclusive lock on that row (which would conflict with the shared lock) for updating or deleting it.

If you want to prevent any other transaction from even reading (SELECT) those rows, you need to enforce an exclusive lock when you SELECT:

SELECT *
FROM dbo.Folders WITH (XLOCK)
WHERE ....

Now if this transaction "stays open", no other transaction can read any of the rows that are selected by that WHERE condition, until that SELECT ... FROM dbo.Folders WITH (XLOCK) transaction has been committed or rolled back.
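For example, against the Folders table from the question (a sketch; the path literal is just for illustration):

-- Session 1: exclusively lock the folder2 subtree and keep the transaction open
BEGIN TRANSACTION;

SELECT *
FROM dbo.Folders WITH (XLOCK)
WHERE FullPath LIKE '/root1/c1/%';

-- Session 2: this now blocks until session 1 commits or rolls back
SELECT * FROM dbo.Folders WHERE Id = 2;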

marc_s
0

If you do the following, it does exactly what you want (blocks readers and writers from seeing the rows). In session 1:

create table dbo.test(i1 int, a1 varchar(25))
insert into dbo.test values (1,'NY'),(1,'NY'),(1,'NJ'),(2,'NY'),(2,'NY'),(2,'NJ') 
set transaction isolation level serializable
begin transaction
select * from test where i1=1
update dbo.test set i1=3 where a1='NJ'

In session 2 trying

select * from dbo.test where i1=1 

hangs....

Adding begin try/catch can only improve things, but even without it serializable works. You are not showing us all your code.

benjamin moskovits
  • If I remove the `update` statement from your code, session 2 does not hang. Why is this? The article I linked to states that `Database Engine keeps read and write locks acquired on selected data`, so the data that was read by `select` in session 1 should be locked, right? – Nazz May 12 '17 at 16:36
  • There is no way (no setting) to have SQL Server lock rows in a transaction that has no updates/inserts/deletes, only selects. SQL Server guarantees various levels of data consistency as a result of changes; no changes, nothing to guarantee. You may want to explore snapshot isolation, which guarantees data consistency without locks: everyone gets their own version of the data (see the sketch after these comments). – benjamin moskovits May 12 '17 at 16:54
  • @benjaminmoskovits: *really?!?!?!* What about `WITH (XLOCK)`, which places an **exclusive** lock on rows? This **prevents** even `SELECT` from those rows... – marc_s May 12 '17 at 17:01
  • I was wrong. A select with (xlock), even without an update, prevents selects against the rows. marc_s's solution is correct. – benjamin moskovits May 12 '17 at 17:12
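For reference, the snapshot isolation route mentioned in the comments would look something like this (a sketch; assumes you have rights to ALTER the database):

-- One-time database setting
ALTER DATABASE CURRENT SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Per session: readers see a consistent row version and take no shared locks,
-- so they neither block writers nor get blocked by them
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM dbo.Folders WHERE Id = 2;
COMMIT TRANSACTION;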
0

Changing the serialisation level is something I would only do as a last resort. In this case, I'd stick with a single transaction to update the parent IDs, but put the code that updates the folder paths in an update trigger.
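A sketch of what such a trigger might look like (untested; assumes the Folders schema from the question, that root folders use the '/' prefix, and that RECURSIVE_TRIGGERS is off, which is the default):

CREATE TRIGGER dbo.trg_Folders_UpdatePath
ON dbo.Folders
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    IF NOT UPDATE(ParentId) RETURN;

    -- Rebuild the moved folders' own paths from their new parents
    UPDATE f
    SET FullPath = ISNULL(p.FullPath, '/') + f.Name + '/'
    FROM dbo.Folders AS f
    JOIN inserted AS i ON i.Id = f.Id
    LEFT JOIN dbo.Folders AS p ON p.Id = i.ParentId;

    -- Swap the old path prefix for the new one on every descendant
    UPDATE c
    SET FullPath = f.FullPath + SUBSTRING(c.FullPath, LEN(d.FullPath) + 1, 8000)
    FROM dbo.Folders AS c
    JOIN deleted AS d ON c.FullPath LIKE d.FullPath + '%' AND c.Id <> d.Id
    JOIN dbo.Folders AS f ON f.Id = d.Id;
END

Because the trigger runs inside the same transaction as the UPDATE that fired it, the path rewrite stays atomic with the move.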

cloudsafe