0

I need to lock a row, read and then update a field of this row in SQL Server and Python, for a program that can run multiple process at the same time, to avoid a race condition.

I'm using the following code, but in Python it updates one Id (i.e 3411) and returns another Id (i.e. 3071). If I run the same SQL code directly in SSMS, it updates and returns the same Id.

import pyodbc

def read_and_update_files_queue(self):
    conn = pyodbc.connect('Driver={SQL Server};'
                    'Server='+self.server+';'
                    'Database='+self.database+';'
                    'Trusted_Connection=yes;')
    cursor = conn.cursor()
    sql_query=f'''
                BEGIN TRANSACTION;
                    DECLARE @Temp_ID AS INT
                    SELECT TOP 1 @Temp_ID=Id 
                    FROM dbo.[Queue] 
                    WITH(XLOCK, ROWLOCK)
                    WHERE [BQ_status] IS NULL and upload_date IS NOT NULL

                    Select TOP 1 
                    Id,GCS_path,file_first_upload 
                    FROM dbo.[Queue]
                    WITH(XLOCK, ROWLOCK) 
                    WHERE Id=@Temp_ID


                    UPDATE dbo.[Queue] SET BQ_status='Loading' 
                    WHERE Id=@Temp_ID

                COMMIT TRANSACTION;
                '''
    cursor.execute(sql_query)
    cursor.commit()

What is wrong? The code is based on this: In SQL Server, how can I lock a single row in a way similar to Oracle's "SELECT FOR UPDATE WAIT"? Thanks!!

  • Do yourself a huge favor and to this transaction inside a stored procedure on the SQLSERVER with a TRY CATCH WITH A ROLLBACK. Also What are you accomplishing with eh second select statement? – perry147 May 19 '21 at 21:23
  • In the seconds statements I need to returns not only the ID, but also the fields called GCS_path,file_first_upload, that are used later in the python program. Why is it needed the try catch? (If I run this transaction when no ID doesnt exists, it works as expected (it does nothing) ) – Cristian Avendaño May 19 '21 at 21:33
  • 1
    Then do that with your First select statement. Just declare variables for GCS_path,file_first_upload and capture the values there to reuse later in the python program. Eliminate the second select statement. – perry147 May 19 '21 at 21:59

1 Answers1

1

Because the first two statements are not modifying data, you need the UPDLOCK hint, otherwise it will not hold the lock til the end of the transaction, irrespective of any other hints, including XLOCK.

But you could just do it all in a single statement, here no UPDLOCK is needed, and no explicit transaction (the statement runs in a transaction anyway)

UPDATE TOP (1) q
    SET BQ_status = 'Loading'
    OUTPUT inserted.Id, inserted.GCS_path, inserted.file_first_upload
FROM dbo.[Queue] AS q WITH (SERIALIZABLE, ROWLOCK)
    WHERE q.[BQ_status] IS NULL and q.upload_date IS NOT NULL;

While it is true that SERIALIZABLE sometimes only take a shared S lock, this is only true for either SELECT, or non-modified joined tables in DML statements.

In this instance, since Queue is the table being modified, a U lock is taken while finding the right row, this forbids U locks from other transactions, which would be blocked waiting (a transaction doing just SELECT would not be blocked). Then, when a row is about to be modified, the U lock is upgraded to X, and is modified.

So there is no fear of race conditions here, nor is the UPDLOCK hint necessary either.

Charlieface
  • 52,284
  • 6
  • 19
  • 43
  • Thanks. That works. But As far as I know, Serializable creates a Shared lock, so other process can still read the row, or am I wrong? I think that to avoid a race condition, the lock should be an exclusive lock and not a shared lock. – Cristian Avendaño May 20 '21 at 15:10
  • 1
    No that's only true for non-modified tables, otherwise a `U` lock is taken which will block other modifiers. You cannot force an `X` lock unless the table is being modified. Have clarified. – Charlieface May 20 '21 at 21:41