SQL Threadsafe UPDATE TOP 1 for FIFO Queue

Question

I have a table of invoices being prepared, and then ready for printing.

[STATUS] column is Draft, Print, Printing, Printed

I need to get the ID of the first (FIFO) record to be printed, and change the record status. The operation must be threadsafe so that another process does not select the same InvoiceID

Can I do this (looks atomic to me, but maybe not ...):

1:

WITH CTE AS
(
    SELECT TOP(1) [InvoiceID], [Status]
    FROM    INVOICES
    WHERE   [Status] = 'Print'
    ORDER BY [PrintRequestedDate], [InvoiceID] 
)
UPDATE CTE
SET [Status] = 'Printing'
    , @InvoiceID = [InvoiceID]

... perform operations using @InvoiceID ...

UPDATE  INVOICES
SET [Status] = 'Printed'
WHERE   [InvoiceID] = @InvoiceID

or must I use this (for the first statement)

2:

UPDATE INVOICES
SET    [Status] = 'Printing'
    , @InvoiceID = [InvoiceID]
WHERE  [InvoiceID] = 
(
    SELECT TOP 1 [InvoiceID]
    FROM    INVOICES WITH (UPDLOCK)
    WHERE   [Status] = 'Print'
    ORDER BY [PrintRequestedDate], [InvoiceID] 
)

... perform operations using @InvoiceID ... etc.

(I cannot hold a transaction open from changing status to "Printing" until the end of the process, i.e. when status is finally changed to "Printed").

EDIT:

In case it matters the DB is READ_COMMITTED_SNAPSHOT

I can hold a transaction for both UPDATE STATUS to "Printing" AND get the ID. But I cannot continue to keep transaction open all the way through to changing the status to "Printed". This is an SSRS report, and it makes several different queries to SQL to get various bits of the invoice, and it might crash/whatever, leaving the transaction open.

@Gordon Linoff "If you want a queue" The FIFO sequence is not critical, I would just like invoices that are requested first to be printed first ... "more or less" (don't want any unnecessary complexity ...)

@Martin Smith "looks like a usual table as queue requirement" - yes, exactly that, thanks for the very useful link.

SOLUTION:

The solution I am adopting is from comments:

@lad2025 pointed me to SQL Server Process Queue Race Condition which uses WITH (ROWLOCK, READPAST, UPDLOCK) and @MartinSmith explained what the Isolation issue is and pointed me at Using tables as Queues - which talks about exactly what I am trying to do.

I have not grasped why UPDATE TOP 1 is safe, and UPDATE MyTable SET xxx = yyy WHERE MyColumn = (SELECT TOP 1 SomeColumn FROM SomeTable ORDER BY AnotherColumn) (without Isolation Hints) is not, and I ought to educate myself, but I'm happy just to put the isolation hints in my code and get on with something else :)

Thanks for all the help.

What you've got in #1 should work. #2 looks to be logically the same (only you haven't included the ultimate update to printed). The only issues would be if the task fails (i.e. do you put the status back from `printing` to `print` on error / do you try again), and what happens if another thread picks up the next item to be printed and completes before the first thread completes / is there a potential race condition in the app logic (outside of what's shown in this question). — JohnLBevan, Mar 17 '18 at 15:06
If you want a queue, perhaps you should consider message queues: https://learn.microsoft.com/en-us/sql/integration-services/control-flow/message-queue-task. — Gordon Linoff, Mar 17 '18 at 15:06
[SQL Server Process Queue Race Condition](https://stackoverflow.com/questions/939831/sql-server-process-queue-race-condition/940001#940001) - `WITH (ROWLOCK, READPAST, UPDLOCK)` — Lukasz Szozda, Mar 17 '18 at 15:10
@lad2025 That would be great, but since Kristen can't use transactions (see her last sentence) I'm not sure locks would help. — JohnLBevan, Mar 17 '18 at 15:14
@JohnLBevan - every statement is its own transaction if not in an explicit transaction. — Martin Smith, Mar 17 '18 at 15:16
looks like a usual table as queue requirement http://rusanu.com/2010/03/26/using-tables-as-queues/ — Martin Smith, Mar 17 '18 at 15:18
@MartinSmith agreed; but how do lock hints help here? i.e. Since the statement is atomic what benefit is there? — JohnLBevan, Mar 17 '18 at 15:18
Atomicity doesn't guarantee what you think. It just means the statement is entirely committed or entirely rolled back. Lock hints control the isolation. It prevents the data from being changed between being read and updated or being read by two concurrent processes and is less blunt than doing this with isolation levels - achieving higher concurrency. — Martin Smith, Mar 17 '18 at 15:21
So is the issue that 2 threads querying at the same time may have 1 complete and the other fail; whereas by adding the hints the 2nd thread gets the 2nd message in the queue and thus can process that whilst the 1st thread takes the first? — JohnLBevan, Mar 17 '18 at 15:27
Or have I misunderstood the idea of a statement; where the second thread can intercept the same record as is being handled by the first thread between the row being fetched by the cte query and updated? — JohnLBevan, Mar 17 '18 at 15:28
You have a problem with your second update may not find the same row — paparazzo, Mar 17 '18 at 15:29
No - the issue is that without locking hints at default isolation level two threads can both identify the same row with `[Status] = 'Print'` as the shared locks don't conflict then both proceed to update in turn. At default read comitted level SQL Server can take S locks then release them as soon as the row is read. The `UPDLOCK` means that two concurrent threads won't be able to take this out on the same row. — Martin Smith, Mar 17 '18 at 15:29
Thanks for all your comments, amazing how fast support is here :). My dilemma was that the Get-Invoice-ID is also updating the Status, so nothing else should be able to find that Invoice-ID with original status. BUT ... would that be safe between two processes in a race-condition - i.e. can another SELECT find the same row before the first process has done the UPDATE, or is that only guaranteed with an isolation hint? (In case it matters the DB is READ_COMMITTED_SNAPSHOT ) — Kristen, Mar 17 '18 at 15:46
@paparazzo Why? I get the unique ID at the first, and then use that to update at the second. Am I overlooking something? — Kristen, Mar 17 '18 at 15:55
@Kristen I may have read the question incorrectly. It seems strange to me to update top then update all. It seems like another process could clobber. What are the keys. — paparazzo, Mar 17 '18 at 16:07
@MartinSmith Wow I figured it a single update would act like a transaction and automatically take the lock — paparazzo, Mar 17 '18 at 16:22
FYI: I've submitted an answer with my thoughts. I'm probably wrong as it seems a lot of blog posts also suggest placing the locks suggested, but I was unable to get any exception or sign of a race condition in my testing... What symptom would we see; maybe SQL hides the issue but we get worse performance because in the background it's having to retry? Or maybe the issue only shows under really heavy concurrency / where requests are coming from multiple machines (i.e. since both of my test threads are in the same process they can't come exactly at once)? — JohnLBevan, Mar 17 '18 at 18:07
ps. Reading other answers I should point out that I'm assuming that there's no other process running which would amend the status of a `printing` record back to `print` or on to `printed`; i.e. All processes which are working on this queue have the same aim of picking up the next available invoice and printing it / there is no "ChaosMonkeyThread" process to randomly update record values that aren't locked. — JohnLBevan, Mar 17 '18 at 18:14
@Kristen: Is the `InvoiceId` column unique in the `Invoices` table? I'd blindly assumed so, but that will have a significant impact on the validity of any answers. — JohnLBevan, Mar 17 '18 at 18:18
@JohnLBevan Yes, InvoiceID unique (I sought to imply that, I should have stated it). "I'm assuming that there's no other process running which would amend the status of a printing record back to print or on to printed;" ... good point. Yes you are correct ... however ... the User can change the status to `REPRINT`, and that could presumably happen during a print process. IF the SSRS Report fails then the record will be left at `PRINTING`, so the user needs to sort that out (b changing status to `REPRINT`), but clearly some collision-risk which I need to take care of. — Kristen, Mar 17 '18 at 18:29
FYI: Since this comment thread has scared me in thinking my knowledge may be at fault, I've started another thread on the specific concern here: https://stackoverflow.com/questions/49341943/why-are-locks-needed-on-an-atomic-statement — JohnLBevan, Mar 17 '18 at 21:11
Last comment on this thread; I promise. People are talking about a dedicated high performance FIFO Queue table; @Kristen's talking about an Invoices table which happens to have a status column with a couple of SSRS reports running against it. The advise in the blog/various comments is for the former. See https://stackoverflow.com/a/49342363/361842 — JohnLBevan, Mar 17 '18 at 22:10

paparazzo · Answer 1 · 2018-03-17T17:45:06.247

0

My concern would be duplicate [InvoiceID]
Multiple print requests for the same [InvoiceID]

On the first update ONE row gets set [Status] = 'Printing'

On the second update all [InvoiceID] rows get set [Status] = 'Printed'
This would even set rows with status = 'draft'

Maybe that is what you want

Another process could pick up the same [InvoiceID] before the set [Status] = 'Print'

So some duplicates will print and some will not

I go with comments on use the update lock

This is non-deterministic but you could just take top (1) and skip the order by. You will tend to get the most recent row but it is not guaranteed. If you clear the queue then you get em all.

This demonstrates you can lose 'draft' = 1

declare @invID int; 
declare @T table (iden int identity primary key, invID int, status tinyint);
insert into @T values (1, 2), (5, 1), (3, 1), (4, 1), (4, 2), (2, 1), (1, 1), (5, 2), (5, 2);
declare @iden int;
select * from @t order by iden;

declare @rowcount int = 1; 
while (@ROWCOUNT > 0)
    begin
        update top (1) t 
        set t.status = 3, @invID = t.invID,  @iden = t.iden
        from @t t 
        where t.status = '2';
        set @rowcount = @@ROWCOUNT;
        if(@rowcount > 0)
            begin 
                select @invID, @iden;
                -- do stuff  
                update t 
                set t.status = 4
                from @t t
                where t.invID = @invID; -- t.iden = @iden;
                select * from @T order by iden;
            end
    end

edited Mar 17 '18 at 17:45

answered Mar 17 '18 at 16:54

paparazzo

44,497
23
105
176

NB: `set status = 'Printed'` has a `where` condition: `WHERE [InvoiceID] = @InvoiceID`. – JohnLBevan Mar 17 '18 at 18:11
@JohnLBevan I know it does. What is your point? Have you tested this? – paparazzo Mar 17 '18 at 18:13
I was commenting on your point: "On the second update all [InvoiceID] rows get set [Status] = 'Printed' This would even set rows with status = 'draft'". However maybe we've made different assumptions... I'm assuming that `InvoiceId` is unique in that table. i.e. under my assumption what you'd said was irrelevant/confusing; but if you're assuming it's not unique then you have a point... – JohnLBevan Mar 17 '18 at 18:17
Why would you assume it is unique? It is a print queue. – paparazzo Mar 17 '18 at 18:19
1

Yes, InvoiceID is unique. Sorry, should have said that in the O/P. The reason to sort on Date is so that it operates as a FIFO based on WHEN the Invoice was set to PRINT status, not when the Invoice Record was first created, as a draft, and at that time given IDENTITY in InvoiceID. Sorry, I have used example column names with the intention of implying that, but next time I will explain explicitly in the O/P – Kristen Mar 17 '18 at 18:22
"On the first update ONE row gets set [Status] = 'Printing' " - I am not convinced (and @MartinSmith confirms this point) that ONLY ONE process [in a race condition] can get the InvoiceID. Two could simultaneously get the same record, and then both could update it. This seems to violate ATOMicity to me, but I'm happy to learn that this is how it works, and that I *must* use Isolation Hints. This is indeed why I asked the question :) as I was confident of ATOM just not convinced that it was "safe" in a race-condition – Kristen Mar 17 '18 at 18:25
@Kristen Then don't be convinced. An update withOUT the select as discussed and I demonstrate is atomic. No kidding top (1) only update one. – paparazzo Mar 17 '18 at 18:33
"you could just take top (1) and skip the order by" - I need FIFO, but even if I didn't I am curious how a plain `UPDATE TOP 1` would not suffer from the same isolation issue ... Yes, to me everything here looks ATOMic, except that the points that @MartinSmith has raised are exactly what caused to me to ask the O/P :) I am failing to see how a plain `UPDATE TOP 1` is different to an `UPDATE MyTable WHERE MyColumn = (SELECT TOP 1 SomeColumn FROM SomeTable)`, except that I have read that "it isn't safe without isolation hints". That's the stumbling block that I need to properly understand. – Kristen Mar 17 '18 at 18:35
"An update with OUT the select as discussed and I demonstrate is atomic" ... OK, then I need a Atomic, thread-safe, way to update the ONE ROW using a SORT please. On reflection I now agree that if I did not need the sort then a plain `UPDATE TOP 1` would be fine. But ... I need a sort ... – Kristen Mar 17 '18 at 18:38
I give up. The update takes a lock on the table. If an update with a simple hard coded where suffered from what you fear then bad stuff would happen. – paparazzo Mar 17 '18 at 18:41
Yes, I give up too, but thanks for your help and input. I have decided just to use the isolation hints without educating myself "why". – Kristen Mar 17 '18 at 18:52

JohnLBevan · Answer 2 · 2018-03-17T20:12:31.177

ATOMicity of a Single Statement

I think your code's fine as it is. i.e. Because you have a single statement which updates the status to printing as soon as the statement runs the status is updated; so anything running before which searches for print would have updated the same record to printing before your process saw it; so your process would pick a subsequent record, or any process hitting it after your statement had run would see it as printing so would not pick it up`. There isn't really a scenario where a record could pick it up whilst the statement's running since as discussed a single SQL statement should be atomic.

Disclaimer

That said, I'm not enough of an expert to say whether explicit lock hints would help; in my opinion they're not needed as the above is atomic, but others in the comments are likely better informed than me. However, running a test (albeit where database and both threads are run on the same machine) I couldn't create a race condition... maybe if the clients were on different machines / if there was a lot more concurrency you'd be more likely to see issues.

My hope is that others have interpreted your question differently, hence the disagreement.

Attempt to Disprove Myself

Here's the code I used to try to cause a race condition; you can drop it into LINQPad 5, select language C# Program, tweak the connectionstring (and optionally any statements) as required, then run:

const long NoOfRecordsToTest = 1000000;
const string ConnectionString = "Server=.;Database=Play;Trusted_Connection=True;";  //assumes a database called "play"
const string DropFifoQueueTable = @"
    if object_id('FIFOQueue') is not null 
    drop table FIFOQueue";
const string CreateFifoQueueTable = @"
    create table FIFOQueue 
    (
        Id bigint not null identity (1,1) primary key clustered
        , Processed bit default (0) --0=queued, null=processing, 1=processed
    )";
const string GenerateDummyData = @"
    with cte as
    (
        select 1 x
        union all
        select x + 1
        from cte
        where x < @NoRowsToGenerate
    )
    insert FIFOQueue(processed)
    select 0
    from cte
    option (maxrecursion 0)
    ";
const string GetNextFromQueue = @"
    with singleRecord as
    (
        select top (1) Id, Processed
        from FIFOQueue --with(updlock, rowlock, readpast) --optionally include this per comment discussions
        where processed = 0
        order by Id
    )
    update singleRecord
    set processed = null
    output inserted.Id";
//we don't really need this last bit for our demo; I've included in case the discussion turns to this..
const string MarkRecordProcessed = @"
    update FIFOQueue
    set Processed = 1
    where Id = @Id";
void Main()
{
    SetupTestDatabase();

    var task1 = Task<IList<long>>.Factory.StartNew(() => ExampleTaskForced(1));
    var task2 = Task<IList<long>>.Factory.StartNew(() => ExampleTaskForced(2));
    Task.WaitAll(task1, task2);
    foreach (var processedByBothThreads in task1.Result.Intersect(task2.Result))
    {
        Console.WriteLine("Both threads processed id: {0}", processedByBothThreads);
    }
    Console.WriteLine("done");
}

static void SetupTestDatabase()
{
    RunSql<int>(new SqlCommand(DropFifoQueueTable), cmd => cmd.ExecuteNonQuery());
    RunSql<int>(new SqlCommand(CreateFifoQueueTable), cmd => cmd.ExecuteNonQuery());
    var generateData = new SqlCommand(GenerateDummyData);
    var param = generateData.Parameters.Add("@NoRowsToGenerate",SqlDbType.BigInt);
    param.Value = NoOfRecordsToTest;
    RunSql<int>(generateData, cmd => cmd.ExecuteNonQuery());
}

static IList<long> ExampleTaskForced(int threadId) => new List<long>(ExampleTask(threadId)); //needed to ensure prevent lazy loadling from causing issues with our tests
static IEnumerable<long> ExampleTask(int threadId)
{
    long? x;
    while ((x = ProcessNextInQueue(threadId)).HasValue)
    {
        yield return x.Value;
    }
    //yield return 55; //optionally return a fake result just to prove that were there a duplicate we'd catch it
}

static long? ProcessNextInQueue(int threadId)
{
    var id = RunSql<long?>(new SqlCommand(GetNextFromQueue), cmd => (long?)cmd.ExecuteScalar());
    //Debug.WriteLine("Thread {0} is processing id {1}", threadId, id?.ToString() ?? "[null]"); //if you want to see how we're doing uncomment this line (commented out to improve performance / increase the likelihood of a collision

    /* then if we wanted to do the second bit we could include this
    if(id.HasValue) {
        var markProcessed = new SqlCommand(MarkRecordProcessed);
        var param = markProcessed.Parameters.Add("@Id",SqlDbType.BigInt);
        param.Value = id.Value;
        RunSql<int>(markProcessed, cmd => cmd.ExecuteNonQuery());
    }
    */

    return id;
}

static T RunSql<T>(SqlCommand command, Func<SqlCommand,T> callback)
{
    try
    {
        using (var connection = new SqlConnection(ConnectionString))
        {    
            command.Connection = connection;
            command.Connection.Open();
            return (T)callback(command);
        }
    }
    catch (Exception e)
    {
        Debug.WriteLine(e.ToString());
        throw;
    }
}

Other comments

The above discussion's really only talking about multiple threads taking the next record from the queue whilst avoiding any single record being picked up by multiple threads. There are a few other points...

Race Condition outside of SQL

Per our discussion, if FIFO is mandatory there are other things to worry about. i.e. whilst your threads will pick up each record in order after that it's up to them. e.g. Thread 1 gets record 10 then Thread 2 gets record 11. Now Thread 2 sends record 11 to the printer before Thread 1 sends record 10. If they're going to the same printer, your prints would be out of order. If they're different printers, not a problem; all prints on any printer are sequential. I'm going to assume the latter.

Exception Handling

If any exception occurs in a thread that's processing something (i.e. so the thread's record is printing) then consideration should be made on how to handle this. One option is to keep that thread retrying; though that may be indefinite if it's some fundamental fault. Another is to put the record to some error status to be handled by another process / where it's accepted that this record won't appear in order. Finally, if the order of invoices in the queue is an ideal rather than a hard requirement you could have the owning thread put the status back to print so that it or another thread can pick up that record to retry (though again, if there's something fundamentally wrong with the record, this may block up the queue).

My recommendation here is the error status; as then you have more visibility on the issue / can dedicate another process to dealing with issues.

Crash Handling

Another issue is that because your update to printing is not held within a transaction, if the server crashes you leave the record with this status in the database, and when your system comes back online it's ignored. Ways to avoid that are to include a column saying which thread is processing it; so that when the system comes back up that thread can resume where it left off, or to include a date stamp so that after some period of time any records with status printing which "timeout" can be swept up / reset to Error or Print statuses as required.

WITH CTE AS
(
    SELECT TOP(1) [InvoiceID], [Status], [ThreadId]
    FROM    INVOICES
    WHERE   [Status] = 'Print'
    OR     ([Status] = 'Printing' and [ThreadId] = @ThreadId) --handle previous crash
    ORDER BY [PrintRequestedDate], [InvoiceID] 
)
UPDATE CTE
SET [Status] = 'Printing'
, [ThreadId] = @ThreadId
OUTPUT Inserted.[InvoiceID]

Awareness of Other Processes

We've mostly been focussed on the printing element; but other processes may also be interacting with your Invoices table. We can probably assume that aside from creating the initial Draft record and updating it to Print once ready for printing, those processes won't touch the Status field. However, the same records could be locked by completely unrelated processes. If we want to ensure FIFO we must not use the ReadPast hint, as some records may have status Print but be locked, so we'd skip over them despite them having the earlier PrintRequestedDate. However, if we want things to print as quickly as possible, with them being in order when otherwise not inconvenient, including ReadPast will allow our print process to skip the locked records and carry on, coming back to handle them once they're released.

Similarly another process may lock our record when it's at Printing status, so we can't update it to mark it as complete. Again, if we want to avoid this from causing a holdup we can make use of the ThreadId column to allow our thread to leave the record at status Printing and come back to clean it up later when it's not locked. Obviously this assumes that the ThreadId column is only used by our print process.

Have a dedicated print queue table

To avoid some of the issues with unrelated processes locking your invoices, move the Status field into its own table; so you only need to read from the invoices table; not update it.

This will also have the advantage that (if you don't care about print history) you can delete the record once done, so you'll get better performance (as you won't have to search through the entire invoices table to find those which are ready to print). That said, there's another option for this one (if you're on SQL2008 or above).

Use Filtered Index

Since the Status column will be updated several times it's not a great candidate for an index; i.e. as the record's position in the index jumps from branch to branch as the status progresses. However, since we're filtering on it it's also something that would really benefit from having an index. To get around this contradiction, one option's to use a filtered index; i.e. only index those records we're interested in for our print process; so we maintain a small index for a big benefit.

create nonclustered index ixf_Invoices_PrintStatusAndDate  
on dbo.Invoices ([Status], [PrintRequestedDate])
include ([InvoiceId]) --just so we don't have to go off to the main table for this, but can get all we need form the index
where [Status] in ('Print','Printing')

Use "Enums" / Reference Tables

I suspect your example uses strings to keep the demo code simple, however including this for completeness. Using strings in databases can make things hard to support. Rather than having status be a string value, instead use an ID from a related Statuses table.

create table Statuses
(
    ID smallint not null primary key clustered --not identity since we'll likely mirror this enum in code / may want to have defined ids
    ,Name
)   
go
insert Statuses 
values 
(0, 'Draft')
, (1,'Print')
, (2,'Printing')
, (3,'Printed')

create table Invoices 
(
    --...
    , StatusId smallint foreign key references Statuses(Id)
    --...
)

Thanks."The SQL statement can be simplified like so" My understanding is that two (race) processes could select, and update, the same row :( (which was why I asked this question, in order to be sure on that point, and what Isolation was essential to prevent that). I, too, think I would find it very hard to simulate the worst-case race conditions. I think, on balance, it is no hassle to use the Isolation hints, so I will go with that approach. — Kristen, Mar 17 '18 at 18:19
@paparazzo ah good point; I made a mistake and missed that / it seems that that does invalidate my SQL, so the above test's wrong... I'll amend and rerun. — JohnLBevan, Mar 17 '18 at 18:20
@Kristen Makes sense. Also, ignore my suggested optimisation as I've missed the order by as commented on by paparazzo. — JohnLBevan, Mar 17 '18 at 18:21
ps. @Kristen there is one potential downside of including the `readpast` hint... if something else is locking the `'Print'` records (maybe some other process has a page lock which includes them) then the statement would skip them and process the next available records until their lock were freed... So depends if the Invoices table is only used as a print queue, or if it's part of the bigger system (i.e. as in the case of the former that scenario's not going to happen; for the latter there could be lots of things locking those records). Also depends how much you care about the ordered printing — JohnLBevan, Mar 17 '18 at 18:39
"then the statement would skip them and process the next available records until their lock were freed" Thanks. Luckily order is not critical, I just want to try to print them in FIFO order. Otherwise if there is a huge backlog an early one could be delayed being sent out. — Kristen, Mar 17 '18 at 18:53
ps. run of the updated statement completed; again no collisions. — JohnLBevan, Mar 17 '18 at 18:58
@Kristen: great, in that case definitely good to include it since having `readpast` will speed up the total processing time by avoiding delays caused by other processes. The ongoing discussion on whether the locks are needed for the queue is purely for academic interest. — JohnLBevan, Mar 17 '18 at 18:59