
Let's say I have a big SQL table with "scheduled tasks" - each row has a time of day when the action should be executed - for example, sending an email. And there are multiple workers (as Azure recommends) to process these actions. How do I implement this in a way that every row will be processed only once?

I know about Azure Queues - they are awesome and I am using them for all cases where work "comes from outside". They take care of "only one consumer will read the message" and also "when the consumer crashes, the message will be given to another within a minute". However, I don't see how to implement this for recurring scheduled jobs. I could implement a "scheduler" worker that reads from the SQL DB and inserts into the queue, but then this worker becomes the same problem - there can be only one, therefore no scaling, and if it fails (crash, update), all the work stops.

Also, I've read about Using Tables as Queues (or here), but they explicitly warn about heavy locking (escalating to table locks), which becomes a problem when the table is subject to frequent changes - and that is my case: there are frequent inserts of new items.

Any ideas besides these SQL locks? Anything special on Azure to help me with this?

  • Is it really a problem if there is a 10-15 minute delay in one of the jobs? We run an ETL processing system with a single-instance "scheduler" that hands work out to a bunch of process nodes. Since it is not real-time, and most of the jobs take >15 minutes to run, a scheduler crash/update is hardly noticeable. I realize that breaks the SLA, but in practice it's only down about 15 minutes/month. – mellamokb Jul 07 '15 at 15:21
  • You can disable lock escalation on the table so that won't be a problem; how many rows are you talking about in the table? The other option is to have a single scheduler that schedules work to multiple workers. The job of the scheduler is to schedule the work to a worker and persist it in a database, and the worker responds on completion. Your scheduler can be designed to be smart enough to reassign work when needed. – Satya_MSFT Jul 07 '15 at 20:06
  • Yes, in this scenario a 10-15 min delay is unacceptable. This should really be uninterrupted as much as possible. – rouen Jul 09 '15 at 12:09

1 Answer


For now I have decided to try an implementation based on SQL Server Process Queue Race Condition

So I am using ADO.NET (not EF, as usual).

  1. open a SQL connection
  2. begin a transaction with isolation level RepeatableRead
  3. within this transaction, execute the select:

SELECT TOP 100 * FROM myTable WITH (ROWLOCK, READPAST, UPDLOCK) ...
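
For illustration, a complete version of this claim query might look like the following; the ProcessedUtc and ScheduledUtc columns are my guesses at a schema, not a given:

    -- hypothetical columns: ProcessedUtc (NULL until handled), ScheduledUtc (due time)
    SELECT TOP (100) *
    FROM myTable WITH (ROWLOCK, READPAST, UPDLOCK)
    WHERE ProcessedUtc IS NULL
      AND ScheduledUtc <= SYSUTCDATETIME()
    ORDER BY ScheduledUtc;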

UPDLOCK will solve "each row can be read by only one worker"

READPAST will solve concurrency - workers will skip rows locked by other workers and take the next ones

ROWLOCK will prevent lock escalation to a table lock (which would block the other workers)

I am using a batch size of 100 rows - just a guess for now - a compromise between too many concurrently held row locks and too many queries

  4. read the result into memory

  5. within the same transaction, execute an UPDATE command to set the processed date of these rows

  6. commit the transaction

  7. if the commit succeeds, execute the actions for the rows' data (for example, sending emails)

There is an important compromise - in order to keep transactions as short as possible, the actions are executed after the transaction has finished. So if a worker crashes just after the commit and before all actions are processed, that part of the batch will not be processed (this is not a matter of life and death in this scenario). From this perspective the ideal batch size would be 1, but the performance impact of 100x more queries would be significant.
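
A minimal ADO.NET sketch of the whole loop, under the same assumptions (hypothetical Id, Payload, ProcessedUtc and ScheduledUtc columns; SendEmail stands in for whatever action a row describes):

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    class Worker
    {
        static void ProcessOneBatch(string connectionString)
        {
            var batch = new List<Tuple<long, string>>(); // (Id, Payload)

            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();                               // 1. open connection
                // 2. begin transaction with RepeatableRead
                using (var tx = conn.BeginTransaction(IsolationLevel.RepeatableRead))
                {
                    // 3. claim up to 100 due rows; READPAST skips rows other workers hold
                    var select = new SqlCommand(
                        @"SELECT TOP (100) Id, Payload
                          FROM myTable WITH (ROWLOCK, READPAST, UPDLOCK)
                          WHERE ProcessedUtc IS NULL
                            AND ScheduledUtc <= SYSUTCDATETIME()
                          ORDER BY ScheduledUtc", conn, tx);

                    using (var reader = select.ExecuteReader())
                    {
                        while (reader.Read())              // 4. read the batch into memory
                            batch.Add(Tuple.Create(reader.GetInt64(0), reader.GetString(1)));
                    }

                    foreach (var row in batch)             // 5. mark the rows as processed
                    {
                        var update = new SqlCommand(
                            "UPDATE myTable SET ProcessedUtc = SYSUTCDATETIME() WHERE Id = @id",
                            conn, tx);
                        update.Parameters.AddWithValue("@id", row.Item1);
                        update.ExecuteNonQuery();
                    }

                    tx.Commit();                           // 6. commit - releases the row locks
                }
            }

            foreach (var row in batch)                     // 7. act only after a successful commit
                SendEmail(row.Item2);
        }

        static void SendEmail(string payload) { /* the actual action */ }
    }

A real worker would call this in a loop, sleeping briefly whenever the batch comes back empty.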

Tested this with 3 workers on a table of 100,000 items (with several hundred items "scheduled" for every minute) and it seems to work OK. No deadlocks, and queries are reasonably fast (about 30% slower than with a single worker - probably because of the additional row locks).
