
I have done a lot of research and haven't found anything that meets my needs. I'm hoping someone on SO can shed some light on this.

I have an application where the expected load is thousands of jobs per customer, and I can have hundreds of customers. Currently it is 50 customers with close to 1,000 jobs each. These jobs are time sensitive (scheduled by the customer), and each job can run for up to 15 minutes.

In order to scale and match the schedules, I'm planning to run this multi-threaded on a single server. So far so good. But the business wants to scale further (as needed) by adding more servers into the mix. Currently, when jobs become ready in the database, a console application picks up the first 500, uses the Task Parallel Library to spawn 10 threads, and waits until they are complete. I can't scale this to another server because that one could pick up the same records. I also can't simply update a status on the DB record to say it is being processed, because if the application crashes on one server, the job will be left in limbo.
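
To give a concrete picture, the current picker is roughly this (a simplified sketch; Job, LoadReadyJobs and RunJob stand in for the real code):

```csharp
// Simplified sketch of the current single-server picker (not the real code).
using System.Collections.Generic;
using System.Threading.Tasks;

class JobRunner
{
    public void RunBatch()
    {
        // Pick up the first 500 jobs that are ready in the database.
        List<Job> batch = LoadReadyJobs(500);

        // Process them with at most 10 concurrent workers and block until all finish.
        Parallel.ForEach(
            batch,
            new ParallelOptions { MaxDegreeOfParallelism = 10 },
            job => RunJob(job));
    }

    List<Job> LoadReadyJobs(int count)
    {
        // Placeholder: query the jobs table for rows that are due.
        return new List<Job>();
    }

    void RunJob(Job job)
    {
        // Placeholder: the actual work, which can stream data for up to ~15 minutes.
    }
}

class Job { public int JobId; }
```

Running a second copy of this on another server is exactly what breaks down, because both copies would read the same 500 rows.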

I could use a message queue and have multiple machines pick from it. The problem is that the queue has to be transactional to handle crashes. Since a database is involved, MSMQ only supports MSDTC transactions, and I'm not really comfortable with DTC transactions, especially across multiple threads and multiple machines. Too much maintenance and setup, and possibly unknown issues.

Is SQL Service Broker a good approach instead? Has anyone done something like this in a production environment? I also want to keep the transactions short (a job could run for 15-20 minutes, mostly streaming data from a service). The only reason I'm using a transaction is to preserve the integrity of the queue: I need the job to be picked up again (reappear in the queue) if the worker crashes.

Any words of wisdom?

  • Service broker has the advantage that you can use it with normal SQL transactions. No MSDTC. MSDTC is unsupported for many HA solutions, so it is often impossible to use. – usr May 18 '14 at 12:48
  • I currently use SSB in ten different applications at my current client (not my choice of queue engine). I would not recommend it to anyone. It's slow, there are no good tools for it, and it is quite inflexible. Use RabbitMQ or ZeroMQ instead. – jgauffin May 20 '14 at 06:05

1 Answer


Why not have an application receive the jobs and insert them into a table that acts as the job queue? Each worker process can then pick up a set of jobs, set their status to processing, complete the work, and set the status to done. Other information, such as the name of the server that processed each job and start/end timestamps, could also be logged. Moreover, instead of using multiple threads, you could use independent worker processes to make your programming easier.
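
For example, something along these lines (just a rough sketch; the JobQueue table, its columns and the connection string are placeholders I made up, not a definitive implementation):

```csharp
// Sketch of a worker claiming a batch of jobs and completing them.
// Table/column names (JobQueue, Status, ProcessedBy, StartedAt, CompletedAt) are examples only.
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

class QueueWorker
{
    const string ConnectionString = "...";               // placeholder
    static readonly string WorkerName = Environment.MachineName;

    // Claim up to batchSize ready jobs: flip them to Processing and record who took them and when.
    public List<int> ClaimJobs(int batchSize)
    {
        var claimed = new List<int>();
        using (var conn = new SqlConnection(ConnectionString))
        {
            conn.Open();
            const string sql = @"
                UPDATE TOP (@batchSize) dbo.JobQueue
                SET Status = 'Processing', ProcessedBy = @worker, StartedAt = SYSUTCDATETIME()
                OUTPUT inserted.JobId
                WHERE Status = 'Ready';";
            using (var cmd = new SqlCommand(sql, conn))
            {
                cmd.Parameters.AddWithValue("@batchSize", batchSize);
                cmd.Parameters.AddWithValue("@worker", WorkerName);
                using (var reader = cmd.ExecuteReader())
                    while (reader.Read())
                        claimed.Add(reader.GetInt32(0));
            }
        }
        return claimed;
    }

    // Mark a job as done once the work has finished.
    public void CompleteJob(int jobId)
    {
        using (var conn = new SqlConnection(ConnectionString))
        {
            conn.Open();
            const string sql = @"
                UPDATE dbo.JobQueue
                SET Status = 'Done', CompletedAt = SYSUTCDATETIME()
                WHERE JobId = @jobId;";
            using (var cmd = new SqlCommand(sql, conn))
            {
                cmd.Parameters.AddWithValue("@jobId", jobId);
                cmd.ExecuteNonQuery();
            }
        }
    }
}
```

Each worker can run ClaimJobs/CompleteJob in a loop, so adding capacity is just a matter of starting another copy of the process on another server.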

[EDIT] SQL Server supports record-level locking, and lock escalation can also be prevented. See Is it possible to force row level locking in SQL Server?. Using such a mechanism, you can have your worker processes take exclusive locks on the jobs being processed until they are done or crash (thereby releasing the lock).
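
As a concrete illustration (again with made-up table and column names, and using the UPDLOCK/ROWLOCK/READPAST hints as one possible way to do it), each worker claims a row inside a transaction and keeps the lock until the job finishes or the process dies:

```csharp
// Sketch: claim one job under a row lock that is held until the work is committed.
// If the process crashes mid-job, SQL Server rolls back the open transaction and
// releases the lock, so the row becomes available to other workers again.
using System;
using System.Data.SqlClient;

class LockingWorker
{
    const string ConnectionString = "..."; // placeholder

    public void ProcessNextJob(Action<int> runJob)
    {
        using (var conn = new SqlConnection(ConnectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction())
            {
                int? jobId = null;

                // UPDLOCK + ROWLOCK locks just this row for the duration of the
                // transaction; READPAST makes other workers skip locked rows.
                using (var claim = new SqlCommand(@"
                    SELECT TOP (1) JobId
                    FROM dbo.JobQueue WITH (UPDLOCK, ROWLOCK, READPAST)
                    WHERE Status = 'Ready';", conn, tx))
                using (var reader = claim.ExecuteReader())
                {
                    if (reader.Read())
                        jobId = reader.GetInt32(0);
                }

                if (jobId == null)
                {
                    tx.Rollback();   // nothing to do right now
                    return;
                }

                // Do the actual work while the row stays locked.
                runJob(jobId.Value);

                using (var done = new SqlCommand(
                    "UPDATE dbo.JobQueue SET Status = 'Done' WHERE JobId = @id;", conn, tx))
                {
                    done.Parameters.AddWithValue("@id", jobId.Value);
                    done.ExecuteNonQuery();
                }

                tx.Commit();
            }
        }
    }
}
```

The trade-off is that the transaction (and the lock) stays open for as long as the job runs, which you would have to weigh against your requirement to keep transactions short.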

  • OK, but what happens when a worker process picks up a set of jobs, marks them as processing, and then crashes before completion? Now you have a set of jobs in the DB that are marked as processing but will never be processed. – Alex J May 18 '14 at 19:53
  • Assuming your RDBMS supports record-level locking, you could have that record locked by the worker process in question; when the client crashes, the record is unlocked. Another option is to have a timeout after which another worker process takes over, and ownership of the job is changed to that server/worker process. – Tarik May 20 '14 at 05:54