A user can perform an action on our web app that takes anywhere from 100 ms to 10 seconds. I want to return a response to the browser immediately, and then show the results to the user once the task has finished processing. The action syncs data from a third party and is implemented as a class library (DLL).

The usual suggestion is to use a queue such as RabbitMQ or MSMQ and have a worker write the results to a database, which the browser then polls via AJAX to check for updates.

However, the aim is to reduce latency so the experience is as close to running the task synchronously as possible, while still being able to handle spikes in the long-running task's workload without affecting the rest of the website.

How should the backend be architected? In my mind, the process would be: starting the task, running the task with minimal latency, notifying the end user that the task is finished (ASAP), and finally displaying the results in the browser.

Long Running Task (diagram). Credits: Haishi. Source: http://haishibai.blogspot.co.uk/2012/12/dealing-with-long-running-jobs.html

Examples

Generating sitemaps with http://www.xml-sitemaps.com/ uses chunked transfer encoding to send a <script> tag every second, which calls a JavaScript function to update the page with the latest status.

Checking SSL certificates with https://www.ssllabs.com/ssltest/ seems to refresh the whole page with an updated status.

Marcus
  • Using AngularJS you can make an async HTTP call and get back a "promise". In any case, jQuery's AJAX calls also let you define callbacks that don't block the browser. The solution you're proposing is not really "state of the art" :) I don't see why the downvoting; the guy has a question, please. – Rafa Aug 21 '14 at 11:04
  • @Rafa Thanks, however if I used an async jQuery request with a callback to execute the task, that assumes the task will be executed by an IIS worker thread. That means a bunch of users clicking the button at around the same time could effectively DoS the website. I'm looking for the tasks to run on a separate thread. – Marcus Aug 21 '14 at 11:25
  • Have you tried AngularJS? I think the "promise" and the data binding features would do the job: http://blog.brunoscopelliti.com/angularjs-promise-or-dealing-with-asynchronous-requests-in-angularjs – Rafa Aug 21 '14 at 11:31
  • Have you taken a look at SignalR? It can effectively update the client side in real time using WebSockets, so you don't have TCP connections opening and closing constantly. Clients can be sent data from outside the `Hubs` as well, so you can call it from anywhere in your app. – siva.k Aug 21 '14 at 11:32
  • @siva.k, SignalR looks good for keeping the browser updated on the current status of their task rather than polling with an AJAX request. But how would the backend work for actually queuing the task, running the task with minimal latency and then getting the results back to the browser? – Marcus Aug 21 '14 at 11:36
  • @Marcus That's a far more complex question and depends largely on what you're doing. The simplest method is to use the `Queue` collection and batch out your jobs using `Parallel.Invoke()`, with the jobs being methods that can update the client on completion. Then, unless you're under heavy load, requests would be processed in near real time; under heavy load you don't overload the server, because you enforce a maximum number of running tasks at any given time. – siva.k Aug 21 '14 at 11:46
  • @Siva.k Thanks for the input, though I'm looking for something a bit more robust. For example, I found this: http://stackoverflow.com/questions/1317641/queue-based-background-processing-in-asp-net-mvc-web-application However, I've no idea how to get the result back to the browser. Really I want it to appear as synchronous as possible for the end user, while we're able to handle load spikes and do the processing outside IIS, and possibly on other servers too. Another resource: http://msdn.microsoft.com/en-us/magazine/hh580729.aspx – Marcus Aug 21 '14 at 12:07
  • You've stated "notifying the end user the task is finished" but I'd suggest you create a RESTful interface and don't use a realtime solution like SignalR. Send the client a 202 (Accepted) with a "ticket" (the HTTP ETag, perhaps). The ticket lets them poll the server for their resource. – Boggin Aug 25 '14 at 15:44
  • Have you considered Azure? Web Role or Website, Queue or Service Bus, Worker Role. PaaS means it's all there for you and it leads to much simpler architecture. – Boggin Aug 25 '14 at 15:48
  • I know Azure has these services available. Installing queues or service buses on our own infrastructure is not a problem. I just don't know what technologies we should be using to make this happen. – Marcus Aug 26 '14 at 09:13
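
The capped-concurrency idea siva.k raises in the comments can be sketched as a small job pool. This is JavaScript purely for brevity, and the pool, `limit`, and job shape are illustrative, not from the thread; the .NET equivalent would bound the number of concurrently running tasks so a burst of clicks can't consume every worker thread:

```javascript
// Minimal sketch of a bounded job pool: jobs queue up, but at most
// `limit` run at once, so a spike in submissions can't exhaust workers.
function createJobPool(limit) {
  const pending = [];
  let running = 0;

  function next() {
    if (running >= limit || pending.length === 0) return;
    running++;
    const { job, resolve, reject } = pending.shift();
    Promise.resolve()
      .then(job)                                  // run the job
      .then(resolve, reject)                      // settle the caller's promise
      .finally(() => { running--; next(); });     // free a slot, pull next job
  }

  return {
    submit(job) {
      // Returns a promise that settles when this job eventually runs.
      return new Promise((resolve, reject) => {
        pending.push({ job, resolve, reject });
        next();
      });
    },
  };
}
```

Under light load jobs start almost immediately (near-synchronous latency, as asked); under heavy load they queue instead of overloading the server.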

4 Answers


This situation is relatively simple, and I would not recommend polling at all.

Consider a regular Ajax approach: part of the page can refresh without the rest of the page. That part (the Ajax part) is synchronous on its own, but asynchronous from the whole page's point of view, because it updates without reloading the whole page.

So, when that information needs to be calculated, the Ajax part of the page submits a regular request. When the request completes, that part of the page has immediate access to the response and displays the results.

The advantage is that there is no polling overhead, and the results are displayed on screen right away (ASAP, as you asked). Also, only one request does the work, instead of the several possibly wasted requests involved in polling.
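
As a rough sketch of this single-request idea (plain JavaScript; `fetchResult` and `render` are illustrative stand-ins for the actual request and the partial-page update, not names from the question):

```javascript
// One request per action: submit the work, wait for the single response,
// then render the result into just that part of the page. No polling loop.
async function refreshSection(fetchResult, render) {
  render({ status: "working" });        // immediate feedback while waiting
  const result = await fetchResult();   // one request, one (possibly slow) response
  render({ status: "done", result });   // partial-page update with the outcome
  return result;
}
```

The rest of the page stays responsive throughout; only the Ajax part waits on the response.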

Tengiz
  • A regular AJAX approach will kill scalability if there are too many of these long-running tasks/requests (as the OP mentioned in the comments, the website is expected to have many). IIS worker threads handle those AJAX requests, so IIS may run out of worker threads while they are busy with long-running work. – Michael Aug 26 '14 at 22:09
  • What's the difference, if the threads are going to run at the same scale as the requests? – Tengiz Aug 27 '14 at 14:56
  • The difference is that you should not keep IIS worker threads busy with long-running tasks; you should put the workload on other threads instead. With a regular AJAX request, these long-running tasks are processed by IIS worker threads, and the idea is to let other threads process them. – Michael Aug 27 '14 at 15:12
  • Yes, I agree with that point. However, polling is a request too, and a particularly long-running one. My answer was against polling, and that still applies; it is still better in my opinion. Polling cannot exist without requests, so the cost ends up the same in terms of running requests. And the advantage of my approach remains: the response is rendered without waiting for the next polling response. – Tengiz Aug 27 '14 at 18:59
  • Yes, I wasn't trying to make the point that polling is better than regular AJAX. I just thought you were answering the OP's question and suggesting regular AJAX, which should be the way to go, and which is better than polling. – Michael Aug 27 '14 at 20:38
  • In addition, my point was: there is no way to let the web UI know about things done on the server without some kind of polling (unless the connection is duplex, which is not natively supported cross-browser). So the best way to do it with one-way requests is an Ajax part of the page. So yes, I was answering the question by stating which way to go, and by (as a side note) mentioning not to use polling, because that's the obvious alternative. – Tengiz Aug 27 '14 at 21:02

Have you considered using WF4 in conjunction with SignalR?

We use WF4 to handle back-end processing, and it performs quite nicely. We store requests in a job request table; the workflow engine (a service we wrote that runs WF4 in the back end) picks up the request, processes the work, and then marks the job as completed.

SignalR can then be used to inform the client that the job is complete. Scaling is relatively easy (chuckling, as I know 'easy' is always fraught with details) because you can spin up more services to process requests. Each engine marks a request as being processed so the others don't pick it up.

I've used WF4 on large-scale projects where the services were load balanced, and we were able to get very decent throughput.
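
The claim step described above (each engine marking a request so the others skip it) can be sketched against an in-memory job table. JavaScript here is purely for illustration, and the function names and job fields are made up; in the real system this would be a database UPDATE guarded by a status check:

```javascript
// Claim-then-process sketch: a worker flips a job's status from
// "pending" to "processing" before working on it, so two engines
// never grab the same job.
function claimNextJob(jobs, workerId) {
  for (const job of jobs) {
    if (job.status === "pending") {   // found unclaimed work
      job.status = "processing";      // claim it (DB: UPDATE ... WHERE status = 'pending')
      job.claimedBy = workerId;
      return job;
    }
  }
  return null;                        // nothing left to do
}

function completeJob(job, result) {
  job.status = "completed";           // the SignalR notification would fire here
  job.result = result;
}
```

With the claim recorded atomically, adding more engines scales out processing without double work.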

Dave
  • I'll certainly look into it, though I think it might be overkill in this case. There's only one time this long running task will be used in the whole application and the call should take no longer than a couple of seconds. The main aim is to ensure lots of users clicking at the same time don't drown the app. – Marcus Aug 27 '14 at 22:25

FWIW, if you really don't want to invest in a full-scale queue-based solution, you can leverage TPL + SignalR (or any such Comet library) to process your long-running request and send feedback to the client.

So the idea is:

  • Client sends a request to server
  • Server kicks off background processing via TPL
  • Server notifies client of updates via SignalR

Something like this (using TPL and SignalR):

// server

public class MyHub : Hub
{
    public void Start()
    {
        // LongRunning hints the scheduler to use a dedicated thread,
        // so the work doesn't tie up a thread-pool (request) thread.
        Task.Factory.StartNew(() =>
        {
            var someService = new SomeService();
            var data = someService.DoSomething(); // the long-running work
            // Push the result out to the connected clients when done
            Clients.All.longRunningTask(data);
        }, TaskCreationOptions.LongRunning);
    }
}

// initiating from your ASP.NET page (code-behind, via AJAX, or any other way)
new MyHub().Start();

// client

var hub = $.connection.myHub;
hub.client.longRunningTask = function (data) {
     // do something with data
};

This is very low latency, involves no queuing, and allows you to push updates as they come along (simply calling Clients.All.longRunningTask pushes an update to the client).

References:

Getting Started with SignalR 2

Mrchief

This may be a bit simplistic, but the easiest way may be long polling on the client.

I.e.: send the work request in a GET query and get back a token. Then immediately request the result for that token, and let the server block until it has a response. If the request times out, restart it. On the server side, just block that thread while waiting for the task to finish. Even the most aggressive load balancers will allow up to a minute of hang time.

By putting control of the queue on the client, you can more easily prevent aggressive client-side clicking by disabling submission, or by artificially limiting the number of outstanding server requests using jQuery.ajaxPrefilter (or any other approach that feels right to you).

Alternatively, you could invest in a WebSocket communication layer so the server can proactively ping the client, but long polling is the common fallback for WebSockets and is more than enough for this use case (I think... it depends on how often, and how many of, these calls you need).
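
The client loop described above can be sketched as follows (plain JavaScript; `startWork` and `pollResult` are illustrative stand-ins for the token request and the blocking poll):

```javascript
// Long-polling client sketch: one request starts the work and returns a
// token; subsequent requests block server-side until the result is ready,
// and a timed-out poll is simply reissued.
async function runWithLongPoll(startWork, pollResult) {
  const token = await startWork();          // e.g. GET /start -> token
  for (;;) {
    const reply = await pollResult(token);  // server blocks until done or timeout
    if (reply.done) return reply.result;    // finished: hand the result to the UI
    // reply.done === false means the poll timed out; loop and poll again
  }
}
```

From the user's perspective this looks synchronous: the result renders as soon as the server finishes, with at most one poll in flight per task.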

Delaney
  • Seems like this would work for the front end. As for the back end, are you suggesting any standard queue and storing the results in a persistent data store like a database? – Marcus Aug 27 '14 at 22:16
  • Depends on your back-end architecture, but it needn't be complex. You could continue the calculation on a standard web response thread (send and close the response, but keep working) and store the result in a session variable, for all it matters. – Delaney Aug 31 '14 at 18:07