1

I'm working on an application that communicates with a web service. The client application (Silverlight 4.0) calls the web service and triggers a long-running task. Because the task takes some time to finish, it is executed in a separate thread (using System.Threading.Tasks.Task.Factory.StartNew() to create the separate task). After starting the task, the service call returns with an ID and the connection is closed.
This ID should let me identify the task later so I can communicate with it.

A subsequent call, possibly over another connection, checks whether the task is done. For this purpose, the ID is part of the call. The server then needs to check whether the task is still running or has finished. How do I find this task again?


The service is running on Azure and because of load balancing, the second call could be on a completely different system. It seems to me that this cannot be done, but then again...
This Q is related to this Q.

Wim ten Brink

3 Answers

4

Have the task report its status in table storage or the AppFabric cache. Then, when someone polls for the status, simply read the status for task ID X from whichever persistence mechanism you used.
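A minimal sketch of that idea, under some loud assumptions: all names here (TaskState, TaskStatusStore, LongRunningService) are hypothetical, and the ConcurrentDictionary is only an in-process stand-in for shared storage. On Azure with more than one instance, Report and Poll would have to go to table storage or a distributed cache instead, otherwise a poll that lands on another machine sees nothing.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical status values; Unknown covers IDs that were never reported.
public enum TaskState { Unknown, Running, Completed, Failed }

// Stand-in for shared storage. A real deployment would replace the dictionary
// with table storage or a distributed cache that every instance can reach.
public static class TaskStatusStore
{
    private static readonly ConcurrentDictionary<Guid, TaskState> Statuses =
        new ConcurrentDictionary<Guid, TaskState>();

    // Called by the task itself whenever its state changes.
    public static void Report(Guid taskId, TaskState state)
    {
        Statuses[taskId] = state;
    }

    // Called by the polling service call; needs only the ID.
    public static TaskState Poll(Guid taskId)
    {
        TaskState state;
        return Statuses.TryGetValue(taskId, out state) ? state : TaskState.Unknown;
    }
}

public static class LongRunningService
{
    // First service call: record the status, start the work, return the ID.
    public static Guid Start(Action work)
    {
        Guid taskId = Guid.NewGuid();
        TaskStatusStore.Report(taskId, TaskState.Running);

        Task.Factory.StartNew(() =>
        {
            try
            {
                work();
                TaskStatusStore.Report(taskId, TaskState.Completed);
            }
            catch (Exception)
            {
                // Record the failure so the poller does not wait forever.
                TaskStatusStore.Report(taskId, TaskState.Failed);
            }
        });

        return taskId;
    }
}
```

The second service call then never needs to find the Task object again; it only calls Poll with the ID it was given.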

maartenba
  • +1 to both answers - same idea at the same time. In extreme situations - where really close coupling is needed between client and server - then you could also use the sticky session pattern - http://dunnry.com/blog/2010/10/14/StickyHTTPSessionRoutingInWindowsAzure.aspx - but this is not for simple situations (IMO) – Stuart Jul 14 '11 at 09:32
  • I have to be honest here and say that maartenba was about 4 seconds quicker than me. – David Steele Jul 14 '11 at 09:56
  • The use of table storage/appfabric cache is tempting and would simplify much of the polling. But the worker role suggested by David Steele sounds interesting too... – Wim ten Brink Jul 14 '11 at 14:08
  • Also see this question (it relates to ajax, but I'm sure you can adapt it) for some code: http://stackoverflow.com/questions/6184752/set-timeout-for-controller-action/6192680#6192680 – knightpfhor Jul 14 '11 at 22:18
  • The worker role is in fact a good solution, but to be able to scale that to "infinity" you will also have to implement some sort of polling strategy that uses shared storage. – maartenba Jul 15 '11 at 10:02
3

The proper way of doing this is to use queue-based communication. The reason for this is scalability. You want "an instance" of your service to pick up the request, and you want "an instance" to return the result to the client, right?

You can take a quick look at one of my blog posts about AppFabric Queues, but they are too bulky for this. Here's how I'd do it:

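Create a WorkerRequest class, looking something like this:

public class WorkerRequest {
   // Identifies the client that asked for the work.
   public string clientId;
   // Which task the worker should perform.
   public MyTaskEnum taskToPerform;
}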

Add the request to Queue storage. (In my production code I'm using a wrapper which I haven't blogged about yet, but plan to :) )

Have a worker thread listen to this queue, and when a request is received, spawn a new thread to finish it. When you're done, write to table storage with your task & client id as your keys. That way you can always check the status (a simple GET request to the table), and decoupling and scalability are already solved.
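The flow can be sketched roughly like this. It is not the production wrapper mentioned above: QueueSketch, its method names, and the MyTaskEnum values are hypothetical, and the in-memory queue and dictionary are stand-ins for Queue storage and table storage so the example runs on its own.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical task kinds, for illustration only.
public enum MyTaskEnum { GenerateReport, ResizeImages }

public class WorkerRequest
{
    public string ClientId { get; set; }
    public MyTaskEnum TaskToPerform { get; set; }
}

public static class QueueSketch
{
    // Stand-in for Queue storage.
    private static readonly ConcurrentQueue<WorkerRequest> Requests =
        new ConcurrentQueue<WorkerRequest>();

    // Stand-in for table storage, keyed by client id and task.
    private static readonly ConcurrentDictionary<string, string> Results =
        new ConcurrentDictionary<string, string>();

    private static string Key(string clientId, MyTaskEnum task)
    {
        return clientId + "|" + task;
    }

    // Web role side: accept the request and return immediately.
    public static void Submit(WorkerRequest request)
    {
        Requests.Enqueue(request);
    }

    // Worker side: take one request, run it, store the result.
    // Returns false when the queue was empty.
    public static bool ProcessNext(Func<WorkerRequest, string> handler)
    {
        WorkerRequest request;
        if (!Requests.TryDequeue(out request))
        {
            return false;
        }
        Results[Key(request.ClientId, request.TaskToPerform)] = handler(request);
        return true;
    }

    // Web role side: the status check. Null means "not finished yet".
    public static string TryGetResult(string clientId, MyTaskEnum task)
    {
        string result;
        return Results.TryGetValue(Key(clientId, task), out result) ? result : null;
    }
}
```

Because Submit and TryGetResult only touch the queue and the result table, any web role instance can serve either call, which is the decoupling the answer is after.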

Hope it helps.

UPDATE: wanted to explain a bit more, so I decided to update the post =)

You can create a WCF web service in a "web role", which is what I would do. I blogged about it a while ago. In the same role, you create a Worker. You do that by having a class that implements RoleEntryPoint. This class (located in Microsoft.WindowsAzure.ServiceRuntime) looks like:

  public abstract class RoleEntryPoint
  {
    public virtual bool OnStart()
    {
      return true;
    }

    public virtual void Run()
    {
      Thread.Sleep(-1);
    }

    public virtual void OnStop()
    {
    }
  }

You simply implement a while(true) loop in Run that asks the queue whether there are any new messages to process. In the simplest version, when a message is received, do not spawn a new thread; just process it. If you want to scale, add new instances. Obviously this can be costly, so in practice it is wise to spawn a new thread per message, but only up to a certain limit, e.g. max 5 threads. If there are no free threads in your pool, return the message to the queue (you need to call Complete() when you're done with a message, otherwise it doesn't necessarily get deleted). It will get picked up later, or by another worker.
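Here is one way the bounded version of that loop could look. Treat it as a sketch under assumptions: BoundedWorker and PollOnce are hypothetical names, the ConcurrentQueue of strings stands in for the real queue, and "returning the message" is modelled by simply re-enqueueing it. With a real queue you would use whatever release or complete calls your queue client offers, which are not shown here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class BoundedWorker
{
    private readonly ConcurrentQueue<string> _queue;   // stand-in for the real queue
    private readonly SemaphoreSlim _slots;             // one slot per allowed thread
    private readonly Action<string> _process;

    public BoundedWorker(ConcurrentQueue<string> queue, int maxThreads, Action<string> process)
    {
        _queue = queue;
        _slots = new SemaphoreSlim(maxThreads);
        _process = process;
    }

    // One iteration of the loop. Returns true if a message was started.
    public bool PollOnce()
    {
        string message;
        if (!_queue.TryDequeue(out message))
        {
            return false; // nothing to do
        }

        // Wait(0) tries to take a slot without blocking.
        if (!_slots.Wait(0))
        {
            // All threads are busy: hand the message back so it is picked up
            // later, or by another worker instance.
            _queue.Enqueue(message);
            return false;
        }

        Task.Factory.StartNew(() =>
        {
            try
            {
                _process(message); // would also write the result to table storage
            }
            finally
            {
                _slots.Release(); // free the slot even if processing throws
            }
        });
        return true;
    }

    // Intended to be called from an override of RoleEntryPoint.Run().
    public void Run(CancellationToken stop)
    {
        while (!stop.IsCancellationRequested)
        {
            if (!PollOnce())
            {
                Thread.Sleep(1000); // back off when idle or saturated
            }
        }
    }
}
```

Passing 5 as maxThreads gives the "max 5 threads" limit from the text; the one-second back-off is an arbitrary choice for the sketch.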

So, when the worker thread finishes, write the result to table storage and you're done.

Anže Vodovnik
2

The task just needs to persist somewhere that it is completed and store the result if a result is expected.

You could store the completion status with the task ID in Table Storage or SQL Azure, whichever you have available. The subsequent polling call then just checks this storage and returns whether the task has completed.

The other way to solve this problem is to have the long running task run in a worker role. If this worker role exposed an internal endpoint then any of the web roles would be able to ask the worker role if it had finished.

David Steele
  • Worker roles? Hmmmm... Interesting concept. Haven't thought about those. – Wim ten Brink Jul 14 '11 at 14:20
  • If the tasks are long running but not too CPU or IO intensive then you can run them in a separate thread on your web role and save money. If they are hardworking tasks then it may be best to push them off into a separate worker role so as to not have an adverse effect on your web role. – David Steele Jul 15 '11 at 06:52