I have been researching (including looking at all other SO posts on this topic) the best way to implement a (most likely) Windows Service worker that will pull items of work from a database and process them in parallel asynchronously in a 'fire-and-forget' manner in the background (the work item management will all be handled in the asynchronous method). The work items will be web service calls and database queries. There will be some throttling applied to the producer of these work items to ensure some kind of measured approach to scheduling the work.

The examples below are very basic and are just there to highlight the logic of the while loop and for loop in place. Which is the ideal method or does it not matter? Is there a more appropriate/performant way of achieving this?

async/await...

    private static int counter = 1;

    static void Main(string[] args)
    {
        Console.Title = "Async";

        Task.Run(() => AsyncMain());

        Console.ReadLine();            
    }

    private static async void AsyncMain()
    {
        while (true)
        {
            // Imagine calling a database to get some work items to do, in this case 5 dummy items
            for (int i = 0; i < 5; i++)
            {
                var x = DoSomethingAsync(counter.ToString());

                counter++;
                Thread.Sleep(50);
            }

            Thread.Sleep(1000);
        }
    }

    private static async Task<string> DoSomethingAsync(string jobNumber)
    {
        try
        {
            // Simulated mostly IO work - some could be long running
            await Task.Delay(5000);
            Console.WriteLine(jobNumber);
        }
        catch (Exception ex)
        {
            LogException(ex);
        }

        Log("job {0} has completed", jobNumber);

        return "fire and forget so not really interested";
    }

Task.Run...

    private static int counter = 1;

    static void Main(string[] args)
    {
        Console.Title = "Task";

        while (true)
        {
            // Imagine calling a database to get some work items to do, in this case 5 dummy items
            for (int i = 0; i < 5; i++)
            {
                var x = Task.Run(() => { DoSomethingAsync(counter.ToString()); });

                counter++;
                Thread.Sleep(50);
            }

            Thread.Sleep(1000);
        }
    }

    private static string DoSomethingAsync(string jobNumber)
    {
        try
        {
            // Simulated mostly IO work - some could be long running
            Task.Delay(5000);
            Console.WriteLine(jobNumber);
        }
        catch (Exception ex)
        {
            LogException(ex);
        }

        Log("job {0} has completed", jobNumber);

        return "fire and forget so not really interested";
    }
svick
user2231663
  • All of your code is wrong. Don't use `async void`; don't use async unless you actually have async operations; don't run async things without waiting for them. – SLaks May 11 '16 at 17:35
  • You need to learn the basics of async. Read https://msdn.microsoft.com/en-us/magazine/jj991977.aspx – SLaks May 11 '16 at 17:36
  • @SLaks -> So how would you go about essentially performing multiple tasks in parallel in a single process like this? – user2231663 May 11 '16 at 18:19
  • Use Parallel LINQ. – SLaks May 11 '16 at 18:23
  • @SLaks You are not correct. There is nothing wrong with calling an async method without await if you need fire-and-forget behavior. Parallel LINQ is much worse in terms of both performance and obscurity. – Zverev Evgeniy May 11 '16 at 18:33
  • @SLaks, would using Parallel LINQ not wait for all LINQ tasks to complete before continuing with the while loop 'tick'? If so this is not what I want. Ignore the async version then, I just want to start 'some work' in the background and continue listening for more work to start while the work continues in the background. Is Task.Run not suitable for this? – user2231663 May 11 '16 at 18:50
  • Then you probably do want `Task.Run`, but your question is not very clear. – SLaks May 11 '16 at 18:53
  • @user2231663, check [this](http://stackoverflow.com/a/25009215/1768303). – noseratio May 11 '16 at 19:48
  • @Noseratio, thanks, hadn't actually come across that one. It's a slightly different way of doing things. I guess you could build in a more dynamic allocation of workers in your outlined approach which is what I like hence my original Task.Run method (which is actually the method I am currently using) with a throttled producer. Is there anything to watch for with my method? It would be good to know how best to manage the throttling, so what metrics to look for (threads, memory, network utilization etc) given the Task.Run approach in order to get best performance and balance of work – user2231663 May 11 '16 at 21:42
  • @user2231663, I'd stick with the `Task.Run` approach. It's very easy to throttle using `SemaphoreSlim` like [this](http://stackoverflow.com/a/22493662), or you might be even better off using TPL Dataflow. – noseratio May 12 '16 at 01:30

2 Answers

> pull items of work from a database and process them in parallel asynchronously in a 'fire-and-forget' manner in the background

Technically, you want concurrency. Whether you want asynchronous concurrency or parallel concurrency remains to be seen...

> The work items will be web service calls and database queries.

The work is I/O-bound, so that implies asynchronous concurrency as the more natural approach.
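
To make the distinction concrete, here is a minimal sketch (not taken from the question) in which `Task.Delay` stands in for a web service call or database query. The names `AsyncConcurrencyDemo`, `FakeIoAsync`, and `RunAllAsync` are hypothetical, and `async Task Main` assumes C# 7.1 or later. All operations are started first and then awaited together, so they overlap in time without a thread being dedicated to each one:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncConcurrencyDemo
{
    // Hypothetical stand-in for an I/O-bound call (web service, database query).
    public static async Task<int> FakeIoAsync(int id)
    {
        await Task.Delay(200); // no thread is blocked while the "I/O" is pending
        return id;
    }

    // Starts every operation first, then awaits them all together.
    public static async Task<int[]> RunAllAsync(int count)
    {
        Task<int>[] tasks = Enumerable.Range(1, count)
                                      .Select(id => FakeIoAsync(id))
                                      .ToArray();
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var sw = Stopwatch.StartNew();
        int[] results = await RunAllAsync(5);
        Console.WriteLine($"{results.Length} operations finished in {sw.ElapsedMilliseconds} ms");
    }
}
```

Because the delays overlap, the total elapsed time should be close to a single delay rather than five of them.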

> There will be some throttling applied to the producer of these work items to ensure some kind of measured approach to scheduling the work.

The idea of a producer/consumer queue is implied here. That's one option. TPL Dataflow provides some nice producer/consumer queues that are async-compatible and support throttling.

Alternatively, you can do the throttling yourself. For asynchronous code, there's a built-in throttling mechanism called SemaphoreSlim.


TPL Dataflow approach, with throttling:

    // Requires: using System.Threading.Tasks.Dataflow;
    private static int counter = 1;

    static void Main(string[] args)
    {
        Console.Title = "Async";
        var x = Task.Run(() => MainAsync());
        Console.ReadLine();
    }

    private static async Task MainAsync()
    {
        var blockOptions = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 7 // at most 7 work items are processed at once
        };
        var block = new ActionBlock<string>(DoSomethingAsync, blockOptions);
        while (true)
        {
            var dbData = await ...; // Imagine calling a database to get some work items to do, in this case 5 dummy items
            for (int i = 0; i < 5; i++)
            {
                block.Post(counter.ToString());
                counter++;
                await Task.Delay(50); // asynchronous pause; does not block a thread
            }
            await Task.Delay(1000);
        }
    }

    private static async Task DoSomethingAsync(string jobNumber)
    {
        try
        {
            // Simulated mostly IO work - some could be long running
            await Task.Delay(5000);
            Console.WriteLine(jobNumber);
        }
        catch (Exception ex)
        {
            LogException(ex);
        }
        Log("job {0} has completed", jobNumber);
    }

Asynchronous concurrency approach with manual throttling:

    private static int counter = 1;
    private static SemaphoreSlim semaphore = new SemaphoreSlim(7); // at most 7 work items run at once

    static void Main(string[] args)
    {
        Console.Title = "Async";
        var x = Task.Run(() => MainAsync());
        Console.ReadLine();
    }

    private static async Task MainAsync()
    {
        while (true)
        {
            var dbData = await ...; // Imagine calling a database to get some work items to do, in this case 5 dummy items
            for (int i = 0; i < 5; i++)
            {
                var x = DoSomethingAsync(counter.ToString()); // fire and forget
                counter++;
                await Task.Delay(50); // asynchronous pause; does not block a thread
            }
            await Task.Delay(1000);
        }
    }

    private static async Task DoSomethingAsync(string jobNumber)
    {
        // Wait (asynchronously) for a free slot before starting the work.
        await semaphore.WaitAsync();
        try
        {
            try
            {
                // Simulated mostly IO work - some could be long running
                await Task.Delay(5000);
                Console.WriteLine(jobNumber);
            }
            catch (Exception ex)
            {
                LogException(ex);
            }
            Log("job {0} has completed", jobNumber);
        }
        finally
        {
            // Always free the slot, even if the work failed.
            semaphore.Release();
        }
    }

As a final note, I hardly ever recommend my own book on SO, but I do think it would really benefit you. In particular, sections 8.10 (Blocking/Asynchronous Queues), 11.5 (Throttling), and 4.4 (Throttling Dataflow Blocks).

Stephen Cleary

First of all, let's fix a few things.

In the second example you are calling

    Task.Delay(5000);

without await. That is a bad idea: it creates a new Task that completes after 5 seconds, but nothing waits for it, so the method continues immediately. Task.Delay is only useful with await. Mind you, do not use Task.Delay(5000).Wait() either: it blocks the calling thread, and blocking on asynchronous code can lead to deadlocks.
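
The difference can be seen in a small sketch that measures how long each variant takes to return. The class and method names (`DelayDemo`, `ElapsedWithoutAwait`, `ElapsedWithAwaitAsync`) are hypothetical, the delay is shortened to 500 ms, and `async Task Main` assumes C# 7.1 or later:

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class DelayDemo
{
    // Task.Delay without await: the timer starts, but the method carries on immediately.
    public static long ElapsedWithoutAwait(int delayMs)
    {
        var sw = Stopwatch.StartNew();
        Task ignored = Task.Delay(delayMs); // the returned task is never observed
        return sw.ElapsedMilliseconds;
    }

    // Task.Delay with await: the method resumes only after the delay has elapsed.
    public static async Task<long> ElapsedWithAwaitAsync(int delayMs)
    {
        var sw = Stopwatch.StartNew();
        await Task.Delay(delayMs);
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        Console.WriteLine($"without await: {ElapsedWithoutAwait(500)} ms");
        Console.WriteLine($"with await:    {await ElapsedWithAwaitAsync(500)} ms");
    }
}
```

The first figure should be close to zero and the second close to the requested delay, which is why the un-awaited call in the question does not simulate any work.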

In your second example you are effectively making the DoSomethingAsync method synchronous, so let's call it DoSomethingSync and replace the Task.Delay(5000); with Thread.Sleep(5000);

Now, the second example is almost the old-school ThreadPool.QueueUserWorkItem. There is nothing wrong with that as long as you are not using an already-async API inside. Used in a fire-and-forget fashion, Task.Run and ThreadPool.QueueUserWorkItem do essentially the same thing. I would use the latter for clarity.
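
As a rough sketch of that synchronous variant, the snippet below queues DoSomethingSync both ways. The class name `FireAndForgetDemo` and the `Completed` counter are hypothetical additions for illustration, the sleep is shortened to 200 ms, and `Console.Error.WriteLine` stands in for the question's LogException:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FireAndForgetDemo
{
    private static int completed;

    // Number of work items that have finished so far (illustration only).
    public static int Completed => Volatile.Read(ref completed);

    // Synchronous version of the work item: it blocks a thread-pool thread while it runs.
    public static void DoSomethingSync(string jobNumber)
    {
        try
        {
            Thread.Sleep(200); // simulated blocking work
            Console.WriteLine(jobNumber);
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine(ex); // stand-in for LogException
        }
        Interlocked.Increment(ref completed);
    }

    public static void Main()
    {
        // Both calls queue the work to the thread pool and return immediately.
        Task.Run(() => DoSomethingSync("1"));
        ThreadPool.QueueUserWorkItem(_ => DoSomethingSync("2"));

        Thread.Sleep(1000); // demo only: give the background work time to finish
        Console.WriteLine($"completed: {Completed}");
    }
}
```

Each queued item occupies a pool thread for its whole duration, which is the trade-off against the awaitable version when the work is mostly I/O.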

This brings us to the answer to the main question: async or not async? I would say: "Do not create async methods unless you have to use some async I/O inside your code." If, however, there is an async API you have to use, then the first approach is what those reading your code years later would expect.

Zverev Evgeniy