23

I have a Windows Service that runs various jobs based on a schedule. After determining which jobs to run, a list of schedule objects is sent to a method that iterates through the list and runs each job. The problem is that some jobs can take up to 10 minutes to run because of external database calls.

My goal is to keep one job from blocking the others in the queue, basically to have more than one run at a time. I thought that using async and await could solve this, but I've never used them before.

Current Code:

public static bool Load(List<Schedule> scheduleList)
{
    foreach (Schedule schedule in scheduleList)
    {
        Load(schedule.ScheduleId);
    }

    return true;
}

public static bool Load(int scheduleId)
{
    // make database and other external resource calls 
    // some jobs run for up to 10 minutes   

    return true;
}

I tried updating to this code:

public async static Task<bool> LoadAsync(List<Schedule> scheduleList)
{
    foreach (Schedule schedule in scheduleList)
    {
        bool result = await LoadAsync((int)schedule.JobId, schedule.ScheduleId);
    }

    return true;
}

public async static Task<bool> LoadAsync(int jobId, int scheduleId)
{
    // make database and other external resource calls 
    // some jobs run for up to 10 minutes   

    return true;
}

The issue is that the first LoadAsync waits for the job to finish before giving control back to the loop instead of allowing all the jobs to start.

I have two questions:

  1. High Level - Are async/await the best choice, or should I use a different approach?
  2. What needs to be updated to allow the loop to kick off all the jobs without blocking, but not allow the function to return until all jobs are completed?
Yuval Itzchakov
Josh

3 Answers

31

High Level - Are async/await the best choice, or should I use a different approach?

async-await is perfect for what you're attempting to do, which is running multiple I/O-bound tasks concurrently.

What needs to be updated to allow the loop to kick off all the jobs without blocking, but not allow the function to return until all jobs are completed?

Your loop currently waits because you await each call to LoadAsync. What you want is to start them all so they run concurrently, then wait for all of them to finish using Task.WhenAll:

public async static Task<bool> LoadAsync(List<Schedule> scheduleList)
{
   var scheduleTaskList = scheduleList.Select(schedule => 
                          LoadAsync((int)schedule.JobId, schedule.ScheduleId)).ToList();
   await Task.WhenAll(scheduleTaskList);

   return true;
}
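
To see the effect in isolation, here is a minimal, self-contained sketch of the same pattern. It is not the asker's real code: `FanOutSketch`, `LoadAllAsync`, and the `delayMs` parameter are hypothetical names, and `Task.Delay` merely stands in for the slow database calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class FanOutSketch
{
    // Hypothetical stand-in for a real job: Task.Delay simulates a slow,
    // I/O-bound call without blocking a thread while it waits.
    public static async Task<bool> LoadAsync(int scheduleId, int delayMs)
    {
        await Task.Delay(delayMs);
        return true;
    }

    // Starts every job first, then awaits them all together.
    public static async Task<bool> LoadAllAsync(IEnumerable<int> scheduleIds, int delayMs)
    {
        // ToList() forces the lazy Select to run now, so every job is started here.
        List<Task<bool>> jobs = scheduleIds.Select(id => LoadAsync(id, delayMs)).ToList();

        // Completes only when every job has completed.
        bool[] results = await Task.WhenAll(jobs);
        return results.All(ok => ok);
    }
}
```

Calling `await FanOutSketch.LoadAllAsync(new[] { 1, 2, 3, 4, 5 }, 200)` should finish in roughly the time of one job rather than the sum of all five, because the waits overlap.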
Yuval Itzchakov
  • Looks like a good solution. When I call LoadAsync from a non-async method, if I wrap that call in a try/catch block, will the exceptions be trapped from each individual job that was run? – Josh Sep 26 '14 at 23:34
  • @Josh If you call from a non-async method you will not get the exceptions until you await the call of the `WhenAll`. – Scott Chamberlain Sep 27 '14 at 00:35
  • Is there any advantage to your approach as opposed to @IUnknown's approach using Parallel.ForEach? – Josh Sep 28 '14 at 12:53
  • My approach uses no extra threads other than the one you executed `await Task.WhenAll` on. Also, [*`Parallel.ForEach` and `async-await` don't play along nicely*](http://stackoverflow.com/questions/11564506/nesting-await-in-parallel-foreach). Your lambda will end up being translated into `async void`, which should be avoided. – Yuval Itzchakov Sep 28 '14 at 12:58
  • Hello, I need your help. I am new to async and await and have used them in a for loop, but I don't know whether it's correct. Can you please help me? – Edit Sep 11 '17 at 06:27
  • Will this approach spawn 1000s of threads if the collection is big enough? Or will the runtime take care of thread pools? – Nandun Oct 04 '18 at 21:39
  • Can the drives access multiple files at the same time? What's the benefit of running all these load operations concurrently? Wouldn't it make more sense to have a single concurrent task that loads each asset synchronously? – Gavin Williams Mar 15 '21 at 11:42
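
On the exception question raised in the comments: awaiting the task returned by `Task.WhenAll` rethrows only the first failure, while that task's `Exception` property holds an `AggregateException` with every failure. The sketch below illustrates this under stated assumptions: `WhenAllErrorsSketch` and `LoadAllAndCollectErrorsAsync` are hypothetical names, and the job fails for odd ids purely for demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllErrorsSketch
{
    // Hypothetical job: fails for odd ids so the error path can be observed.
    public static async Task<bool> LoadAsync(int scheduleId)
    {
        await Task.Delay(50);
        if (scheduleId % 2 != 0)
            throw new InvalidOperationException("Job " + scheduleId + " failed");
        return true;
    }

    // Returns the messages of all failed jobs, not only the first one.
    public static async Task<List<string>> LoadAllAndCollectErrorsAsync(IEnumerable<int> scheduleIds)
    {
        // Task.WhenAll enumerates the sequence immediately, so all jobs start here.
        Task<bool[]> all = Task.WhenAll(scheduleIds.Select(id => LoadAsync(id)));
        try
        {
            await all; // rethrows only the first exception
        }
        catch (InvalidOperationException)
        {
            // The combined task is faulted, so its Exception property
            // holds an AggregateException with every job's failure.
            return all.Exception.InnerExceptions.Select(e => e.Message).ToList();
        }
        return new List<string>();
    }
}
```

A non-async caller that blocks on the returned task with `Wait()` or `Result` would instead see the failures wrapped in an `AggregateException`.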
4

For fan-out parallel async calls, you want to start each Task so the jobs begin running, then treat the tasks as futures (promises). You can simply await them together at the end, once all of them have finished.

The simplest way to do this is to turn your foreach loop into something like this:

List<Task<bool>> jobs = new List<Task<bool>>();
foreach (var schedule in scheduleList)
{
    Task<bool> job = LoadAsync((int) schedule.JobId, schedule.ScheduleId); // Start each job
    jobs.Add(job);
}
bool[] finishedJobStatuses = await Task.WhenAll(jobs); // Wait for all jobs to finish running
bool allOk = Array.TrueForAll(finishedJobStatuses, p => p);
Jorgen Thelin
-1

I would recommend using Parallel.ForEach. It is not asynchronous and runs each iteration in parallel. Exactly what you need.

public static bool Load(IEnumerable<Schedule> scheduleList)
{
    // set the number of threads you need
    // you can also set CancellationToken in the options
    var options = new ParallelOptions { MaxDegreeOfParallelism = 5 };

    Parallel.ForEach(scheduleList, options, async schedule =>
    {
        await Load(schedule.ScheduleId);
    });

    return true;
}
IUnknown
  • This produces a compile error -- "Cannot await a bool", so I still need to call an async version of Load that returns Task instead of bool. Is the primary advantage of this approach that I don't need to make the calling method that contains the Parallel.ForEach async? I assume error handling is the same with this as with an async caller. – Josh Sep 28 '14 at 13:01
  • The problem with this approach is that it doesn't wait to exit until all the jobs are finished. Using await Task.WhenAll forces the function to wait. – Josh Sep 28 '14 at 13:27
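
For comparison, a possible way to make the Parallel.ForEach variant compile and block until every job is done is to keep the lambda synchronous. This is only a sketch under assumptions: `ParallelForEachSketch` and `LoadAll` are hypothetical names, and `Thread.Sleep` stands in for the blocking database calls.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelForEachSketch
{
    // Hypothetical synchronous job; Thread.Sleep stands in for blocking database calls.
    public static bool Load(int scheduleId)
    {
        Thread.Sleep(50);
        return true;
    }

    // Runs the jobs on up to five threads and returns how many reported success.
    public static int LoadAll(IEnumerable<int> scheduleIds)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = 5 };
        int completed = 0;

        // With a synchronous lambda, Parallel.ForEach returns only after
        // every iteration has finished.
        Parallel.ForEach(scheduleIds, options, id =>
        {
            if (Load(id))
                Interlocked.Increment(ref completed); // thread-safe counter update
        });

        return completed;
    }
}
```

The trade-off is that each running job occupies a thread for its whole duration, whereas the async approach with Task.WhenAll holds no thread while a job is waiting on I/O.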