
I have a helper method that returns IEnumerable<string>. As the collection grows, it slows down dramatically. My current approach is essentially the following:

var results = new List<string>();
foreach (var item in items)
{
    results.Add(await item.Fetch());
}

I'm not actually sure whether this asynchronicity gives me any benefit (it sure doesn't seem like it), but all methods up the stack and to my controller's actions are asynchronous:

public async Task<IHttpActionResult> FetchAllItems()

As this code is ultimately used by my API, I'd really like to parallelize these all for what I hope would be great speedup. I've tried .AsParallel:

var results = items
    .AsParallel()
    .Select(i => i.Fetch().Result)
    .ToList();
return results;

And .WhenAll (returning a string[]):

var tasks = items.Select(i => i.Fetch());
return Task<string>.WhenAll<string>(tasks).Result;

And a last-ditch effort of firing off all long-running jobs and sequentially awaiting them (hoping that they were all running in parallel, so waiting on one would let all others nearly complete):

var tasks = new LinkedList<Task<string>>();
foreach (var item in items)
    tasks.AddLast(item.Fetch());

var results = new LinkedList<string>();
foreach (var task in tasks)
    results.AddLast(task.Result);

In every test case, the time it takes to run is directly proportional to the number of items. There's no discernable speedup by doing this. What am I missing in using Tasks and await/async?

user655321
  • Async will not give you benefits of improved performance of your algorithm, rather, it will allow your main thread (UI) to remain responsive while the work is being done. – John Koerner Oct 31 '14 at 02:49
  • If your tasks are CPU-bound, you should see a speedup proportional to the number of CPUs, unless your code has something (like large chunks of code under a shared `lock`) that completely prevents it from running in parallel, or the CPU is already at 100% due to other processing. – Alexei Levenkov Oct 31 '14 at 02:54
  • @JohnKoerner: Wouldn't `.AsParallel` give speedup for these async tasks? I originally started looking at tweaking my `foreach` to a `Parallel.ForEach`, then I wandered off a bit into the async/await world. – user655321 Oct 31 '14 at 02:58
  • Yes, `AsParallel` will allow you to use more threads to accomplish the task. Using something like a `ConcurrentBag` may help, also run your tests in Release mode to determine the actual performance of your code. – John Koerner Oct 31 '14 at 03:03
  • Have you fully profiled this code? The finding that there's no discernable speedup even when you parallelize this specific function suggests that the issue may lie elsewhere. – furkle Oct 31 '14 at 03:12
  • What exactly is `Fetch` doing? Does it execute IO-bound work or CPU-bound work? – Yuval Itzchakov Oct 31 '14 at 06:19
  • @furkle: `Fetch` is an IO-bound process. It downloads some data from a remote server. – user655321 Oct 31 '14 at 15:30

2 Answers

There's a difference between parallel and concurrent. Concurrency just means doing more than one thing at a time, whereas parallel means doing more than one thing on multiple threads. async is great for concurrency, but doesn't (directly) help you with parallelism.

As a general rule, parallelism on ASP.NET should be avoided. This is because any parallel work you do (e.g., AsParallel, Parallel.ForEach) shares the same thread pool as ASP.NET, which reduces ASP.NET's capacity to handle other requests. This impacts the scalability of your web service. It's best to leave the thread pool to ASP.NET.

However, concurrency is just fine - specifically, asynchronous concurrency. This is where Task.WhenAll comes in. Code like this is what you should be looking for (note that there is no call to Task<T>.Result):

var tasks = items.Select(i => i.Fetch());
return await Task.WhenAll(tasks);

Given your other code samples, it would be good to run through your call tree starting at Fetch and replace all Result calls with await. This may be (part of) your problem, because Result forces synchronous execution.
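
To make the difference concrete, here is a minimal sketch that compares awaiting each fetch in turn with starting them all and awaiting Task.WhenAll. FetchAsync is a hypothetical stand-in for item.Fetch() that simulates I/O-bound work with Task.Delay; the class name, method names, and the 200 ms delay are assumptions for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class FetchDemo
{
    // Hypothetical stand-in for item.Fetch(): simulates an I/O-bound call.
    public static async Task<string> FetchAsync(int id)
    {
        await Task.Delay(200);
        return "item " + id;
    }

    // Sequential: each fetch starts only after the previous one has completed.
    public static async Task<List<string>> FetchSequentiallyAsync(IEnumerable<int> ids)
    {
        var results = new List<string>();
        foreach (var id in ids)
            results.Add(await FetchAsync(id));
        return results;
    }

    // Concurrent: start every fetch first, then await all of them together.
    public static async Task<string[]> FetchConcurrentlyAsync(IEnumerable<int> ids)
    {
        // ToList() runs the lazy Select immediately, so every task is started here.
        var tasks = ids.Select(id => FetchAsync(id)).ToList();
        // Results come back in the same order as the tasks were supplied.
        return await Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var ids = Enumerable.Range(1, 5).ToList();

        var sw = Stopwatch.StartNew();
        await FetchSequentiallyAsync(ids);
        Console.WriteLine($"sequential: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        await FetchConcurrentlyAsync(ids);
        Console.WriteLine($"concurrent: {sw.ElapsedMilliseconds} ms");
    }
}
```

With a simulated delay like this, the sequential time should grow with the number of items while the concurrent time should stay close to a single delay. Real fetches will only behave this way if the remote resource allows concurrent access.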

Another possible problem is that the underlying resource being fetched does not support concurrent access, or there may be throttling that you're not aware of. E.g., if Fetch retrieves data from another web service, check out System.Net.ServicePointManager.DefaultConnectionLimit.

Stephen Cleary
  • Perhaps I am just not grokking `async`/`await` and I want to see the result itself in the debugger rather than the Task. If my whole call chain (all the way through to the controller's action) is `async`, can I simply implement `WhenAll` (as in your code example) and improve performance over my current `foreach` implementation? – user655321 Oct 31 '14 at 15:34
  • Yes, as long as the underlying resource permits concurrent access and there's no throttling. – Stephen Cleary Oct 31 '14 at 15:39

There is also a configurable limit on the maximum number of connections to a single server, which can make download performance independent of the number of client threads.

To change the connection limit, use ServicePointManager.DefaultConnectionLimit.
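
As a rough sketch, the limit can be raised once at application startup, before any requests are made, because the value is applied to ServicePoint objects created afterwards. The helper name RaiseConnectionLimit and the value 20 are assumptions for illustration; the right number depends on what the remote server tolerates.

```csharp
using System;
using System.Net;

public static class ConnectionLimitDemo
{
    // Hypothetical helper: sets the default per-server connection limit
    // and returns the value now in effect.
    public static int RaiseConnectionLimit(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }

    public static void Main()
    {
        // 20 is an arbitrary example value, not a recommendation.
        int current = RaiseConnectionLimit(20);
        Console.WriteLine("DefaultConnectionLimit is now " + current);
    }
}
```

In an ASP.NET application the equivalent call would typically go in the startup code, such as Application_Start.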

See also: Maximum concurrent requests for WebClient, HttpWebRequest, and HttpClient