I want to use Parallel.ForEach to process one collection and return another, but the collection I get back is empty. There is also an asynchronous method that needs to be executed inside Parallel.ForEach.

This is a console app targeting .NET Core 2.2 on Windows 10.

        public static ConcurrentBag<int> GetList()
        {
            ConcurrentBag<int> result = new ConcurrentBag<int>() ;
            List<int> list = new List<int> { 1, 2, 3 };
            Parallel.ForEach(list, async i => {
                await Task.Delay(i*1000);
                result.Add(i * 2);
            });
            return result;
        }
        public static void Main(String[] args)
        {
            List<int> list = new List<int>();
            var res = GetList();
            list.AddRange(res);
            Console.WriteLine("Beginning.");
            foreach (var item in list)
            {
                Console.WriteLine(item);
            }
            Console.ReadLine();
        }

I expect {2,4,6}, but I actually get an empty collection.

zhusp
  • Why are you using `async` and `await Task.Delay(i*1000);`? `Parallel.ForEach` doesn't play well with `async`. – mjwills Sep 17 '19 at 07:57
  • What you might want to consider doing is `return list.AsParallel(stuffhere).ToList();`. _I mean, it is pointless, but it will give you parallelism._ – mjwills Sep 17 '19 at 07:58
  • Don't use `Parallel.ForEach` for async code. It's meant for data parallelism **only** and can't await any tasks. What you did is fire off a bunch of tasks that may not even run before your application terminates. Parallel.ForEach has *no* way to wait for those tasks – Panagiotis Kanavos Sep 17 '19 at 07:59
  • ConcurrentBag is not a good choice either. Unlike the other concurrent collections it's a *specialized* class that uses thread-local storage to allow faster access to the thread that created an object. This means that the main thread will take longer to read the results than if you used eg a ConcurrentQueue – Panagiotis Kanavos Sep 17 '19 at 08:01

1 Answer

`await` is the culprit. What you need to understand is that `await` is a fancy `return`. If the caller doesn't understand tasks, all it sees is the return. `Parallel.ForEach` doesn't expect a delegate that returns a `Task`: it takes an `Action<int>`, so your async lambda is compiled as `async void`, and `Parallel.ForEach` has no way to wait for the awaited work to complete.

The `Parallel.ForEach` call finishes almost as soon as it starts, long before anything gets written to `result`.
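To make that concrete, here is a minimal sketch of the same loop with the delegate type written out. The class and method names are made up for illustration, and the `Thread.Sleep(4000)` is only a crude way to let the orphaned continuations finish for the demo, not something to do in real code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncVoidDemo
{
    public static (int before, int after) CountBeforeAndAfter()
    {
        var result = new ConcurrentBag<int>();
        var list = new List<int> { 1, 2, 3 };

        // The async lambda is converted to Action<int>, i.e. "async void".
        // No Task is handed back to the caller, so there is nothing to wait on.
        Action<int> body = async i =>
        {
            await Task.Delay(i * 1000); // the delegate returns to ForEach here
            result.Add(i * 2);          // runs later, on a thread pool thread
        };

        Parallel.ForEach(list, body);
        int before = result.Count;      // typically 0: no delay has elapsed yet

        Thread.Sleep(4000);             // demo only: outlast the longest delay
        int after = result.Count;       // by now the additions have normally happened

        return (before, after);
    }

    public static void Main()
    {
        var counts = CountBeforeAndAfter();
        Console.WriteLine($"Right after ForEach: {counts.before}");
        Console.WriteLine($"After waiting:       {counts.after}");
    }
}
```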

Now, `Parallel` is meant for CPU-bound operations, where the delegates are synchronous, so this is not normally a problem. If you want to simulate a long-running CPU-bound operation, use `Thread.Sleep` instead of `await Task.Delay`.
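Applied to the code from the question, that change might look like the sketch below. Only the loop body differs; the class name is made up, and the sort in `Main` is there because `ConcurrentBag` makes no ordering guarantee.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SyncParallelDemo
{
    public static ConcurrentBag<int> GetList()
    {
        var result = new ConcurrentBag<int>();
        var list = new List<int> { 1, 2, 3 };

        // The delegate is synchronous, so Parallel.ForEach returns only
        // after every iteration has run to completion.
        Parallel.ForEach(list, i =>
        {
            Thread.Sleep(i * 1000); // stand-in for CPU-bound work
            result.Add(i * 2);
        });

        return result;
    }

    public static void Main()
    {
        // Sort before printing, since the bag is unordered.
        foreach (var item in GetList().OrderBy(x => x))
        {
            Console.WriteLine(item); // prints 2, 4, 6
        }
    }
}
```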

As a side note, how would you parallelize a task-based operation that is I/O bound? The simplest way would be something like this:

await Task.WhenAll(list.Select(YourAsyncOperation));

Where `YourAsyncOperation` is an async method returning `Task`, which can use `Task.Delay` as much as you want. The main problem with this simple approach is that you must be sure that `YourAsyncOperation` actually reaches an `await` early, and ideally doesn't use a synchronization context. An async method runs synchronously on the calling thread until its first incomplete `await`, so in the worst case all of the calls are going to be serialized. Well, really, in the absolute worst case, you get a deadlock, but... :)
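For the example in the question, a task-based version could be sketched roughly as follows. `DoubleAfterDelayAsync` is a hypothetical stand-in for `YourAsyncOperation`, and `GetListAsync` is an assumed replacement for `GetList`. It relies on `async Main`, which needs C# 7.1 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for YourAsyncOperation: it awaits right away,
    // so each call hands back its Task almost immediately.
    public static async Task<int> DoubleAfterDelayAsync(int i)
    {
        await Task.Delay(i * 1000);
        return i * 2;
    }

    public static async Task<int[]> GetListAsync()
    {
        var list = new List<int> { 1, 2, 3 };

        // Select starts one operation per item; Task.WhenAll completes when
        // all of them have, and returns the results in input order.
        return await Task.WhenAll(list.Select(i => DoubleAfterDelayAsync(i)));
    }

    public static async Task Main()
    {
        foreach (var item in await GetListAsync())
        {
            Console.WriteLine(item); // prints 2, 4, 6
        }
    }
}
```

Because the delays overlap, this should take about as long as the longest single delay rather than the sum of all three, and no concurrent collection is needed because each task returns its own value.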

Luaan