0

I have a list of tasks that I want to execute in parallel using Parallel.ForEach. It starts fine with 4 tasks running in parallel but in the end it decreases to only one task at a time. Here is the count of parallel tasks in time:

1 2 3 4 4 3 4 4 ... 4 4 4 3 3 1 1 1 1 1 1 1

Max degree of parallelism is set to 4. At the end of execution only one task is executed at one time and all executions run on the same thread. My question is why I am getting this one task at a time execution in the end? How can I avoid this?

Here is the code:

var threadCount = 4;
ThreadPool.SetMinThreads(threadCount, threadCount);
Parallel.ForEach(taskDataList, 
    new ParallelOptions() {MaxDegreeOfParallelism = threadCount},
    (x) => { RunOne(x); });

RunOne function starts external process and waits for it to end. Some suspected that RunOne could have been the problem of lack of parallel execution. To make sure that this is not the case I recreated situation by replacing this function with a sleep call of identical duration. The code is below. Here t is the list of seconds each task takes. activeCount is the number of currently running tasks and remaining is the number of tasks that still remain in the list.

var t = new List<int>()   
{2,2,2,1,1,1,1,1,1,1,
 1,1,1,1,1,3,1,1,1,1,
 1,1,1,1,1,1,1,1,5,4,
 26,12,11,16,44,4,37,26,13,36};
int activeCount = 0;
int remaining = t.Count;
Parallel.ForEach(t, new ParallelOptions() {MaxDegreeOfParallelism = 4},
    (x) =>
    {
        Console.WriteLine($"Active={Interlocked.Increment(ref activeCount)}"+
            $"Remaining={Interlocked.Decrement(ref remaining)} " +
            $"Run thread={Thread.CurrentThread.ManagedThreadId}");
        Thread.Sleep(x * 1000); //Sleep x seconds
        Interlocked.Decrement(ref activeCount);
    });

At the very end it produces output like this:

Active=2 Remaining=7 Run thread=3
Active=1 Remaining=6 Run thread=3
Active=1 Remaining=5 Run thread=3
Active=1 Remaining=4 Run thread=3
Active=1 Remaining=3 Run thread=3
Active=1 Remaining=2 Run thread=3
Active=1 Remaining=1 Run thread=3
Active=1 Remaining=0 Run thread=3

This output shows that in the end only 1 task is running when 6 tasks still remain. With limit of 4 parallel tasks it does not make any sense. When 6 tasks are still available I would expect to see 4 tasks running in parallel.

Should I use Parallel.ForEach differently or is it a bug/feature?

Optional Option
  • 1,521
  • 13
  • 33
  • 2
    What is the code in `RunOne()`? – John Koerner Oct 09 '17 at 20:42
  • RunOne invokes external process and waits for it to complete. – Optional Option Oct 09 '17 at 20:45
  • 1
    The code that you've shown is 100% robust and works correctly. The code you haven't shown - `RunOne(...)` - is probably at fault here; can you please show that method? – Enigmativity Oct 09 '17 at 20:58
  • +1. The problem might be in how you count the number of parallel executions; at least show us this part of the code in RunOne(). – Sergey L Oct 09 '17 at 21:01
  • Questions seeking debugging help ("why isn't this code working?") must include the desired behavior, a specific problem or error and the shortest code necessary to reproduce it in the question itself. Questions without a clear problem statement are not useful to other readers. – Enigmativity Oct 09 '17 at 21:25
  • My app is launching external program so I can see the end of execution myself where instead of 4 instances I see 5 times one instance starting. Even thread Id is the same so there is no way I am counting it incorrectly. – Optional Option Oct 09 '17 at 22:28
  • @OptionalOption - And what happens when you remove the code that creates processes? – Enigmativity Oct 10 '17 at 09:10
  • See update 3. I replaced process invocation with sleep of the same duration. There is something wrong with the way Parallel.ForEach distributes tasks or maybe I am just not using it right. But what is the right way? – Optional Option Oct 10 '17 at 14:39
  • @OptionalOption: you still need to show us what the code for `RunOne()` looks like. If you don't think that's the problem, then write a simple for-each loop and spawn a new thread manually to execute `RunOnce()` - and see if it still blocks. – code4life Oct 10 '17 at 15:04
  • Man, I simplified the code into an oblivion. There is a simple sleep instead of RunOne and I still get the same problem. RunOne is not a problem here. On the other hand here is the code void RunOne(int x) { Thread.Sleep(x * 1000); } – Optional Option Oct 10 '17 at 15:05
  • You could also make `RunOne` non-blocking and avoid `Parallel.ForEach`, implementing concurrency instead of parallelism. https://stackoverflow.com/a/31492250/242520 and https://github.com/jamesmanning/RunProcessAsTask – ta.speot.is Oct 14 '17 at 04:38

1 Answers1

1

After looking at reference source of Parallel.ForEach I found out that instead of distributing elements to different threads one by one it splits the list of tasks into chunks and then gives the list of tasks to each thread. It is very inefficient approach for long running tasks

        var t = new List<int>()
            {2,2,2,1,1,1,1,1,1,1,
             1,1,1,1,1,3,1,1,1,1,
             1,1,1,1,1,1,1,1,5,4,
             26,12,11,16,44,4,37,26,13,36};
        int activeCount = 0;
        int remaining = t.Count;
        var cq = new ConcurrentQueue<int>(t);
        var tasks = new List<Task>();
        for (int i = 0; i < 4; i++) tasks.Add(Task.Factory.StartNew(() => 
        {
            int x;
            while (cq.TryDequeue(out x))
            {
                Console.WriteLine($"Active={Interlocked.Increment(ref activeCount)} " +
                    $"Remaining={Interlocked.Decrement(ref remaining)} " +
                    $"Run thread={Thread.CurrentThread.ManagedThreadId}");
                Thread.Sleep(x * 1000); //Sleep x seconds
                Interlocked.Decrement(ref activeCount);
            }
        }));
        Task.WaitAll(tasks.ToArray());

I used 4 parallel tasks as in the first code example. Execution time in this case was 83 seconds when using Parallel.ForEach took 211 seconds. This just proves that Parallel.ForEach is very inefficient in certain cases and that it should be used with caution.

Optional Option
  • 1,521
  • 13
  • 33
  • *I found out that instead of distributing elements to different threads one by one it splits the list of tasks into chunks and then gives the list of tasks to each thread* There is an overload of `Parallel.ForEach` that uses a partitioner you supply and you can partition the tasks however you want e.g. https://msdn.microsoft.com/en-us/library/dd381768(v=vs.110).aspx#Examples – ta.speot.is Oct 14 '17 at 04:32
  • *It is very inefficient approach for long running tasks* There is also the `Task.LongRunning` hint https://msdn.microsoft.com/en-us/library/system.threading.tasks.taskcreationoptions(v=vs.110).aspx – ta.speot.is Oct 14 '17 at 04:33
  • Task,LongRunning hint is not relevant to Parallel.ForEach. Partitioner would work I guess but you are probably one of the two people in the world who knows about it :) I also do not understand the logic behind default partitioning implementation. Too many assumptions for generic method. – Optional Option Oct 14 '17 at 05:12
  • I know it's not but if you're going to use Task.Factory.StartNew with a synchronous method that's long-running you might as well give it the right hint. – ta.speot.is Oct 14 '17 at 05:15
  • Makes sense although if app does not have more threads then it would make no difference I think. – Optional Option Oct 14 '17 at 05:27