
I am using Parallel.ForEach to perform some database updates. Without setting MaxDegreeOfParallelism, a dual-core machine results in SQL client timeouts, whereas a quad-core machine somehow does not time out.

Now, I have no control over what kind of processor cores are available where my code runs, but is there some setting I can change via MaxDegreeOfParallelism that will run fewer operations simultaneously and avoid the timeouts?

I can increase the timeouts, but that isn't a good solution; if a lower-end CPU can process fewer operations simultaneously, that will put less load on it.

OK, I have read all the other posts and MSDN too, but will setting MaxDegreeOfParallelism to a lower value make my quad-core machines suffer?

For example, is there any way to do something like: if the CPU has two cores, use 20; if the CPU has four cores, use 40?

Theodor Zoulias
Akash Kava

6 Answers


The answer is that MaxDegreeOfParallelism is the upper limit for the entire parallel operation, irrespective of the number of cores.

So even if you don't use the CPU because you are waiting on IO or a lock, no extra tasks will run in parallel beyond the maximum that you specify.

To find this out, I wrote this piece of test code. There is an artificial lock in there to stimulate the TPL into using more threads. The same happens when your code is waiting for IO or a database.

using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        var locker = new object();
        int count = 0;
        Parallel.For(0, 1000,
            new ParallelOptions { MaxDegreeOfParallelism = 2 },
            i =>
            {
                Interlocked.Increment(ref count);
                lock (locker)
                {
                    Console.WriteLine("Number of active threads:" + count);
                    Thread.Sleep(10);
                }
                Interlocked.Decrement(ref count);
            });
    }
}

If I don't specify MaxDegreeOfParallelism, the console logging shows that up to around 8 tasks are running at the same time. Like this:

Number of active threads:6
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:6
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7
Number of active threads:7

It starts lower, increases over time, and at the end it is trying to run 8 at the same time.

If I limit it to some arbitrary value (say 2), I get

Number of active threads:2
Number of active threads:1
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2
Number of active threads:2

Oh, and this is on a quad-core machine.

  • My logic does not have any wait or any IO; it just updates SQL. Yes, SQL might have its own waits, but mostly I am waiting for SQL to finish. What is the default max number of active threads used? – Akash Kava Mar 02 '12 at 19:49
  • The default is 2 per core, but the TPL can raise this if your code is not using the CPU. Most databases involve some amount of IO. –  Mar 02 '12 at 20:01
  • If my 6-core machine is heavily loaded, it only uses 1 or 2 threads. If it's lightly loaded, it goes up to 12. It's intelligent enough to take the existing system load into account. – Contango Aug 08 '13 at 14:05
  • TPL should be used only when there is no IO involved or you are doing CPU intensive work – Mandeep Janjua May 17 '18 at 19:18

For example, is there anyway to do something like, if CPU has two cores, then use 20, if CPU has four cores then 40?

You can do this to make parallelism dependent on the number of CPU cores:

var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount * 10 };
Parallel.ForEach(sourceCollection, options, sourceItem =>
{
    // do something
});

However, newer CPUs tend to use hyper-threading to simulate extra cores. So if you have a quad-core processor, then Environment.ProcessorCount will probably report 8 cores. I've found that if you set the parallelism to account for the simulated cores, it actually slows down other threads, such as UI threads.

So although the operation will finish a bit faster, an application UI may experience significant lag during this time. Dividing Environment.ProcessorCount by 2 seems to achieve the same processing speed while still keeping the CPU available for UI threads.
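As a minimal sketch of that heuristic (the helper name and the divisor are my own choices, not an established API):

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    // Hypothetical helper: use half of the logical processors reported by
    // Environment.ProcessorCount, but never go below 1. On a quad-core CPU
    // with hyper-threading (8 logical processors) this yields 4.
    static int UiFriendlyDop() => Math.Max(1, Environment.ProcessorCount / 2);

    static void Main()
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = UiFriendlyDop() };
        Parallel.For(0, 100, options, i =>
        {
            // CPU-bound work would go here
        });
        Console.WriteLine("Used MaxDegreeOfParallelism = " + options.MaxDegreeOfParallelism);
    }
}
```

The divisor is a tuning knob, not a rule; as the comment below notes, some machines need an even lower value to keep the UI responsive.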

bugged87
  • On my 3950X, even dividing by 2 isn't enough to keep UI threads active. Set it to anything more than 6 and the UI doesn't respond until the whole thing is done, which is annoying. Even 6 can be a tiny bit laggy. If only we could set a thread priority like we can with background threads. – Chuck Savage Jan 19 '22 at 14:11

The Parallel.ForEach method internally starts a number of Tasks, and each of these tasks repeatedly takes an item from the source sequence and invokes the body delegate for that item. The MaxDegreeOfParallelism can set an upper limit on these internal tasks. But this setting is not the only factor that limits the parallelism. There is also the willingness of the TaskScheduler to execute the tasks that are spawned by the Parallel.ForEach.

The spawning mechanism works by each spawned task replicating itself. In other words, the first thing that each task does is to create another task. Most TaskSchedulers have a limit on how many tasks can execute concurrently, and when this limit is reached they queue the next incoming tasks without executing them immediately. So eventually the self-replicating pattern of the Parallel.ForEach will stop spawning more tasks, because the last task spawned will be sitting idle in the TaskScheduler's queue.

Let's talk about the TaskScheduler.Default, which is the default scheduler of the Parallel.ForEach, and schedules the tasks on the ThreadPool. The ThreadPool has a soft and a hard limit. The soft limit is when the demand for work is not satisfied immediately, and the hard limit is when the demand for work is never satisfied until an already running workitem completes. When the ThreadPool reaches the soft limit, which is Environment.ProcessorCount by default, it spawns more threads to satisfy the demand at a frequency of one new thread per second¹. The soft limit can be configured with the ThreadPool.SetMinThreads method. The hard limit can be found with the ThreadPool.GetMaxThreads method, and is 32,767 threads on my machine.
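Both limits can be inspected (and the soft limit adjusted) at runtime; a minimal sketch:

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // Soft limit: up to this many worker threads are created on demand;
        // beyond it, thread injection is throttled (roughly one per second,
        // per the experimental observation above).
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);

        // Hard limit: once reached, work items queue indefinitely until a
        // running work item completes.
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);

        Console.WriteLine($"Soft limit (min worker threads): {minWorker}");
        Console.WriteLine($"Hard limit (max worker threads): {maxWorker}");

        // Raising the soft limit avoids the slow ramp-up described above
        // (the factor of 4 here is an arbitrary example, not a recommendation):
        ThreadPool.SetMinThreads(Environment.ProcessorCount * 4, minIo);
    }
}
```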

So if I configure the Parallel.ForEach on my 4-core machine with MaxDegreeOfParallelism = 20, and the body delegate keeps the current thread busy for more than one second, the effective degree of parallelism will start at 5, then it will gradually increase over the next 15 seconds until it reaches 20, and it will stay at 20 until the completion of the loop. The reason that it starts at 5 instead of 4 is that the Parallel.ForEach also uses the current thread, along with the ThreadPool.

If I don't configure the MaxDegreeOfParallelism, it will be the same as configuring it with the value -1, which means unlimited parallelism. In this case the ThreadPool availability will be the only limiting factor for the actual degree of parallelism. As long as the Parallel.ForEach runs, the ThreadPool will be saturated; in other words, it will be in a situation where the supply is constantly surpassed by the demand. Each time a new thread is spawned by the ThreadPool, this thread will pick up the last task scheduled by the Parallel.ForEach, which will immediately replicate itself, and the replica will enter the ThreadPool's queue. Provided that the Parallel.ForEach runs for a sufficiently long time, the ThreadPool will reach its maximum size (32,767 on my machine), and will stay at this level until the completion of the loop. This is assuming that the process will not have crashed already because of a lack of other resources, like memory.

The official documentation for the MaxDegreeOfParallelism property states that "generally, you do not need to modify this setting". Apparently it has been this way since the introduction of the TPL with .NET Framework 4.0 (2010). At this point you may have started questioning the validity of this advice. So did I, and I posted a question on the dotnet/runtime repository, asking if the given advice is still valid or outdated. I was surprised to receive the feedback that the advice is as valid as ever. Microsoft's argument is that limiting the MaxDegreeOfParallelism to the value Environment.ProcessorCount may cause performance regressions, or even deadlocks in some scenarios. I responded with a couple of examples demonstrating the problematic behavior that might emerge when an unconfigured Parallel.ForEach runs in an async-enabled application, where other things are happening concurrently with the parallel loop. The demos were dismissed as unrepresentative, because I used the Thread.Sleep method for simulating the work inside the loop.

My personal suggestion is: whenever you use any of the Parallel methods, invariably specify the MaxDegreeOfParallelism, even if you are satisfied with the default. In case you buy my arguments about the undesirability of saturating the ThreadPool, you can configure it with a suitable value like Environment.ProcessorCount. In case you buy Microsoft's arguments about performance and deadlocks, you can configure it with -1. In either case, anyone who sees your code will know that you made a conscious and informed decision.
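Spelled out, the two conscious choices look like this (a sketch; the source collection and the work inside the loop are placeholders):

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        int[] sourceCollection = { 1, 2, 3, 4, 5 };

        // Choice 1: cap the loop at the processor count, so that this loop
        // alone cannot saturate the ThreadPool.
        var capped = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };
        Parallel.ForEach(sourceCollection, capped, item =>
        {
            Console.WriteLine($"capped: {item}");
        });

        // Choice 2: explicitly state the default, -1 (unlimited), signaling
        // that unbounded parallelism is intentional rather than an oversight.
        var unlimited = new ParallelOptions { MaxDegreeOfParallelism = -1 };
        Parallel.ForEach(sourceCollection, unlimited, item =>
        {
            Console.WriteLine($"unlimited: {item}");
        });
    }
}
```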

¹ The injection rate of the ThreadPool is not documented. The "one new thread per second" is an experimental observation.

Theodor Zoulias

It sounds like the code that you're running in parallel is deadlocking, which means that unless you can find and fix the issue that's causing that, you shouldn't parallelize it at all.

jimrandomh
  • -1, The question is not about whether to parallelize or not; it's simply that SQL does its own calculations, but too many parallel requests make the client time out. I want to run fewer operations. Deadlock isn't the problem, as a quad-core machine with the same logic and the same SQL runs fine. I don't want to keep increasing the timeout. – Akash Kava Mar 02 '12 at 18:57
  • Have you tried increasing the timeout and confirmed that it works? Concurrency issues can be extremely subtle, and many things can cause them to disappear and reappear seemingly at random. The fact that it worked on a different machine with more cores does not mean it isn't broken, or that more cores were the thing that helped. – jimrandomh Mar 03 '12 at 23:36
  • Increasing the timeout does help. But somehow CPU usage is more than 50% on small machines, and on big machines it's less than 5%. Now I am at a point where I need to find the performance issue and whether there is something I can do to change the code, or whether I just need to upgrade the CPU. – Akash Kava Mar 04 '12 at 08:53

Something else to consider, especially for those finding this many years later, is that, depending on your situation, it's usually best to collect all the data in a DataTable and then use SqlBulkCopy toward the end of each major task.

For example, I have a process that runs through millions of files, and I ran into the same errors when each file transaction made a DB query to insert the record. I moved instead to storing it all in an in-memory DataTable for each share I iterated through, dumping the DataTable into SQL Server and clearing it between shares. The bulk insert takes a split second and has the benefit of not opening thousands of connections at once.

EDIT: Here's a quick & dirty working example. The SqlBulkCopy method:

private static void updateDatabase(DataTable targetTable)
    {
        try
        {
            DataSet ds = new DataSet("FileFolderAttribute");
            ds.Tables.Add(targetTable);
            writeToLog(targetTable.TableName + " - Rows: " + targetTable.Rows.Count, logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
            writeToLog(@"Opening SQL connection", logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
            Console.WriteLine(@"Opening SQL connection");
            SqlConnection sqlConnection = new SqlConnection(sqlConnectionString);
            sqlConnection.Open();
            SqlBulkCopy bulkCopy = new SqlBulkCopy(sqlConnection, SqlBulkCopyOptions.TableLock | SqlBulkCopyOptions.FireTriggers | SqlBulkCopyOptions.UseInternalTransaction, null);
            bulkCopy.DestinationTableName = "FileFolderAttribute";
            writeToLog(@"Copying data to SQL Server table", logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
            Console.WriteLine(@"Copying data to SQL Server table");
            foreach (var table in ds.Tables)
            {
                writeToLog(table.ToString(), logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                Console.WriteLine(table.ToString());
            }
            bulkCopy.WriteToServer(ds.Tables[0]);

            sqlConnection.Close();
            sqlConnection.Dispose();
            writeToLog(@"Closing SQL connection", logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
            writeToLog(@"Clearing local DataTable...", logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
            Console.WriteLine(@"Closing SQL connection");
            Console.WriteLine(@"Clearing local DataTable...");
            targetTable.Clear();
            ds.Tables.Remove(targetTable);
            ds.Clear();
            ds.Dispose();
        }
        catch (Exception error)
        {
            errorLogging(error, getCurrentMethod(), logDatabaseFile);
        }
    }

...and for dumping it into the DataTable:

private static void writeToDataTable(string ServerHostname, string RootDirectory, string RecordType, string Path, string PathDirectory, string PathFileName, string PathFileExtension, decimal SizeBytes, decimal SizeMB, DateTime DateCreated, DateTime DateModified, DateTime DateLastAccessed, string Owner, int PathLength, DateTime RecordWriteDateTime)
    {
        try
        {
            if (tableToggle)
            {
                DataRow toInsert = results_1.NewRow();
                toInsert[0] = ServerHostname;
                toInsert[1] = RootDirectory;
                toInsert[2] = RecordType;
                toInsert[3] = Path;
                toInsert[4] = PathDirectory;
                toInsert[5] = PathFileName;
                toInsert[6] = PathFileExtension;
                toInsert[7] = SizeBytes;
                toInsert[8] = SizeMB;
                toInsert[9] = DateCreated;
                toInsert[10] = DateModified;
                toInsert[11] = DateLastAccessed;
                toInsert[12] = Owner;
                toInsert[13] = PathLength;
                toInsert[14] = RecordWriteDateTime;

                results_1.Rows.Add(toInsert);
            }
            else
            {
                DataRow toInsert = results_2.NewRow();
                toInsert[0] = ServerHostname;
                toInsert[1] = RootDirectory;
                toInsert[2] = RecordType;
                toInsert[3] = Path;
                toInsert[4] = PathDirectory;
                toInsert[5] = PathFileName;
                toInsert[6] = PathFileExtension;
                toInsert[7] = SizeBytes;
                toInsert[8] = SizeMB;
                toInsert[9] = DateCreated;
                toInsert[10] = DateModified;
                toInsert[11] = DateLastAccessed;
                toInsert[12] = Owner;
                toInsert[13] = PathLength;
                toInsert[14] = RecordWriteDateTime;

                results_2.Rows.Add(toInsert);
            }


        }
        catch (Exception error)
        {
            errorLogging(error, getCurrentMethod(), logFile);
        }
    }

...and here's the context, the looping piece itself:

private static void processTargetDirectory(DirectoryInfo rootDirectory, string targetPathRoot)
    {
        DateTime StartTime = DateTime.Now;
        int directoryCount = 0;
        int fileCount = 0;
        try
        {                
            manageDataTables();

            Console.WriteLine(rootDirectory.FullName);
            writeToLog(@"Working in Directory: " + rootDirectory.FullName, logFile, getLineNumber(), getCurrentMethod(), true);

            applicationsDirectoryCount++;

            // REPORT DIRECTORY INFO //
            string directoryOwner = "";
            try
            {
                directoryOwner = File.GetAccessControl(rootDirectory.FullName).GetOwner(typeof(System.Security.Principal.NTAccount)).ToString();
            }
            catch (Exception error)
            {
                //writeToLog("\t" + rootDirectory.FullName, logExceptionsFile, getLineNumber(), getCurrentMethod(), true);
                writeToLog("[" + error.Message + "] - " + rootDirectory.FullName, logExceptionsFile, getLineNumber(), getCurrentMethod(), true);
                errorLogging(error, getCurrentMethod(), logFile);
                directoryOwner = "SeparatedUser";
            }

            writeToRawLog(serverHostname + "," + targetPathRoot + "," + "Directory" + "," + rootDirectory.Name + "," + rootDirectory.Extension + "," + 0 + "," + 0 + "," + rootDirectory.CreationTime + "," + rootDirectory.LastWriteTime + "," + rootDirectory.LastAccessTime + "," + directoryOwner + "," + rootDirectory.FullName.Length + "," + DateTime.Now + "," + rootDirectory.FullName + "," + "", logResultsFile, true, logFile);
            //writeToDBLog(serverHostname, targetPathRoot, "Directory", rootDirectory.FullName, "", rootDirectory.Name, rootDirectory.Extension, 0, 0, rootDirectory.CreationTime, rootDirectory.LastWriteTime, rootDirectory.LastAccessTime, directoryOwner, rootDirectory.FullName.Length, DateTime.Now);
            writeToDataTable(serverHostname, targetPathRoot, "Directory", rootDirectory.FullName, "", rootDirectory.Name, rootDirectory.Extension, 0, 0, rootDirectory.CreationTime, rootDirectory.LastWriteTime, rootDirectory.LastAccessTime, directoryOwner, rootDirectory.FullName.Length, DateTime.Now);

            if (rootDirectory.GetDirectories().Length > 0)
            {
                Parallel.ForEach(rootDirectory.GetDirectories(), new ParallelOptions { MaxDegreeOfParallelism = directoryDegreeOfParallelism }, dir =>
                {
                    directoryCount++;
                    Interlocked.Increment(ref threadCount);
                    processTargetDirectory(dir, targetPathRoot);
                });

            }

            // REPORT FILE INFO //
            Parallel.ForEach(rootDirectory.GetFiles(), new ParallelOptions { MaxDegreeOfParallelism = fileDegreeOfParallelism }, file =>
            {
                applicationsFileCount++;
                fileCount++;
                Interlocked.Increment(ref threadCount);
                processTargetFile(file, targetPathRoot);
            });

        }
        catch (Exception error)
        {
            writeToLog(error.Message, logExceptionsFile, getLineNumber(), getCurrentMethod(), true);
            errorLogging(error, getCurrentMethod(), logFile);
        }
        finally
        {
            Interlocked.Decrement(ref threadCount);
        }

        DateTime EndTime = DateTime.Now;
        writeToLog(@"Run time for " + rootDirectory.FullName + @" is: " + (EndTime - StartTime).ToString() + @" | File Count: " + fileCount + @", Directory Count: " + directoryCount, logTimingFile, getLineNumber(), getCurrentMethod(), true);
    }

Like noted above, this is quick & dirty, but works very well.

For memory-related issues that I ran into once I reached around 2,000,000 records, I had to create a second DataTable and alternate between the two, dumping the records to SQL Server between alternations. So my SQL connections amount to one per 100,000 records.

I managed that like this:

private static void manageDataTables()
    {
        try
        {
            Console.WriteLine(@"[Checking datatable size] toggleValue: " + tableToggle + " | " + @"r1: " + results_1.Rows.Count + " - " + @"r2: " + results_2.Rows.Count);
            if (tableToggle)
            {
                int rowCount = 0;
                if (results_1.Rows.Count > datatableRecordCountThreshhold)
                {
                    tableToggle ^= true;
                    writeToLog(@"results_1 row count > 100000 @ " + results_1.Rows.Count, logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                    rowCount = results_1.Rows.Count;
                    logResultsFile = "FileServerReport_Results_" + DateTime.Now.ToString("yyyyMMdd-HHmmss") + ".txt";
                    Thread.Sleep(5000);
                    if (results_1.Rows.Count != rowCount)
                    {
                        writeToLog(@"results_1 row count increased, @ " + results_1.Rows.Count, logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                        rowCount = results_1.Rows.Count;
                        Thread.Sleep(15000);
                    }
                    writeToLog(@"results_1 row count stopped increasing, updating database...", logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                    updateDatabase(results_1);
                    results_1.Clear();
                    writeToLog(@"results_1 cleared, count: " + results_1.Rows.Count, logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                }

            }
            else
            {
                int rowCount = 0;
                if (results_2.Rows.Count > datatableRecordCountThreshhold)
                {
                    tableToggle ^= true;
                    writeToLog(@"results_2 row count > 100000 @ " + results_2.Rows.Count, logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                    rowCount = results_2.Rows.Count;
                    logResultsFile = "FileServerReport_Results_" + DateTime.Now.ToString("yyyyMMdd-HHmmss") + ".txt";
                    Thread.Sleep(5000);
                    if (results_2.Rows.Count != rowCount)
                    {
                        writeToLog(@"results_2 row count increased, @ " + results_2.Rows.Count, logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                        rowCount = results_2.Rows.Count;
                        Thread.Sleep(15000);
                    }
                    writeToLog(@"results_2 row count stopped increasing, updating database...", logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                    updateDatabase(results_2);
                    results_2.Clear();
                    writeToLog(@"results_2 cleared, count: " + results_2.Rows.Count, logDatabaseFile, getLineNumber(), getCurrentMethod(), true);
                }
            }
        }
        catch (Exception error)
        {
            errorLogging(error, getCurrentMethod(), logDatabaseFile);
        }
    }

Where "datatableRecordCountThreshhold = 100000"

grep65535
  • "Unnamed": I added some working code examples. The code may not be the best form (style, best-practice-wise), but I was learning how to do these things on the fly and it all works. I'm compiling against 4.6.2 if it matters. – grep65535 Feb 11 '19 at 22:25

It sets the number of threads to run in parallel...

Desolator
  • But does it take cores into consideration? – Akash Kava Mar 02 '12 at 18:57
  • Basically, what kind of DB are you using? – Desolator Mar 02 '12 at 19:08
  • Same OS, same program, same data (replicators basically), but one is a high-end machine with dual quad-core CPUs, and two are simple dual-core machines; the same program fetches data from other servers and stores data back to SQL (lots of blobs and images). – Akash Kava Mar 02 '12 at 19:18
  • How many parallel threads are you trying to use? And what is the DB platform you are using? – Desolator Mar 02 '12 at 19:27
  • Well, I have a list of 100 queries to be executed in parallel, so I did Parallel.ForEach(queryList, execute)... I think I will try ParallelOptions.MaxDegreeOfParallelism = Environment.ProcessorCount * 20, or something like that. – Akash Kava Mar 02 '12 at 19:43
  • If you are using a SQLite DB, it will throw you an exception. Other than that, I think you should be fine even with 1000 threads... – Desolator Mar 02 '12 at 19:52
  • I think this will do OK... but don't use too many threads; 100 should be enough... – Desolator Mar 02 '12 at 19:57
  • It definitely does not set the number of threads to run in parallel. In practice, it may specify the maximum number of threads to run in parallel, but that is not its contract. What it actually limits is the number of concurrently running *operations*, which may or may not equate to threads, but that is an abstracted implementation detail. – hemp Jan 03 '13 at 20:42