371

I have a Parallel.ForEach() async loop that I use to download some webpages. My bandwidth is limited, so I can download only x pages at a time, but Parallel.ForEach works through the whole list of desired webpages at once.

Is there a way to limit the number of threads, or apply some other limiter, while running Parallel.ForEach?

Demo code:

Parallel.ForEach(listOfWebpages, webpage => {
  Download(webpage);
});

The real task has nothing to do with webpages, so creative web crawling solutions won't help.

abatishchev
eugeneK
  • @jKlaus If the list isn't modified e.g. it's just a set of URLs, I can't really see the issue? – Shiv Feb 11 '16 at 04:54
  • @Shiv, given enough time you will... Count your number of executions and compare it to the count of the list. – jKlaus Feb 11 '16 at 13:49
  • @jKlaus What are you saying will go wrong? – Shiv Feb 16 '16 at 01:06
  • @Shiv, execute this a few times.. https://dotnetfiddle.net/maKiI5 – jKlaus Feb 16 '16 at 14:54
  • 1
    @jKlaus you are modifying a non-threadsafe element (the integer). I would expect it to not work in that scenario. The OP on the other hand is not modifying anything that needs to be threadsafe. – Shiv Feb 18 '16 at 05:21
  • @Shiv, Are you positive? I haven't seen the source code for Download(). – jKlaus Feb 18 '16 at 14:24
  • @jKlaus Yes Download() has no reference to listOfWebpages – Shiv Feb 25 '16 at 22:54
  • 2
    @jKlaus Here is an example of Parallel.ForEach that sets the count correctly > https://dotnetfiddle.net/moqP2C. MSDN Link: https://msdn.microsoft.com/en-us/library/dd997393(v=vs.110).aspx – jhamm Apr 13 '16 at 21:06
  • @jKlaus - so... you should delete your comments / this whole chain is misleading... what you initially pointed out is not actually a problem with the above code, since he's passing the single current loop item to the method. There's no sharing of variables between threads/loop-executions. – Don Cheadle May 16 '18 at 21:30
  • `Parallel.ForEach` is not suitable for throttling I/O operations. Look at this question for proper solutions: [How to limit the amount of concurrent async I/O operations?](https://stackoverflow.com/questions/10806951/how-to-limit-the-amount-of-concurrent-async-i-o-operations) – Theodor Zoulias Feb 17 '20 at 11:53
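To illustrate the point in the last comment: for genuinely asynchronous I/O, one common pattern is to gate the work with a `SemaphoreSlim` rather than with `Parallel.ForEach`. The following is only a minimal sketch under stated assumptions. `ForEachThrottledAsync` is a hypothetical helper name, and `Task.Delay` stands in for the real download, which the question does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncThrottle
{
    // Hypothetical helper: runs `body` for every item, but lets at most
    // `maxConcurrent` invocations be in flight at the same time.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int maxConcurrent, Func<T, Task> body)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            // Every item gets a task up front, but each task waits at the gate
            // before it starts its real work.
            List<Task> tasks = items.Select(async item =>
            {
                await gate.WaitAsync();
                try
                {
                    await body(item);
                }
                finally
                {
                    gate.Release(); // free the slot even if body throws
                }
            }).ToList();

            // WhenAll finishes only after every task has completed,
            // so the semaphore is not disposed while still in use.
            await Task.WhenAll(tasks);
        }
    }

    public static async Task Main()
    {
        var pages = new[] { "a", "b", "c", "d", "e" }; // placeholder items

        await ForEachThrottledAsync(pages, 2, async page =>
        {
            await Task.Delay(50); // stand-in for an asynchronous download
            Console.WriteLine($"done: {page}");
        });
    }
}
```

No thread is blocked while an item waits at the gate or on its I/O, which is the main difference from blocking inside a `Parallel.ForEach` body.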

5 Answers

689

You can specify a MaxDegreeOfParallelism in a ParallelOptions parameter:

Parallel.ForEach(
    listOfWebpages,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    webpage => { Download(webpage); }
);

MSDN: Parallel.ForEach

MSDN: ParallelOptions.MaxDegreeOfParallelism
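If you want to see the limit in action, here is a small self-contained sketch. It counts how many loop bodies run at the same time and reports the highest value observed. The names `ThrottleDemo` and `MeasurePeakConcurrency` are made up for illustration, and `Thread.Sleep` is only a stand-in for `Download(webpage)`.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs `itemCount` simulated work items and returns the highest number of
    // loop bodies that were observed executing at the same moment.
    public static int MeasurePeakConcurrency(int itemCount, int maxDegree)
    {
        object gate = new object();
        int running = 0; // loop bodies currently executing
        int peak = 0;    // highest value of `running` seen so far

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(Enumerable.Range(0, itemCount), options, item =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }

            Thread.Sleep(20); // stand-in for Download(webpage)

            lock (gate)
            {
                running--;
            }
        });

        return peak;
    }

    public static void Main()
    {
        int peak = MeasurePeakConcurrency(itemCount: 40, maxDegree: 4);
        Console.WriteLine($"Peak concurrency: {peak}"); // should never exceed 4
    }
}
```

All 40 items are still processed. The option only caps how many of them are in flight at once, so the reported peak stays at or below the configured value.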

Nick Butler
  • 82
    It may not apply to this particular case, but I figured I'd throw it out in case anyone wanders across this and finds it useful. Here I am utilizing 75% (rounded up) of the processor count. `var opts = new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 1.0)) };` – jKlaus Dec 02 '15 at 18:18
  • 7
    Just to save anyone else having to look it up in the documentation, passing a value of `-1` is the same as not specifying it at all: _"If [the value] is -1, there is no limit on the number of concurrently running operations"_ – stuartd Aug 05 '16 at 16:55
  • 1
    It's not clear to me from documentation - does setting MaxDegreeOfParallelism to 4 (for instance) mean there'll be 4 threads each running 1/4th of the loop iterations (one round of 4 threads dispatched), or does each thread still do one loop iteration and we're just limiting how many run in parallel? – Hashman Mar 07 '17 at 19:27
  • 17
    To be clear cores and threads are not the same thing. Depending on the CPU, there are a different number of threads per core, usually 2 per core. For example, if you have a 4 core CPU with 2 threads per core, then you have a max of 8 threads. To adjust @jKlaus comment `var opts = new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 2.0)) };`. Link to threads vs cores - https://askubuntu.com/questions/668538/cores-vs-threads-how-many-threads-should-i-run-on-this-machine – Agrejus Jun 06 '18 at 14:45
60

You can use ParallelOptions and set MaxDegreeOfParallelism to limit the number of concurrent threads:

Parallel.ForEach(
    listOfWebpages,
    new ParallelOptions { MaxDegreeOfParallelism = 2 },
    webpage => { Download(webpage); });
Uwe Keim
rikitikitik
25

Use another overload of Parallel.ForEach that takes a ParallelOptions instance, and set MaxDegreeOfParallelism to limit how many instances execute in parallel.

Richard
17

And for the VB.NET users (the syntax is odd and hard to find):

Parallel.ForEach(listOfWebpages, New ParallelOptions() With {.MaxDegreeOfParallelism = 8}, Sub(webpage)
    ' ...
End Sub)
user3496060
2

I think the more dynamic and realistic approach would be to limit it by the processor count, so on each system it would function properly:

var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
Parallel.ForEach(myList, options, iter => { });

Perhaps you would multiply or divide Environment.ProcessorCount to put more pressure on the CPU or take some off.
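As a rough sketch of that scaling idea, a small helper can turn a fraction of the processor count into a usable limit. `DegreeHelper` and `ScaledDegree` are hypothetical names, and 0.75 is just an example factor. Note that Environment.ProcessorCount already reports logical processors, and that MaxDegreeOfParallelism rejects 0, so the result is clamped to at least 1.

```csharp
using System;
using System.Threading.Tasks;

public static class DegreeHelper
{
    // Returns ceil(logicalProcessors * fraction), but never less than 1,
    // because ParallelOptions.MaxDegreeOfParallelism does not accept 0.
    public static int ScaledDegree(int logicalProcessors, double fraction)
    {
        int degree = (int)Math.Ceiling(logicalProcessors * fraction);
        return Math.Max(1, degree);
    }

    public static void Main()
    {
        var options = new ParallelOptions
        {
            // 0.75 is an arbitrary example factor, not a recommendation.
            MaxDegreeOfParallelism = ScaledDegree(Environment.ProcessorCount, 0.75)
        };

        int[] myList = { 1, 2, 3, 4, 5, 6, 7, 8 }; // placeholder data

        Parallel.ForEach(myList, options, n => Console.WriteLine(n * n));
    }
}
```

This kind of scaling fits CPU-bound loop bodies best. For bandwidth-limited work like the downloads in the question, a fixed small number may be the more sensible limit.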

AliSalehi