
I'm using a parallel for loop in my code to run a long running process on a large number of entities (12,000).

The process parses a string, goes through a number of input files (I've read that, given the amount of IO involved, the benefits of threading can be questionable, but it seems to have sped things up elsewhere) and outputs a matched result.

Initially, the process goes quite quickly - however, it ends up slowing to a crawl. It's possible that it has just hit some particularly tricky input data, but on closer inspection this seems unlikely.

Within the loop, I added some debug code that prints "Started Processing: " and "Finished Processing: " when it begins/ends an iteration, and then wrote a program that pairs each start with a finish - initially in order to find which ID was causing a crash.

However, looking at the number of unmatched IDs, it looks like the program is processing in excess of 400 different entities at once. Given the large amount of IO, this seems like it could be the source of the issue.
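
For reference, a rough sketch of the sort of loop and debug logging I'm describing (the names are simplified placeholders, not my real code) - I've also added an Interlocked counter as one way I could measure how many iterations are actually in flight:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Sketch only: entities and ProcessEntity are placeholders.
int inFlight = 0;

Parallel.For(0, entities.Count, i =>
{
    int id = entities[i].Id;                          // placeholder for however the ID is obtained
    int current = Interlocked.Increment(ref inFlight);
    Console.WriteLine("Started Processing: " + id + " (in flight: " + current + ")");

    ProcessEntity(entities[i]);                       // parse string, read input files, write matched result

    Console.WriteLine("Finished Processing: " + id);
    Interlocked.Decrement(ref inFlight);
});
```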

So my question(s) is(are) this(these):

  • Am I interpreting the unmatched IDs properly, or is there some clever stuff going on behind the scenes that I'm missing, or even something obvious?
  • If you agree that what I've spotted is correct, how can I limit the number of iterations it spins off and runs at once?

I realise this is perhaps a somewhat unorthodox question and may be tricky to answer given there is no code, but any help is appreciated and if there's any more info you'd like, let me know in the comments.

Joshua Mee
  • You are probably right, but how can we know without seeing what you are doing? Where are the contentions between the threads? – Jodrell Nov 27 '12 at 17:09

2 Answers


Without seeing some code, I can guess at the answers to your questions:

  • Unmatched IDs indicate to me that the thread processing that data is being de-prioritized. This could be due to IO or to the thread pool trying to optimize; however, if you are strongly IO-bound, that is most likely your issue.
  • I would take a look at Parallel.For, specifically using ParallelOptions.MaxDegreeOfParallelism to limit the maximum number of concurrent tasks to a reasonable number (see the sketch below). I would suggest trial and error to determine the optimal degree, starting around the number of processor cores you have.
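
For example, something along these lines - entities and ProcessEntity are placeholders for your own code, and 4 is just a starting point to tune:

```csharp
using System.Threading.Tasks;

// Sketch only: cap the number of concurrent iterations.
var options = new ParallelOptions { MaxDegreeOfParallelism = 4 }; // start near your core count and tune

Parallel.For(0, entities.Count, options, i =>
{
    ProcessEntity(entities[i]);
});
```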

Good luck!

Jon Peterson
  • Also, you can take a look at this question: [Limit the number of parallel threads in C#](http://stackoverflow.com/q/8853907/947171) – Jon Peterson Nov 27 '12 at 16:42
  • Thanks, I will take a look. The number of unmatched IDs threw me - I was expecting to see around the number of cores I have. The number there seemed a little extreme, given I thought it was meant to work out the optimal number itself, which I presumed would be, as you said, the number of cores on the PC or thereabouts. – Joshua Mee Nov 27 '12 at 16:48
  • For an IO-bound task, the number of cores shouldn't be a concern when determining the max degree of parallelism; it's simply a question of how many threads it takes to get the hard drive up to 100% throughput. That could be as low as 2 (one processing and another fetching from the disk). It may be a few more, but the core count shouldn't be relevant (much). – Servy Nov 27 '12 at 17:09

Let me start by confirming that it is indeed a very bad idea to read 2 files at the same time from a hard drive (at least until the majority of HDs out there are SSDs), let alone however many your whole process is using. Parallelism is there to optimize processing using a resource that is actually parallelizable, which is CPU power. If your parallelized process reads from a hard drive, you lose most of the benefit.

And even then, even CPU power is not amenable to infinite parallelization. A normal desktop CPU can run up to about 10 threads at the same time (it depends on the model, obviously, but that's the order of magnitude).

So, two things:

  • first, I am going to assume that your entities use all of your files, but that your files are not too big to be loaded into memory. If that's the case, you should read your files into objects (i.e. into memory), then parallelize the processing of your entities using those objects (see the sketch after this list). If not, you're basically relying on your hard drive's cache not to reread your files every time you need them, and your hard drive's cache is far smaller than your memory (1000-fold).

  • second, you shouldn't be running Parallel.For on 12,000 items. Parallel.For will actually (try to) create 12,000 threads, and that is actually worse than 10 threads, because of the big overhead that parallelizing creates and the fact that your CPU will not benefit from it at all, since it cannot run more than 10 threads at a time.
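
To illustrate the first point, here's a rough sketch of loading the input files into memory up front - inputFilePaths and the string dictionary are placeholders, and you would probably parse each file into a richer object instead:

```csharp
using System.Collections.Generic;
using System.IO;

// Sketch only: read every input file once, up front, into memory.
var fileData = new Dictionary<string, string>();
foreach (var path in inputFilePaths)            // hypothetical list of input file paths
{
    fileData[path] = File.ReadAllText(path);    // or parse into a richer object here
}
// The entity processing can then work against fileData instead of hitting the disk.
```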

You should probably use a more efficient method: the IEnumerable<T>.AsParallel() extension (which comes with .NET 4.0). At runtime it will determine the optimal number of threads to run, then divide your enumerable into that many batches. Basically, it does the job for you - but it creates a big overhead too, so it's only useful if the processing of one element is actually costly for the CPU.
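
And a rough sketch of that approach - again, entities, ProcessEntity and the fileData dictionary from the previous sketch are placeholders, and WithDegreeOfParallelism is optional if you want to cap the concurrency yourself:

```csharp
using System.Linq;

// Sketch only: PLINQ partitions the work and picks a sensible number of threads at runtime.
var results = entities
    .AsParallel()
    .WithDegreeOfParallelism(4)                        // optional explicit cap; tune it
    .Select(entity => ProcessEntity(entity, fileData)) // ProcessEntity is a placeholder
    .ToList();
```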

In my experience, using anything parallel should always be evaluated against not using it in real life, i.e. by actually profiling your application. Don't assume it's going to work better.

Evren Kuzucuoglu
  • Indeed, as mentioned in the question, I have taken into consideration that it's often not beneficial when dealing with files, and through testing it _is_ faster in parallel. While it does read files regularly, there is enough processing done on what it receives that it seems to be a benefit. Unfortunately, the files are for the most part too large to load into memory - some totalling several GB (this is also something I don't have control over). I'll look at AsParallel, thanks. Parallel.For trying to create a thread for each item is news to me; I thought it worked out an optimal number. – Joshua Mee Nov 27 '12 at 17:19