
I know there are some existing questions that provide a very good general perspective on things. I'm hoping to get some details on the C#/VB.Net side for the actual implementation (not the philosophy) of some of these perspectives.

My Particular Case

I have a WCF Service which, amongst other things, receives files. For most of the service's life this particular area just sits idle - when work does come, it arrives in high bursts of greatly varying size.

For each file received (and at peak there can be thousands per second), the service needs to work on the file for between 1 and 10 seconds, depending on a number of other services, local resources, and network IO wait times.

To aid the service with these burst workloads I implemented a queue system. Those thousands of files received per second are placed onto the queue. A controller calculates the number of threads to use based on the size of the queue, up until it reaches a "Peak Max Threads" setting which prevents it from creating additional threads. These threads are placed in a thread pool and reused to cycle through the queue. The controller recalculates, at intervals, the number of threads required; if the queue size reduces, a relevant number of threads are released.
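In rough terms the recalculation looks something like this (a simplified sketch, not the production code - the ratio and cap are illustrative values, not my real settings):

    // Simplified sketch of the controller's thread-count calculation;
    // the 1-thread-per-100-files ratio and the cap of 32 are made up.
    using System;

    static class Controller
    {
        const int PeakMaxThreads = 32; // the "Peak Max Threads" setting

        public static int ThreadsFor(int queueSize)
        {
            int wanted = queueSize / 100; // e.g. one worker per 100 queued files
            return Math.Min(Math.Max(1, wanted), PeakMaxThreads);
        }
    }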

The age-old problem

How many threads should I peak at? Clearly, adding a new thread every time a file is received would be silly, for lack of a better word - performance would, at best, deteriorate. But capping the threads while CPU utilisation is only 10% across each core doesn't seem the best use of resources either.

So, is there an appropriate way to determine how many threads to cap at? I would rather the service could determine this for itself by sampling available resources, but is there a performance hit from doing so? I know the common answer is to monitor workloads and adjust the counts through trial and error until I find a number I like, but due to the nature of this service (long periods of idle followed by high burst workloads) it could take a long time to get that kind of information.

And what if we then move the server's image to a different host which is faster/slower/otherwise different to the first? Do I have to re-sample the process all over again?

Ideally what I'm after is for the co-ordinator to intelligently increase the size of the thread pool until CPU utilisation is at x% (would 80% be reasonable? 90%? 99%?). Clearly, I want to do this without adding more threads than are necessary to hit x%, otherwise all I'll end up with is threads not just waiting on IO resources, but waiting on each other too.

Thanks in advance!


Related questions (if you want some generic ideas):

How many threads to create?

How many threads is too many?

How many threads to create and when?


A Complication for you

Where would be the fun if I didn't make the problem more difficult?

As it currently stands, the service regularly does hit 100% CPU during these bursts. The issue is that the CPU utilisation spikes: it goes from idle (0-10%) to 100% and back down again. I'm not sure I can help that - ideally I wouldn't take it all the way to 100%. The problem exists because the files mentioned are in fact images, and part of the service's process is to pass each image through to the System.Windows.Media black box, which does some complex image processing for me.

There are then lulls between the spikes because of the IO waits and other processing that goes on. If the spikes hitting 100% can't be helped (and I'm all for knowing how to prevent that, or whether I should), what should I aim for the CPU utilisation graph to look like? Sitting constantly at 100%? Bouncing between 50-100%? If I do go through the effort of sampling to decide what seems to work best, is it guaranteed that switching the virtual server's host will also work best with the same graph?

I won't hold this added complexity against those of you willing to answer - feel free to ignore this section. However, any answer that also accounts for this complication, or even just provides tips on how to handle it, I'll at the very least upvote!

Heck of a long question - sorry about that - and thanks for reading so much!!

Smudge202
  • CPU utilisation of 80-85% is probably your best point. Good usage, but it retains some resources to process the background tasks (like running the service interface). – Schroedingers Cat Jul 07 '11 at 09:59
  • @Schroedingers - I agree with that. Much higher would just be _selfish_ and like you say, even if it was entirely self-centred, all I'd do is slow down the front end of the service. Thanks for pointing it out. – Smudge202 Jul 07 '11 at 10:02
  • It does of course depend on what else runs there. If it is only your application, then 85% is probably OK. But it is a target, and you will exceed this at times. – Schroedingers Cat Jul 07 '11 at 10:08
  • @Schroedingers Absolutely - I can't prevent the CPU exceeding a given percentage. What action should I take if it's exceeded? Do I need to measure how consistently it's exceeded? – Smudge202 Jul 07 '11 at 10:20
  • Later Windows versions have CPU rate limits on a user account basis. Can you use this to throttle your app? If so, you could just use a large, fixed thread pool and let the OS handle the limiting. – Martin James Jul 07 '11 at 10:20
  • That sounds very interesting. Let me go try find some links on the subject - Thanks @Martin – Smudge202 Jul 07 '11 at 10:24
  • @Martin - the closest I can find to limiting a thread's CPU time is [this question](http://stackoverflow.com/questions/482592/programmatically-limit-cpu-usage-of-a-thread-running-inside-a-service) which is helpful, but from reading your comment, not the same thing. I don't suppose you have better/correct links? (please) – Smudge202 Jul 07 '11 at 10:27
  • @Smudge Hopefully you can use @Marino's answer below. If you need to do it manually, I would say that you only start new files when the current level is lower than the threshold - given your spiking, probably at a lower level like 80%. You will still hit 100% at times, but it should provide good overall utilisation. – Schroedingers Cat Jul 07 '11 at 10:28
  • Maybe the problem is in how my service's process is laid out... The issue I have is that when a _Task_ starts, it needs to move the file into memory, make some database/service calls, then pass the in-memory image's reference to System.Windows.Media, which inevitably causes the spike. By waiting for CPU to drop below 80% before starting another task, the CPU will probably drop much lower - even to idle - because the task starts and then sits and waits on IO. Perhaps a refactor to move the IO waits into one queue, and a second queue for the 100% CPU work (see the sketch after these comments)? – Smudge202 Jul 07 '11 at 10:52
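A hypothetical sketch of the two-queue split floated in the last comment above - BlockingCollection is one .NET 4 way to express it, and the stage counts and helper names here are made up:

    // Stage 1 does the IO waits; stage 2 does the CPU-heavy image work.
    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    static class TwoStagePipeline
    {
        static readonly BlockingCollection<string> IoQueue = new BlockingCollection<string>();
        static readonly BlockingCollection<byte[]> CpuQueue = new BlockingCollection<byte[]>();

        public static void Start()
        {
            // many IO workers: they mostly sleep on network/disk, so they're cheap
            for (int i = 0; i < 16; i++)
                Task.Factory.StartNew(IoStage, TaskCreationOptions.LongRunning);

            // few CPU workers: roughly one per core for the 100%-CPU image step
            for (int i = 0; i < Environment.ProcessorCount; i++)
                Task.Factory.StartNew(CpuStage, TaskCreationOptions.LongRunning);
        }

        static void IoStage()
        {
            foreach (var path in IoQueue.GetConsumingEnumerable())
                CpuQueue.Add(LoadAndCallServices(path)); // file read + DB/service calls
        }

        static void CpuStage()
        {
            foreach (var image in CpuQueue.GetConsumingEnumerable())
                ProcessImage(image); // the System.Windows.Media step
        }

        static byte[] LoadAndCallServices(string path) { return new byte[0]; /* placeholder */ }
        static void ProcessImage(byte[] image) { /* placeholder */ }
    }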

3 Answers


PerformanceCounter allows you to query for processor usage.
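For example, a minimal sketch (the category and counter names here are the standard Windows ones):

    // Query total CPU usage via the "Processor" performance counter.
    using System;
    using System.Diagnostics;
    using System.Threading;

    class CpuSampler
    {
        static void Main()
        {
            var cpu = new PerformanceCounter("Processor", "% Processor Time", "_Total");
            cpu.NextValue();     // the first reading is always 0; prime the counter
            Thread.Sleep(1000);  // counters need an interval between samples
            Console.WriteLine("CPU: {0:F1}%", cpu.NextValue());
        }
    }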

However, have you tried something the framework provides?

        foreach (var file in files)
        {
            // capture the loop variable so each closure sees its own file
            var workitem = file;
            Task.Factory.StartNew(() =>
            {
                // do work on workitem
            },
            // LongRunning hints at a dedicated thread rather than a pool thread;
            // PreferFairness asks the scheduler to run tasks roughly in order
            TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);
        }

You can tune the concurrency level for Tasks in the Task.Factory.

The .NET 4 thread pool by default schedules the number of threads it finds performs best on the hardware it runs on, but you can change how that works via the previous link.
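If you don't want to write a full scheduler, one framework-provided way to cap concurrency is Parallel.ForEach with MaxDegreeOfParallelism - a sketch, where the cap value is just a guess:

    // Cap parallelism instead of spawning one task per file; leaving one
    // core free for the host is an assumption, not a measured value.
    using System;
    using System.Threading.Tasks;

    static class BurstProcessor
    {
        public static void Process(string[] files)
        {
            var options = new ParallelOptions
            {
                MaxDegreeOfParallelism = Math.Max(1, Environment.ProcessorCount - 1)
            };
            Parallel.ForEach(files, options, file => ProcessFile(file));
        }

        static void ProcessFile(string file) { /* image work goes here */ }
    }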

You probably need a custom solution, but it would be worth benchmarking yours against the standard one.

Edit (comment note):

No links needed; I may have used an invented term, since English is not my language. What I mean is this: each time you 'check', take the change in the reading since the previous check and call it delta. Keep a variable averageDelta: add the new delta to it and divide by 2 on every check. This average will stay low most of the time, since you mostly have no activity, and will climb slowly when bursts come. Then keep a second, 'temporal' delta: the average of the deltas over only a small timespan (you will have to come up with an algorithm to calculate this temporal variance accurately). Once you have both, compare them. When a burst arrives, the average delta rises slowly while the temporal delta shoots up fast; when the burst stops, the average delta falls slowly while the temporal one drops fast.
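As a rough sketch of that idea (the names and the 80/20 weighting are mine, purely illustrative):

    // Compare a slow-moving and a fast-moving average of CPU deltas to
    // detect when a burst starts and when it fades.
    class BurstDetector
    {
        double _prev, _averageDelta, _temporalDelta;

        public void Check(double cpuNow)
        {
            double delta = cpuNow - _prev;
            _prev = cpuNow;

            // slower-moving average, as described above: (avg + delta) / 2
            _averageDelta = (_averageDelta + delta) / 2;

            // faster-moving 'temporal' average: heavily weights recent samples
            _temporalDelta = (_temporalDelta * 0.2) + (delta * 0.8);

            if (_temporalDelta > _averageDelta)
            {
                // burst starting or growing: ramp the thread count up
            }
            else
            {
                // burst fading: ramp the thread count back down
            }
        }
    }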

Marino Šimić
  • +1 I had a feeling the framework would have something in built to handle at least some of this, so thank you for showing me where it is! Would you say the Task factory alone could handle my particular case, or would you only use it for benchmarking? Is there a way of measuring the relevant details of the thread pool the factory is using to work out how to tune my custom implementation? – Smudge202 Jul 07 '11 at 10:23
  • It could handle it if you follow the second link and build your own scheduler. The rest is just spawning tasks in loops and indicating whether they are long-running and whether you prefer fairness (order of execution). – Marino Šimić Jul 07 '11 at 10:25
  • @Cicada yes but it will use all the CPU it can; the requirement is to try to use up to 80% (on average during bursts). – Marino Šimić Jul 07 '11 at 10:43
  • @smudge : I see your addendum to the question... CPU spikes will always happen, since CPU utilization is a timed average value. You must calculate how long these bursts are and, during these bursts, try to have an average CPU utilization of 80%. – Marino Šimić Jul 07 '11 at 10:46
  • There is a downside to this - at the moment the controller is working from a queue, which the front-end adds to. If I use queue.ToList().ForEach(x => ...) or an equivalent, it won't take into account newly added items in the queue. Is there a best way to handle this? – Smudge202 Jul 07 '11 at 10:46
  • @Marino, thanks for the comment regarding the added spikes complication. I wonder though, if I work on averages, if a spike (which I currently can't control) sits at 100% for 10 seconds, wouldn't I then end up returning to idle until such an interval has passed that the average has been lowered to 80% again? Or have I envisioned these averages wrong? – Smudge202 Jul 07 '11 at 10:48
  • @Cicada - that is exactly what @Marino has suggested - follow the "Factory" link – Smudge202 Jul 07 '11 at 10:49
  • You should calculate the temporal delta, and when you see a fast decrease, adjust faster... – Marino Šimić Jul 07 '11 at 13:12
  • @Marino - I can't see anything that looks relevant under the search term "Temporal delta". Would you happen to have a link please? – Smudge202 Jul 07 '11 at 14:27
  • I'm adding a 'comment note' in my answer, since the explanation can't fit in a comment. – Marino Šimić Jul 07 '11 at 18:21

You could use I/O Completion Ports to asynchronously fetch your images without tying up any threads until it comes time to process what you have fetched.

You could then limit your thread pool based on the number of cores on your client PC, making sure to leave a core free for other processes to use.
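A rough .NET 4 sketch of both ideas - opening the file with useAsync: true makes FileStream use I/O completion ports under the hood, so no thread blocks while the bytes arrive; ReadFileAsync and MaxCpuWorkers are names made up for illustration:

    using System;
    using System.IO;
    using System.Threading.Tasks;

    static class AsyncFetch
    {
        // leave a core free for other processes, as suggested above
        public static readonly int MaxCpuWorkers =
            Math.Max(1, Environment.ProcessorCount - 1);

        public static Task<byte[]> ReadFileAsync(string path)
        {
            var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, useAsync: true);
            var buffer = new byte[(int)fs.Length];

            // wrap the Begin/End pair in a Task; no thread is held during the read
            return Task<int>.Factory
                .FromAsync(fs.BeginRead, fs.EndRead, buffer, 0, buffer.Length, null)
                .ContinueWith(t => { fs.Close(); return buffer; });
        }
    }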

mbeckish

What about a dynamic thread manager that monitors the threads' overall performance and, based on that, spawns new threads or kills old ones? The main problem here is only how to define the performance measurement function. The rest can be done with a periodically scheduled job that increases or decreases the number of threads according to the previous thread count and the measured performance, perhaps also in connection with resource utilisation (CPU, disks, network...).
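A speculative sketch of such a manager - the thresholds, the interval, and the helper names are all placeholders, not a known design:

    using System;
    using System.Diagnostics;
    using System.Threading;

    class ThreadManager
    {
        readonly PerformanceCounter _cpu =
            new PerformanceCounter("Processor", "% Processor Time", "_Total");
        int _workers;
        Timer _timer;

        public void Start()
        {
            _cpu.NextValue(); // prime the counter; the first reading is 0
            _timer = new Timer(_ => Adjust(), null, 1000, 1000); // check every second
        }

        void Adjust()
        {
            float usage = _cpu.NextValue();
            if (usage < 70 && QueueHasWork())
                SpawnWorker();    // headroom left: add a thread
            else if (usage > 90 && _workers > 1)
                RetireWorker();   // saturated: shed a thread
        }

        bool QueueHasWork() { return true; /* check the file queue here */ }
        void SpawnWorker() { _workers++; /* start a pooled worker loop */ }
        void RetireWorker() { _workers--; /* signal a worker to exit */ }
    }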

gw0
  • The real challenge, as I see it, is how to assess the performance hit. What can be used to get the current CPU usage, which can then be used to monitor it and keep it at 80%? – Schroedingers Cat Jul 07 '11 at 10:07
  • Exactly as @Schroedingers has said - in principle I agree that sounds like a good idea @gw0 (though I'm about to add a slight complication to the question, so wait out). The problem is: how? (I think that's what prevents this being a duplicate of the questions I've linked.) Has someone done this, and are they willing to share snippets and knowledge? =) – Smudge202 Jul 07 '11 at 10:10