2

I was attempting to use worker threads to speed up a larger algorithm when I noticed that using independent priority queue's on more threads actually slowed performance down. So I wrote a small test case.

In which I query how many threads to start, set each thread to its own processor, and the push and pop a lot of stuff from my priority queues. Each thread owns it's own priority queue, and they're allocated separately so I don't suspect false sharing.

I put the test case here, because it's longer than a snippet. (The processor affinity bit comes from NCrunch)

The priority queue is of my own creation because .NET didn't have a built in queue. It uses a Pairing Heap if that makes any difference.

At any rate if I run the program with one thread and one core, it gets about 100% usage. One core Usage drops with two threads / two cores Two cores And eventually pittles down to 30% usage with all 8 cores. eight cores

Which is a problem because the drop in performance nullifies any would be gain from multithreading. What's causing the drop in performance? Each queue is completely independent of the other thread's

Tocs
  • 752
  • 4
  • 17
  • If you're running off a CPU that has few physical cores (and hyperthreading enabled), that may all be normal. See http://superuser.com/questions/133082/hyper-threading-and-dual-core-whats-the-difference and http://superuser.com/questions/420329/single-threaded-program-takes-too-low-cpu – Simon Mourier Feb 19 '13 at 16:53
  • Hmm I'll have to try disabling hyperthreading. But, if I do something like http://stackoverflow.com/questions/39395/how-do-i-calculate-pi-in-c Calculate Pi, I get 100% across the board. – Tocs Feb 19 '13 at 17:02
  • @Tocs Please post your results - I've been having a similar issue and didn't realize hyper-threading may be to blame. – AngryHacker Feb 19 '13 at 17:44
  • The call to `Console.Write` is a synchronization point. Does removing it affect performance at all? – Jim Mischel Feb 19 '13 at 17:51
  • @JimMischel Not noticeably, it's overshadowed by the priorityqueue being slow. – Tocs Feb 19 '13 at 18:14
  • Sorry @AngryHacker the issue was memory, and since this is my work computer I don't really want to fiddle around with turning off hyper-threading since the issue was memory. – Tocs Feb 22 '13 at 14:28

2 Answers2

2

Some problems like solving pi are more suited to parallelization and hyperthreading can actually give you a speed up. When you are dealing with a heavy memory problem like you are, Hyperthreading can't help and can actually hurt. Check out "pipelining" in CPU architecture.

There are not many practical problems for which you can get a 2x speedup by using 2-cpus. The more cpus, the more overhead. In your test case algorithm, I suspect cores are having to wait for the memory subsystem. If you tweak the memory requirements, you will see an increase in performance (and utilization) as you move the memory requirements closer to the CPU cache-size.

matt-dot-net
  • 4,204
  • 21
  • 24
  • I could see this being a problem, the queue's heap isn't exactly local in memory. It alloc's small pieces of memory often, so it would be scattered. I'll try to find a more cache friendly heap implementation and see if that improves it any. – Tocs Feb 19 '13 at 18:32
  • For someone having the same issue with a priority queue, I've written a rough implementation of a QuickHeap, which is a cache oblivious heap. Meaning it makes optimal use of a cache without knowing it's size. http://pastie.org/6316709. Using this priority queue I see nearly 100% CPU usage while pushing and popping. Memory was indeed the problem. – Tocs Feb 22 '13 at 14:26
0

The OS is assigning the processing to whichever CPU it wishes to at a moment. Therefore, you are seeing every processor do some amount of work.

Additionally, when you say "drop in performance", have you checked how many contentions the system is creating? You probably are relieving yourself of contentions amongst the threads as well.

Srikant Krishna
  • 881
  • 6
  • 11