I have an embarrassingly parallel algorithm that runs inside a Parallel.ForEach block on my 4-core machine in about 5 minutes. When I ran the same code on an 8-core machine, it ran much longer (I gave up after 10 minutes, so I don't know exactly how long). Since .NET uses a shared-memory model for this kind of thing, I'm guessing that contention for main memory is creating a bottleneck.
So my question is: is there a way to make n copies of my data (where n is the number of available cores) and assign one copy to each core, thus removing the bottleneck?
What I basically (and somewhat oxymoronically) want is something like distributed memory, but within a single machine.
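For the per-core-copy idea, one technique worth noting is the `localInit`/`localFinally` overload of `Parallel.ForEach`, which gives each worker thread its own private state so the threads never write to shared data during the loop. A minimal sketch (the summation is just a stand-in for your algorithm; per-thread copies of any lookup data could be made in `localInit`):

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class PerThreadStateDemo
{
    static void Main()
    {
        int[] sourceData = Enumerable.Range(0, 1_000_000).ToArray();
        long total = 0;

        Parallel.ForEach(
            sourceData,
            // localInit: runs once per worker thread and returns that
            // thread's private state (a private copy of shared data
            // could also be created here).
            () => 0L,
            // body: accumulate into the thread-local value only —
            // no locks, no writes to shared memory.
            (item, loopState, localSum) => localSum + item,
            // localFinally: merge each thread's result exactly once.
            localSum => Interlocked.Add(ref total, localSum));

        Console.WriteLine(total);
    }
}
```

This keeps all per-iteration work on thread-private state and pays the synchronization cost only once per thread, in `localFinally`.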
UPDATE
I re-ran the code on the 8-core machine. On my 4-core machine the CPU usage (per Task Manager) maxes out for the duration of the run, but on the 8-core machine it hovered at about 50-60% the whole time. I wonder if this is indicative of something?
UPDATE 2
I implemented MPI.NET in my program and now get 100% CPU usage on all cores, plus I can use cores on other machines.
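For anyone hitting the same problem, the MPI.NET approach looks roughly like the sketch below: each rank is a separate process with its own address space, so there is no shared-memory contention by construction. `ProcessChunk` is a placeholder for the per-rank work; the program is launched under `mpiexec` with one process per core.

```csharp
using System;
using MPI;

class MpiSketch
{
    static void Main(string[] args)
    {
        // Initializes MPI and tears it down on dispose.
        using (new MPI.Environment(ref args))
        {
            Intracommunicator world = Communicator.world;

            // Each process (rank) works on its own slice of the problem
            // in its own private memory.
            int chunkResult = ProcessChunk(world.Rank, world.Size);

            // Combine the per-rank results on rank 0.
            int total = world.Reduce(chunkResult, Operation<int>.Add, 0);
            if (world.Rank == 0)
                Console.WriteLine($"total = {total}");
        }
    }

    // Placeholder for the real algorithm: pretend each rank
    // handles 1/size of the data.
    static int ProcessChunk(int rank, int size)
    {
        return rank;
    }
}
```

Because every rank owns its data outright, this is effectively the "distributed memory within one machine" model from the original question, and the same binary scales out to other machines.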