
I have some sample MATLAB code in which I time only the parallel part, as shown below:

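N = 16;                      % number of workers requested
c = parcluster('local');
c.NumWorkers = N;
parpool(c, c.NumWorkers);

tic;                         % time only the parallel section
M = 32;
parfor ii = 1 : M
    A = rand(10^4,10^3);
    B = rand(10^3,10^4);
    C{ii} = A*B;             % each result is a 10^4-by-10^4 matrix
end
time = toc;
[time]

delete(gcp);                 % shut down the pool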

The problem is that when I change "N" (NumWorkers) to "32", the running time is the same as for "N=16". I have even changed the parallel preferences from 16 to 32. I have access to many clusters with 64 cores, 48 cores, and so on, and the result is the same. Any help would be appreciated.

Ander Biguri
sadegh
  • Do you have that many clusters? They are not fictitious, they are hardware. If I tell my 5-year-old laptop to run 300 clusters it won't, because it can't. – Ander Biguri Mar 19 '16 at 17:46
  • [You are running a multi-core function in your parfor loop](http://stackoverflow.com/a/35236386/2732801). – Daniel Mar 19 '16 at 17:49
  • [related](http://stackoverflow.com/questions/32146555/saving-time-and-memory-using-parfor-in-matlab/32146700#32146700). Also: MATLAB doesn't do hyperthreading, it only considers physical cores. Additionally clusters != cores. For clusters you need to use the Distributed Computing Toolbox. – Adriaan Mar 19 '16 at 17:50
  • I am a PhD student and have access to these clusters provided by my university! – sadegh Mar 19 '16 at 17:51
  • I am new to this and maybe what I said does not make sense! I mean, I have access through "ssh" to a system called "jaynes" and it has 64 cores (I believe!). I ran the above code using that system and I saw no difference between "N=16" and "N=32". How can I use all of those 64 cores? – sadegh Mar 19 '16 at 17:58
  • Did you read the answer I linked? I assume all cores are already in use with N=16. – Daniel Mar 19 '16 at 20:33
  • @Daniel: Thank you so much for that. I read it and I agree about the overhead. Since I am new to MATLAB parallelism (and honestly did not understand some of the technical things that you explained :) ), I simply want to know whether there is any way to run the above code, for example, 2x faster than in the case "N=16". (As I said, I have access to a system with 64 cores provided by my department.) – sadegh Mar 20 '16 at 01:36
  • @sadegh: My point was not the overhead, I directly linked my answer in the other question. Matrix multiplication uses multiple cores, probably all cores are already in use. I don't see any potential for an improvement. – Daniel Mar 20 '16 at 20:06
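
One way to check the claim in the comments that matrix multiplication already uses several cores is to time `A*B` on the client with all computational threads and then with a single thread. The snippet below is only a rough sketch: the matrix sizes are reduced from the question so that it runs quickly, the variable names are made up for illustration, and it says nothing about how the threads are configured on the pool workers, which may differ from the client.

```matlab
% Sketch: does A*B on this machine already run on several threads?
% Assumption: sizes are smaller than in the question to keep the run short.
nThreads = maxNumCompThreads;        % current limit on computational threads
fprintf('Computational threads available: %d\n', nThreads);

A = rand(2000, 500);
B = rand(500, 2000);

tic; C1 = A*B; tAll = toc;           % product with the current thread limit

oldThreads = maxNumCompThreads(1);   % limit to one thread, keep the old value
tic; C2 = A*B; tOne = toc;           % same product, single-threaded
maxNumCompThreads(oldThreads);       % restore the previous setting

fprintf('A*B took %.3f s with %d threads and %.3f s with 1 thread\n', ...
        tAll, oldThreads, tOne);
```

If the single-threaded run is clearly slower, the multiplication alone is already spreading over several cores, which would be consistent with what Daniel describes; if the two times are close, the bottleneck probably lies elsewhere.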

0 Answers