I have some sample MATLAB code in which I measure the running time of only the parallel part, as shown below:
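N = 16;                          % number of workers to request
c = parcluster('local');
c.NumWorkers = N;
parpool(c, c.NumWorkers);
M = 32;                          % number of parfor iterations
C = cell(1, M);                  % preallocate the output cell array
tic;                             % start timing: only the parfor loop is measured
parfor ii = 1:M
    A = rand(10^4, 10^3);
    B = rand(10^3, 10^4);
    C{ii} = A*B;                 % each result is a 10^4-by-10^4 matrix
end
time = toc;
disp(time)
delete(gcp('nocreate'));         % shut down the pool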
The problem is that when I change N (NumWorkers) to 32, the running time is the same as for N = 16. I have even changed the number of workers in the parallel preferences from 16 to 32. I have access to several clusters with 64 cores, 48 cores, ..., and the result is the same on all of them. Any help would be appreciated.
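To rule out the pool silently starting fewer workers than requested, a check along the following lines might help. It is only a minimal sketch: it assumes the Parallel Computing Toolbox is available, reads the `NumWorkers` property of the pool object returned by `parpool`, and repeats the same loop for both worker counts so the two timings can be compared in one run.

```matlab
% Sketch: confirm the pool size actually in use and time the same loop
% for each requested size. Assumes the Parallel Computing Toolbox.
M = 32;                               % same iteration count as in the question
for N = [16 32]                       % worker counts being compared
    c = parcluster('local');
    c.NumWorkers = N;                 % upper limit for this cluster object
    p = parpool(c, N);                % request N workers
    fprintf('requested %d, running %d workers\n', N, p.NumWorkers);

    C = cell(1, M);                   % preallocate the output cell array
    tic;                              % time only the parfor loop
    parfor ii = 1:M
        A = rand(10^4, 10^3);
        B = rand(10^3, 10^4);
        C{ii} = A*B;
    end
    fprintf('N = %d: %.2f s\n', N, toc);

    delete(p);                        % shut the pool down before the next run
end
```

If the reported worker count matches the requested one for both values of N and the times are still equal, the number of workers is probably not what limits the loop.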