parfeval slower than parfor

Question

I'm comparing execution times between two blocks, one using parfor, the other doing the same thing by issuing parfeval calls and fetching the outputs:

parfor k = 1:N
    a = rand(5000);
    b = inv(a);
end

vs.

for k = 1:N
    a = rand(5000);
    F(k) = parfeval(p,'inv',1,a);
end
for k = 1:N
    [completedIdx,value] = fetchNext(F);
    fprintf(1,'%d   ',completedIdx);
end

The parfor version is consistently faster. Any insights into why that is the case? My simplistic understanding is that parfor essentially runs each loop iteration as a parallel job.

Relevant read: http://stackoverflow.com/questions/32146555/saving-time-and-memory-using-parfor-in-matlab/32146700#32146700 The `inv` operation is indeed implicitly multithreaded, so using `parfor` on that doesn't make it go faster (see the post I linked). — Adriaan, Sep 20 '16 at 12:13
Makes perfect sense. I looked but didn't find inv among the multithreaded functions but I trust you on it. Still I'm not sure if that explains the slower parfeval compared to parfor — Pooya Ir, Sep 21 '16 at 02:45

score 0 · Answer 1 · edited May 23 '17 at 12:33

Your understanding is correct.

By running a loop with a parfeval, you aren't getting the advantage of the power of the parallel computing toolbox.

In the first case, inverting a 5000 x 5000 matrix looks computationally intensive, but MATLAB is optimized for these types of operations (particularly matrix operations).

It is well understood that the one weakness of MATLAB is looping. Using parfeval in your second use case means that you are evaluating the inverse of each matrix sequentially (even though you are parallelizing the inverse function).

By using parfor, you gain the advantage of parallelizing the most time-consuming aspect of the code.

I would venture that parfeval would outperform parfor only in cases where size(a) >> N.

Edit: @Adriaan makes a great point as well. `inv` is also an implicitly parallelized function, like most MATLAB functions.
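One way to check the implicit multithreading point on a given machine is to time `inv` with the default number of computational threads and again with a single thread. This is only a rough sketch: the matrix size `n = 2000` is an assumption chosen to keep the run short (the question uses 5000), and single-shot `tic`/`toc` timings are noisy.

```matlab
% Sketch: time inv with the default thread count and with one thread.
n = 2000;                       % assumed size, smaller than the 5000 in the question
a = rand(n);

nDefault = maxNumCompThreads;   % current maximum number of computational threads
tic; b = inv(a); tDefault = toc;

oldN = maxNumCompThreads(1);    % restrict to one thread; returns the previous setting
tic; b = inv(a); tSingle = toc;
maxNumCompThreads(oldN);        % restore the previous setting

fprintf('%d thread(s): %.2f s, 1 thread: %.2f s\n', nDefault, tDefault, tSingle);
```

If the single-thread run is clearly slower, `inv` is benefiting from multiple threads on the client. Pool workers run single-threaded by default, which is worth keeping in mind when comparing client and worker timings.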

answered Sep 19 '16 at 22:16 by zglin · edited May 23 '17 at 12:33 by Community

I think I am not quite understanding your response: "using parfeval in your second use case means that you are evaluating the inverse of each matrix sequentially (even though you are parallelizing the inverse function)" I thought I was parallelizing by calling parfeval, how is it still not parallelized? – Pooya Ir Sep 21 '16 at 02:48
I wouldn't go as far as to say that *most MATLAB functions* are parallelised; I'd rather say that most functions which do not show code when you try `edit ` are precompiled functions (in C++ afaik) and they are optimised in terms of speed already. The full list of implicitly multithreaded functions can be found [here](http://nl.mathworks.com/products/parallel-computing/parallel-support.html;jsessionid=c9625ed2c7026e49e0e6921b2e5d) – Adriaan Sep 21 '16 at 11:51
This is completely wrong. The `parfeval` is parallelizing the inverse. – cfp May 27 '19 at 10:35

score 0 · Answer 2 · answered May 27 '19 at 10:39

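Any difference between the two is likely driven by the fact that you are not parallelizing the `rand` call in the second case: the matrix is generated on the client inside the ordinary for loop and then has to be sent to a worker, and the full inverse is sent back.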

For me the following two options take the same amount of time.

Init:

p = gcp;
N = p.NumWorkers;

Option A:

tic;
b = zeros( N, 1 );
parfor k = 1 : N;
    b( k ) = max( max( abs( inv( rand( 5000 ) ) ) ) );
end;
toc;

Option B:

tic;
F = repmat(parallel.FevalFuture,N,1);
for k = 1:N;
    F(k) = parfeval( p, @() max( max( abs( inv( rand( 5000 ) ) ) ) ), 1 );
end;
b = fetchOutputs( F );
toc;
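To see how much of the gap comes from where the matrix is created and what is returned, the two parfeval styles can be timed side by side. This is a sketch under assumptions: the size `n = 2000` is reduced from 5000 to shorten the run, and the names `tClient` and `tWorker` are introduced here for illustration only.

```matlab
p = gcp;                        % current pool (starts one if preferences allow)
N = p.NumWorkers;
n = 2000;                       % assumed size, reduced from 5000

% Variant 1: the matrix is built on the client, so it is sent to a worker
% and the full n-by-n inverse is sent back.
tic;
F1 = repmat(parallel.FevalFuture, N, 1);
for k = 1:N
    a = rand(n);
    F1(k) = parfeval(p, @inv, 1, a);
end
for k = 1:N
    [idx, value] = fetchNext(F1);   % value is an n-by-n matrix
end
tClient = toc;

% Variant 2: the matrix is built on the worker and only a scalar is returned.
tic;
F2 = repmat(parallel.FevalFuture, N, 1);
for k = 1:N
    F2(k) = parfeval(p, @() max(max(abs(inv(rand(n))))), 1);
end
b = fetchOutputs(F2);           % N-by-1 vector of scalars
tWorker = toc;

fprintf('client-side rand: %.2f s, worker-side rand: %.2f s\n', tClient, tWorker);
```

Variant 1 mirrors the loop in the question, and Variant 2 mirrors Option B. Any difference between the two timings reflects the serial `rand` calls on the client plus the transfer of the input and output matrices.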
