
I encountered a weird bug in cell vectorization (MATLAB version R2019b).

Please consider the following minimal example. Say we generate a cell array with a variable-length vector in each cell:

N = 10000;
rng(1);
result = cell(N,1);
numConnect = randi(10, [N,1]); % randomly generated number of connected nodes
for i = 1:N
  result{i} = randi(N, [1, numConnect(i)]);
end

Now we want to retrospectively retrieve numConnect, i.e., the length of the vector in each cell, which we can do with cellfun. According to this documentation, in backward-compatibility mode you can pass a function name as a character vector for the func argument instead of a function handle. However, the two forms differ drastically in performance on my machine.

tic;
nC1 = cellfun('length', result);
toc;

This one usually produces something like

Elapsed time is 0.038531 seconds.

If I change to a function handle (@length):

tic;
nC2 = cellfun(@length, result);
toc;

Then a typical result is

Elapsed time is 1.041925 seconds.

That is roughly a 27x difference!
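A single tic/toc pair can be noisy, so a more robust comparison might look like the sketch below. It assumes the same test data as above and uses timeit, which calls each function several times and reports a typical time. The variable names tLegacy and tHandle are introduced here for illustration only, and the actual numbers will vary by machine and release.

```matlab
% Rebuild the test data from the example above.
N = 10000;
rng(1);
result = cell(N,1);
numConnect = randi(10, [N,1]);
for i = 1:N
  result{i} = randi(N, [1, numConnect(i)]);
end

% timeit runs each call several times and returns a typical run time.
tLegacy = timeit(@() cellfun('length', result));   % function name as text
tHandle = timeit(@() cellfun(@length, result));    % function handle

% Both forms should return the same lengths, equal to numConnect.
nC1 = cellfun('length', result);
nC2 = cellfun(@length, result);
assert(isequal(nC1, nC2));
assert(isequal(nC1, numConnect));

fprintf('legacy name: %.6f s, function handle: %.6f s, ratio: %.1fx\n', ...
        tLegacy, tHandle, tHandle / tLegacy);
```

The isequal checks confirm that only the speed differs between the two forms, not the output.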

Is this performance difference a bug on my local machine, or a "feature" of MATLAB's cellfun?

Shuhao Cao
  • My times are vastly faster than yours (100x), but I do see a 65x time difference. This is normal. There is overhead in calling a function, and there's even more overhead in calling a function through a handle. Using `cellfun` with `'length'` doesn't call a function, it just loops over elements and directly determines the length of each. – Cris Luengo May 25 '20 at 19:54
  • Yes, this is well known. See [here](http://undocumentedmatlab.com/articles/cellfun-undocumented-performance-boost), [here](https://stackoverflow.com/a/18284028/2586922) and [here](https://medium.com/mathworks/which-way-to-compute-cellfun-or-for-loop-bfedfd4b46c0). By the way, `cellfun` is not vectorization – Luis Mendo May 25 '20 at 20:45
  • Thanks. I see. Kinda lost in the syntax. – Shuhao Cao May 25 '20 at 20:57
  • I have checked in the latest version 2020a and it did not change since. Slower using the function handle. – oro777 May 25 '20 at 22:53
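
To illustrate the point made in the comments, the sketch below computes the same lengths with an explicit loop, which is conceptually what the first comment says `cellfun` does for `'length'`. It is only an illustration on a small made-up cell array `c` (not data from the question), and it does not show how `cellfun` is implemented internally.

```matlab
% Small illustrative cell array (hypothetical data, not from the question).
c = {1:3, [], 'abcd', zeros(2,5)};

% Explicit loop: query the length of each element directly.
n = zeros(size(c));
for k = 1:numel(c)
  n(k) = length(c{k});
end

% The loop gives the same result as either cellfun form.
assert(isequal(n, cellfun('length', c)));
assert(isequal(n, cellfun(@length, c)));
```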
