I am just learning MATLAB, and I find it hard to understand what drives the performance difference between loops and vectorized functions.
In my previous question, "Nested for loops extremely slow in MATLAB (preallocated)", I found that replacing 4 nested loops with a vectorized function made a 7x difference in running time.
In that example, instead of looping through all dimensions of a 4-dimensional array and calculating the median for each vector, it was much cleaner and faster to just call median(stack, n), where n is the working dimension of the median function.
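To make the comparison concrete, here is a simplified sketch of the two styles. It is not the code from the earlier question: the array size is made up, and the loops run over the three non-working dimensions of a small 4-D array.

```matlab
% Hypothetical small 4-D array standing in for the real data.
stack = rand(4, 5, 6, 7);
n = 4;                       % working dimension
sz = size(stack);

% Loop version: one median call per 1-D vector along dimension 4.
loopResult = zeros(sz(1), sz(2), sz(3));
for i = 1:sz(1)
    for j = 1:sz(2)
        for k = 1:sz(3)
            v = squeeze(stack(i, j, k, :));   % 7-by-1 vector
            loopResult(i, j, k) = median(v);
        end
    end
end

% Vectorized version: a single call with the dimension argument.
vecResult = median(stack, n);

% Both approaches should produce the same 4-by-5-by-6 result.
disp(isequal(loopResult, vecResult));
```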
But median is just a very easy example, and I was lucky that it has this dimension parameter implemented.
My question is: how do you write a function yourself that works as efficiently as one that has this dimension parameter implemented?
For example, suppose you have a function my_median_1D that only works on a 1-D vector and returns a number.
How do you write a function my_median_nD that acts like MATLAB's median, taking an n-dimensional array and a "working dimension" parameter?
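For concreteness, the interface I have in mind could be sketched roughly as follows. This is only one possible approach (permute the working dimension to the front, reshape to columns, apply the 1-D function per column), and it is an assumption on my part, not how MATLAB implements median. The body of my_median_1D is a placeholder that simply calls median. The sketch does not handle empty inputs, and it still calls the 1-D function once per column, so it covers the interface rather than the speed.

```matlab
function y = my_median_nD(x, dim)
% Sketch: apply a 1-D function along dimension DIM of an N-D array.
    nd = max(ndims(x), dim);
    sz = size(x);
    sz(end+1:nd) = 1;                   % pad with trailing singletons if dim > ndims(x)

    % Move the working dimension to the front.
    order = [dim, 1:dim-1, dim+1:nd];
    xp = permute(x, order);

    % Flatten so that each column is one 1-D vector along DIM.
    cols = reshape(xp, sz(dim), []);
    nCols = size(cols, 2);

    y = zeros(1, nCols);
    for c = 1:nCols
        y(c) = my_median_1D(cols(:, c));
    end

    % Restore the original layout, with a singleton in position DIM.
    outSz = sz(order);
    outSz(1) = 1;
    y = ipermute(reshape(y, outSz), order);
end

function m = my_median_1D(v)
% Placeholder for the hypothetical 1-D function from the question.
    m = median(v);
end
```

With this sketch, my_median_nD(rand(3, 4, 5), 2) returns an array of size 3-by-1-by-5, the same shape that median(x, 2) gives.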
Update
I found the code for calculating the median in higher dimensions:
    % In all other cases, use linear indexing to determine exact location
    % of medians.  Use linear indices to extract medians, then reshape at
    % end to appropriate size.
    cumSize = cumprod(s);
    total = cumSize(end);            % Equivalent to NUMEL(x)
    numMedians = total / nCompare;

    numConseq = cumSize(dim - 1);    % Number of consecutive indices
    increment = cumSize(dim);        % Gap between runs of indices
    ixMedians = 1;

    y = repmat(x(1),numMedians,1);   % Preallocate appropriate type

    % Nested FOR loop tracks down medians by their indices.
    for seqIndex = 1:increment:total
        for consIndex = half*numConseq:(half+1)*numConseq-1
            absIndex = seqIndex + consIndex;
            y(ixMedians) = x(absIndex);
            ixMedians = ixMedians + 1;
        end
    end

    % Average in second value if n is even
    if 2*half == nCompare
        ixMedians = 1;
        for seqIndex = 1:increment:total
            for consIndex = (half-1)*numConseq:half*numConseq-1
                absIndex = seqIndex + consIndex;
                y(ixMedians) = meanof(x(absIndex),y(ixMedians));
                ixMedians = ixMedians + 1;
            end
        end
    end

    % Check last indices for NaN
    ixMedians = 1;
    for seqIndex = 1:increment:total
        for consIndex = (nCompare-1)*numConseq:nCompare*numConseq-1
            absIndex = seqIndex + consIndex;
            if isnan(x(absIndex))
                y(ixMedians) = NaN;
            end
            ixMedians = ixMedians + 1;
        end
    end
Could you explain why this code is so efficient compared to the simple nested loops? It has nested loops just like the other function.
I don't understand how it can be 7x faster, or why it is so complicated.
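As far as I can tell, the index arithmetic in that snippet walks over an array that has already been sorted along dim and picks out the element at position half+1 along that dimension using linear indices. The small check below is my own reconstruction of that idea, not part of the MATLAB source: the array size is arbitrary, and I assume half = floor(nCompare/2) and nCompare = size(x, dim), which the snippet does not show.

```matlab
% Hypothetical 3-D array, sorted along the working dimension.
dim = 2;
x = sort(rand(3, 5, 4), dim);

s = size(x);
nCompare = s(dim);                 % assumed: number of elements along dim
half = floor(nCompare / 2);        % assumed definition of half

cumSize = cumprod(s);
total = cumSize(end);
numConseq = cumSize(dim - 1);      % stride of dimension dim
increment = cumSize(dim);          % stride of dimension dim+1

% Collect the linear indices visited by the first nested loop.
idx = zeros(total / nCompare, 1);
k = 1;
for seqIndex = 1:increment:total
    for consIndex = half*numConseq:(half+1)*numConseq-1
        idx(k) = seqIndex + consIndex;
        k = k + 1;
    end
end

picked = reshape(x(idx), [s(1) 1 s(3)]);
direct = x(:, half + 1, :);        % same elements via subscripts

% With an odd length along dim and no NaNs, both checks should be true.
disp(isequal(picked, direct));
disp(isequal(picked, median(x, dim)));
```

The point of the sketch is that the loops only copy one element per output value; the expensive part (sorting) has presumably already been done in a single call on the whole array.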
Update 2
I realized that median was not a good example, as it is a complicated function itself, requiring sorting of the array or other neat tricks. I re-did the tests with mean instead, and the results are even more crazy: 19 seconds vs. 0.12 seconds. That means the built-in vectorized call is about 160 times faster than the nested loops.
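A scaled-down version of this kind of test might look like the sketch below. The array size is a made-up example rather than my real data, so the timings it prints will not match the numbers above.

```matlab
% Hypothetical 4-D array; the real data set is much larger.
stack = rand(20, 20, 20, 50);
sz = size(stack);

% Four nested loops: accumulate along dimension 4 by hand.
tic;
loopMean = zeros(sz(1), sz(2), sz(3));
for i = 1:sz(1)
    for j = 1:sz(2)
        for k = 1:sz(3)
            acc = 0;
            for t = 1:sz(4)
                acc = acc + stack(i, j, k, t);
            end
            loopMean(i, j, k) = acc / sz(4);
        end
    end
end
tLoop = toc;

% Vectorized: one call with the dimension argument.
tic;
vecMean = mean(stack, 4);
tVec = toc;

fprintf('loops: %.4f s, vectorized: %.4f s\n', tLoop, tVec);
fprintf('max abs difference: %g\n', max(abs(loopMean(:) - vecMean(:))));
```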
It is really hard for me to understand how an industry-leading language can have such an extreme performance difference based on programming style, but I see the points made in the answers below.