8

This is a follow-up question to How to append an element to an array in MATLAB? That question addressed how to append an element to an array. Two approaches are discussed there:

A = [A elem]  % for a row array
A = [A; elem] % for a column array

and

A(end+1) = elem;

The second approach has the obvious advantage of being compatible with both row and column arrays.

However, this question is: which of the two approaches is fastest? My intuition tells me that the second one is, but I'd like some evidence for or against that. Any idea?

Community
  • 1
  • 1
jub0bs
  • 60,866
  • 25
  • 183
  • 186
  • I've just come across the following question, one answer to which addresses performance issues: http://stackoverflow.com/questions/16188058/matlab-add-element-to-vector – jub0bs Dec 26 '14 at 13:38

3 Answers3

6

The second approach (A(end+1) = elem) is faster

According to the benchmarks below (run with the timeit benchmarking function from File Exchange), the second approach (A(end+1) = elem) is faster and should therefore be preferred.

Interestingly, though, the performance gap between the two approaches is much narrower in older versions of MATLAB than it is in more recent versions.

R2008a

enter image description here

R2013a

benchmark run in MATLAB R2013a

Benchmark code

function benchmark

    n = logspace(2, 5, 40);
    % n = logspace(2, 4, 40); 
    tf = zeros(size(n));
    tg = tf;

    for k = 1 : numel(n)
        x = rand(round(n(k)), 1);

        f = @() append(x);
        tf(k) = timeit(f);

        g = @() addtoend(x);
        tg(k) = timeit(g);
    end

    figure
    hold on
    plot(n, tf, 'bo')
    plot(n, tg, 'ro')
    hold off
    xlabel('input size')
    ylabel('time (s)')
    leg = legend('y = [y, x(k)]', 'y(end + 1) = x(k)');
    set(leg, 'Location', 'NorthWest');
end

% Approach 1: y = [y, x(k)];
function y = append(x)
    y = [];
    for k = 1 : numel(x);
        y = [y, x(k)];
    end
end

% Approach 2: y(end + 1) = x(k);
function y = addtoend(x)
    y = [];
    for k = 1 : numel(x);
        y(end + 1) = x(k);
    end
end
jub0bs
  • 60,866
  • 25
  • 183
  • 186
2

How about this?

function somescript

RStime = timeit(@RowSlow)
CStime = timeit(@ColSlow)
RFtime = timeit(@RowFast)
CFtime = timeit(@ColFast)

function RowSlow

rng(1)

A = zeros(1,2);
for i = 1:1e5
    A = [A rand(1,1)];
end

end

function ColSlow

rng(1)

A = zeros(2,1);
for i = 1:1e5
    A = [A; rand(1,1)];
end

end

function RowFast

rng(1)

A = zeros(1,2);
for i = 1:1e5
    A(end+1) = rand(1,1);
end

end

function ColFast

rng(1)

A = zeros(2,1);
for i = 1:1e5
    A(end+1) = rand(1,1);
end

end

end

For my machine, this yields the following timings:

RStime =

30.4064

CStime =

29.1075

RFtime =

0.3318

CFtime =

0.3351

The orientation of the vector does not seem to matter that much, but the second approach is about a factor 100 faster on my machine.

Wouter Kuijsters
  • 840
  • 6
  • 20
  • You should use `timeit` instead of `tic`/`toc`, though; [the latter is not a reliable way of measuring execution time](http://stackoverflow.com/questions/18952552/timing-code-in-matlab). – jub0bs Dec 26 '14 at 13:28
  • Thanks for your suggestion. Strangely, for me the speedup is still a factor ~100, whilst running your code only gives a speedup of about a factor 6. – Wouter Kuijsters Dec 26 '14 at 13:39
  • Hmm... weird. I'll try running your benchmark on my machine and I'll report back. +1 anyway. – jub0bs Dec 27 '14 at 05:05
  • @Jubobs: really? That's weird, `rng` is the function that controls the output of `rand`, `randi` and `randn`, and is a core matlab function. I included it in my (R2014b) code to make sure all timings used the same random vector. Perhaps you use an older version of matlab in which the command is not available, then you could use older syntaxes found here: http://nl.mathworks.com/help/matlab/math/updating-your-random-number-generator-syntax.html – Wouter Kuijsters Jan 26 '15 at 17:56
  • I only ran your code in MATLAB R2008a; perhaps it wasn't there yet. I'll try in a more recent version. – jub0bs Jan 26 '15 at 18:15
1

In addition to the fast growing method pointing out above (i.e., A(k+1)), you can also get a speed increase from increasing the array size by some multiple, so that allocations become less as the size increases.

On my laptop using R2014b, a conditional doubling of size results in about a factor of 6 speed increase:

>> SO

GATime =
    0.0288

DWNTime =
    0.0048

In a real application, the size of A would needed to be limited to the needed size or the unfilled results filtered out in some way.


The Code for the SO function is below. I note that I switched to cos(k) since, for some unknown reason, there is a large difference in performance between rand() and rand(1,1) on my machine. But I don't think this affects the outcome too much.

function [] = SO()

    GATime  = timeit(@GrowAlways)
    DWNTime = timeit(@DoubleWhenNeeded)

end


function [] = DoubleWhenNeeded()

    A     = 0;
    sizeA = 1;
    for k = 1:1E5
        if ((k+1) > sizeA)
            A(2*sizeA) = 0;
            sizeA = 2*sizeA;
        end
        A(k+1) = cos(k);
    end

end

function [] = GrowAlways()

    A     = 0;
    for k = 1:1E5
        A(k+1) = cos(k);
    end

end
TroyHaskin
  • 8,361
  • 3
  • 22
  • 22
  • Yep, good point (+1). The problem is that, sometimes, you really only want to add just one element without having to pad with zeros. – jub0bs Dec 27 '14 at 05:06
  • True. Knowing the final size of the array is best practice, in my opinion. But since the question was asking how to efficiently grow an array, I figured the final size of the array was unknown. I got this technique from watching a lecture discussing how Python grows lists in constant-ish time without knowing how large the array will be. – TroyHaskin Dec 27 '14 at 07:26