5

So reserve is quite useful when you have a rough idea of your size requirements. Does anyone know of a similar method to pre-allocate arrays in MATLAB?

I'm not really interested in hacky (but effective) methods like the following:

x = zeros(1000,1);
for i = 1:10000
    if i > numel(x)
       x = [x;zeros(size(x))];
    end
    x(i) = rand;
end
x(i+1:end) = [];
Jacob
  • 34,255
  • 14
  • 110
  • 165
  • 2
    you might find this answer useful: http://stackoverflow.com/a/1549094/97160 – Amro Feb 13 '12 at 17:08
  • @Amro: Yep, great answer as always. But I was hoping there was some "magic" MATLAB function I had overlooked. – Jacob Feb 13 '12 at 17:29

3 Answers3

3

The "hacky" way is the only way to do it. However, you do not need to check i <= numel(x). The array will be expanded automatically (but without array doubling):

x = zeros(1000,1);
for i = 1:10000
    x(i) = rand;
end
x(i+1:end) = [];

EDIT: To keep it simple while still retaining the array doubling, you can write a class, or simply a few helper functions (below).

EDIT2: The usage of helper functions will slow things down compared to the manual hack. In MATLAB 2010 it is still much faster than naive growth. In MATLAB 2011 the naive approach is actually faster, suggesting that this version has smarter allocation. Perhaps it is fast enough so that no hack is needed at all. Thanks to Andrew Janke for pointing this out.

function listtest()
    n = 10000;
    l = new_list();
    for i=1:n
        l = list_append(l, i);
    end
    a = list_to_array(l);
end

function l = new_list()
    l = [0 0];
end
function l = list_append(l, e)
    if l(1)+1 == length(l)
        l(length(l)*2) = 0;
    end
    l(1) = l(1)+1;
    l(l(1)+1) = e;
end
function a = list_to_array(l)
    a = l(2:1+l(1));
end

EDIT (from AndrewJanke)

Here's code to compare the speed of the implementations.

function manual_reserve_example(n)
x = zeros(1000,1);
for i = 1:n
    if i > numel(x)
       x = [x;zeros(size(x))];
    end
    x(i) = i;
end
x(i+1:end) = [];
end

function naive_growth(n)
x = 0;
for i = 1:n
    x(i) = i;
end
end

function compare_them(n)
fprintf('Doing %d elements in Matlab R%s\n', n, version('-release'));
tic;
naive_growth(n);
fprintf('%30s  %.6f sec\n', 'naive_growth', toc);
tic;
manual_reserve_example(n);
fprintf('%30s  %.6f sec\n', 'manual_reserve', toc);
tic;
listtest(n);
fprintf('%30s  %.6f sec\n', 'listtest', toc);
end
rasmus
  • 3,136
  • 17
  • 22
  • 2
    The check in the original question implemented an algorithm to double the allocation size on overflow, resulting in `O(log(n))` reallocations (4 in this case). Matlab's natural growth only grows one element at a time, for 9000 reallocations in this example. – Pursuit Feb 13 '12 at 17:23
  • @Pursuit: Exactly. This is what `reserve` does as well. – Jacob Feb 13 '12 at 17:28
  • rasmus: -1 Nope. The function call overhead (and probable defeating of the JIT's in-place optimizations) makes this helper function implementation much slower than Jacob's manual expansion, or even the naive non-preallocated `x=0; for i=1:n; x(i)=i; end` code. Did you test this? In Matlab, and other languages for that matter, you need to consider the cost of operations and actually measure the performance of code you think is an optimization. Making a class would be even worse due to higher overhead. – Andrew Janke Feb 13 '12 at 19:58
  • And please don't use lower case "l" as a variable name. It's hard to distinguish "l" (lower ell) and "1" (numeral one), especially in fixed width fonts like the Matlab editor's. Consider `l(1) = l(1)+1; l(l(1)+1) = e;`. That's a bit hard to read. – Andrew Janke Feb 13 '12 at 20:02
  • @AndrewJanke Yes I did test this, and it is fast on MATLAB 2010, much faster than without preallocation. The question is did you test it? – rasmus Feb 13 '12 at 20:33
  • Also, I do not find that statement hard to read at all. The numbers are even in a different color. If you are having trouble, feel free to edit the variable name. – rasmus Feb 13 '12 at 20:44
  • I did, though in R2011b. (Win XP x64, 3GHz Intel X9650.) I've edited the Q to add the test code. I tried it in R2010b and R2009b, too; for n=10000, listtest and naive_growth are both about 100 msec, and manual_reserve (Jacob's example code) is about 1 msec. In R2011b, naive_growth gets faster and functions get slower: ~1 msec for manual_reserve, ~5 msec for naive_growth, ~125 msec for listtest. Which R2010 are you on, and what times are you seeing? – Andrew Janke Feb 13 '12 at 21:03
  • Thanks for the thorough evaluation. There was a mismatch between the code I put here and that I was using. I fixed the if-statement in list_append, it was off by one. Will post my timings using this code. You could try it as well. – rasmus Feb 13 '12 at 21:10
  • Matlab R2010a. naive_growth 6.734811 sec, manual_reserve 0.013166 sec, listtest 0.251233 sec. 100000 elements. Manual is faster (as expected), but list is more convenient and faster than naive. – rasmus Feb 13 '12 at 21:13
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/7701/discussion-between-andrew-janke-and-rasmus) – Andrew Janke Feb 14 '12 at 17:53
1

The cleanest solution to the example that you provided is to iterate backwards.

for i = 10000:-1:1
    x(i) = rand;
end

This does not work in cases where the end size is actually unknown, but it has come in handy for me more often than I would have expected.


Otherwise I usually implement a "double on overflow" algorithm like you show in the original question.

The clean solution is to wrap a Matlab class around a respectible vector resize algorithm, and then use that class. I am not aware of any reason such a class could not be built, but I've never actually sat down and tried to implement one. (I'm curious if an example already exists on the file exchange somewhere.)

Pursuit
  • 12,285
  • 1
  • 25
  • 41
  • some implementations were suggested by @woodchips here: http://stackoverflow.com/a/3251547/97160 – Amro Feb 13 '12 at 17:38
0

There is a way to preallocate memory for a structure in MATLAB 7.6 (R2008a) using the STRUCT and REPMAT commands.

EXAMPLE 1: A structure with two fields

s.field1 s.field2

s = struct('field1',cell(1),'field2',cell(1));

EXAMPLE 2: A structure with a field with a subfield

s.field1.subfield

s = struct('field1',struct('subfield',cell(1)));

EXAMPLE 3: An array of structures

v(1).field1 ... v(100).field1

s = struct('field1',cell(1));
v = repmat(s,100,1);
groovekiller
  • 1,122
  • 2
  • 8
  • 20
  • That doesn't work. Look at the number of bytes allocated to `v` with `whos v` as you add data to it ; it isn't pre-allocated. E.g. `whos v;v(1).field1 = 10;whos v;v(2).field1 = 20;whos v` – Jacob Feb 16 '12 at 19:06