11

One of the first things one learns about programming efficiently in MATLAB is to avoid dynamically resizing arrays. The standard example is as follows.

N = 1000;

% Method 0: Bad
clear a
for i=1:N
    a(i) = cos(i);
end

% Method 1: Better
clear a; a = zeros(N,1);
for i=1:N
    a(i) = cos(i)
end

The 'Bad' variant here requires O(N^2) time to run, as it must allocate a new array and copy the old values at each iteration of the loop.

My own preferred practice when debugging is to allocate an array with NaN, harder to confuse with a valid value than 0.

% Method 2: Easier to Debug
clear a; a = NaN(N,1);
for i=1:N
    a(i) = cos(i)
end

However, one would naively think that once our code is debugged, we're wasting time by allocating an array and then filling it with 0 or NaN. As noted here, you can perhaps create an uninitialized array as follows

% Method 3 : Even Better?
clear a; a(N,1) = 0;
for i=1:N
    a(i) = cos(i);
end

However, in my own tests (MATLAB R2013a), I notice no appreciable difference between methods 1 and 3, while method 2 takes more time. This suggests that MATLAB has avoided explicitly initializing the array to zero when a = zeros(N,1) is called.

Thus, I'm curious to know

  • What is the optimal way to preallocate an (uninitialized) array in MATLAB? (Most importantly, large arrays)
  • Does this also hold in Octave?
Patrick Sanan
  • 2,375
  • 1
  • 23
  • 25
  • This is interesting. I thought maybe MATLAB was not initializing the zeros until a modification of the matrix was done (similar to how matlab copies matrices), but `tic; a = NaN(1e4); a(1) = 1; toc` is indeed slower than `tic; a = zeros(1e4); a(1) = 1; toc` on my machine. Just as a heads up though, I've really only seen preallocation done with `zeros` so I'm pretty certain there isn't a way to preallocate without initialization unless you were to make a mex routine, but maybe others here will know. – JustinBlaber Aug 16 '13 at 15:30
  • 2
    This is fast becoming a Matlab FAQ and aspects of the question have already been covered here on SO. Elsewhere too, such as on the invaluable Undocumented Matlab blog -- http://undocumentedmatlab.com/blog/allocation-performance-take-2/#more-4086 The comparative speed of the various approaches seem to change as Matlab develops. – High Performance Mark Aug 16 '13 at 16:22
  • @Shai this is about performance of pre-allocation methods, not about the need for pre-allocation. Stop closing questions like this please. – EJG89 Jul 31 '14 at 09:05
  • @EJG89 I feel that the questions are close enough to justify "closing as dup". I understand you disagree with me and this is fine. For these kind of disagreements there is the option to vote for reopen. You are more than welcome to vote for reopen. – Shai Jul 31 '14 at 09:16
  • Close enough? How are different ways of memory pre-allocation and their performance a duplicate of the need of memory pre-allocation. The discussion is on a way more advanced level and I just don't get why you would close questions this hasty. I probably don't have the rep to vote for re-open so you are just playing on your own it seems. You just don't want people to discuss the deeper layers/inner-working of MatLab it seems. The other topic didn't even touch JIT I think while this is the basis of this question. – EJG89 Jul 31 '14 at 09:17
  • @EJG89 I hear you. no need to be upset. ~2K rep from now and you will be able to cast re-open votes. I'll try and help you get there ;) – Shai Jul 31 '14 at 09:25

2 Answers2

8

The test

Using MatLab 2013b I and an Intel Xeon 3.6GHz + 16GB RAM I ran the code below to profile. I distinguished 3 methods and only considered 1D arrays, i.e. vectors. Methods 1 and 2 have been tested using both column vectors and row vectors, i.e. (n,1) and (1,n).

Method 1 (M1R, M1C)

a = zeros(1,n);

Method 2 M2R, M2C

a = NaN(1,n);

Method 3 (M3)

a(n) = 0;

Results

The timing results and number of elements have been plotted on a duuble logarithmic scale in figure timing1D.

timings1d

As shown the third method has assignment almost independent of vector size while the other steadily increase suggesting an implicit definition of the vector.

Discussion

MatLab does a lot of code optimization using JIT (Just in time), i.e. code optimization during run-time. So it is a valid question to pose whether or not the part of the code running faster is due to programming (always the same whether or not optimized) or due to optimization. To test this optimization can be turned off by using feature('accel','off'). The results of running the code again are rather interesting:

timings1Dnoacceleration

It is shown that now Method 1 is optimal, both for row and column vectors. And method 3 behaves like the other methods in the first test.

Conclusion

Optimizing memory pre-allocation is useless and a waste of time since MatLab will optimize for you anyway.

Note that memory should be pre-allocated but the way in which you do it doesn't matter. The performance of pre-allocating memory is largely dependent on whether or not the JIT compiler of MatLab chooses to optimize your code or not. This is fully dependent on all other content of your .m-file since the compiler considers chunks of codes at the time and then tries to optimize (it even has a memory so that running a file several times might result in an even lower execution-time). Also memory pre-allocation is most often a very short process considering performance compared to the calculation performed afterwards

In my opinion memory should be pre-allocated by either using method 1 or method 2 to maintain a readable code and use the function that MatLab help suggests since these are the most likely to be improved in the future.

Code used

clear all
clc
feature('accel','on')

number1D=30;

nn1D=2.^(1:number1D);

timings1D=zeros(5,number1D);

for ii=1:length(nn1D);
    n=nn1D(ii);
    % 1D
    tic
    a = zeros(1,n);
    a(randi(n,1))=1;
    timings1D(1,ii)=toc;
    fprintf('1D row vector method1 took: %f\n',timings1D(1,ii))
    clear a

    tic
    b = zeros(n,1);
    b(randi(n,1))=1;
    timings1D(2,ii)=toc;
    fprintf('1D column vector method1 took: %f\n',timings1D(2,ii))
    clear b

    tic
    c = NaN(1,n);
    c(randi(n,1))=1;
    timings1D(3,ii)=toc;
    fprintf('1D row vector method2 took: %f\n',timings1D(3,ii))
    clear c

    tic
    d = NaN(n,1);
    d(randi(n,1))=1;
    timings1D(4,ii)=toc;
    fprintf('1D row vector method2 took: %f\n',timings1D(4,ii))
    clear d

    tic
    e(n) = 0;
    e(randi(n,1))=1;
    timings1D(5,ii)=toc;
    fprintf('1D row vector method3 took: %f\n',timings1D(5,ii))
    clear e
end
logtimings1D = log10(timings1D);
lognn1D=log10(nn1D);
figure(1)
clf()
hold on
plot(lognn1D,logtimings1D(1,:),'-k','LineWidth',2)
plot(lognn1D,logtimings1D(2,:),'--k','LineWidth',2)
plot(lognn1D,logtimings1D(3,:),'-.k','LineWidth',2)
plot(lognn1D,logtimings1D(4,:),'-','Color',[0.6 0.6 0.6],'LineWidth',2)
plot(lognn1D,logtimings1D(5,:),'--','Color',[0.6 0.6 0.6],'LineWidth',2)
xlabel('Number of elements (log10[-])')
ylabel('Timing of each method (log10[s])')
legend('M1R','M1C','M2R','M2C','M3','Location','NW')
title({'Various methods of pre-allocation in 1D','nr. of elements vs timing'})
hold off

Note

The lines containing c(randi(n,1))=1; do not do anything except assigning the value one to a random element in the pre-allocated array so that the array is used to challenge the JIT compiler a bit. These lines are not affecting the pre-allocation measurement significantly, i.e. they are not measurable and do not effect the test.

EJG89
  • 1,189
  • 7
  • 17
  • Nice test, but `a(randi(n,1))=1;` only initializes one random element of the vector does it? Sounds quite arbitrary to me. I'd rather initialize every element with something like `2*i+1` to prevent some possible optimizations. – Mathias Jul 31 '14 at 08:55
  • That line is meant to assign 1 to a random element so that the JIT optimization does not detect that nothing is done with the arrays. It does not initialize anything but only assigns one to a pre-allocated array of zeros or nans. – EJG89 Jul 31 '14 at 09:01
  • I added a note explaining the purpose of these lines. – EJG89 Jul 31 '14 at 09:03
  • What if matlab internally generates a sparse matrix for M3 and only allocates full if enough values are written? Setting one random value doesn't prevent that. – Mathias Jul 31 '14 at 09:03
  • MatLab does not generate a sparse matrix, since I also mapped the memory and I see it is pre-allocated (i.e. reserved) and increasing memory. The random value is more like a flag: I do use this array do not leave it out completely. – EJG89 Jul 31 '14 at 09:10
  • In that case, it sounds sufficient to me. Thanks for the explanation and the good answer. – Mathias Jul 31 '14 at 09:13
1

How about letting Matlab take care of the allocation for you?

clear a;
for i=N:-1:1
    a(i) = cos(i);
end

Then Matlab can allocate and fill the array with whatever it thinks will be optimal (probably zero). You don't have the debugging advantage of NaNs, though.

  • Have you timed this? I would assume it's the same as method (3) – Patrick Sanan Jul 30 '14 at 16:12
  • 2
    I haven't timed it, but I also expect so (for most cases). The only possible benefit is that you are guaranteed to get a variable of the correct class -- assigning `a(n) = 0` may not, and may then require re-allocation. – Dylan Richard Muir Aug 06 '14 at 18:49