24

I have a big matrix A which is 1GB of double values, when I reshape it to different dimensions, it's incredible fast.

A=rand(128,1024,1024);
tic;B=reshape(A,1024,128,1024);toc

Elapsed time is 0.000011 seconds.

How can it be that fast? Another observation, MATLAB uses less memory than it should after running that code and storing two matrices of 1GB each: Memory used by MATLAB: 1878 MB (1.969e+09 bytes)

Daniel
  • 36,610
  • 3
  • 36
  • 69

1 Answers1

41

Explanation of the good performance

Matlab uses copy-on-write whenever possible. If you write expressions like B=A, MATLAB does not copy A, instead both variables A and B are references to the same data structure. Only if one of the two variables will be modified, MATLAB will create a copy.

Now to the special case of reshape. Here it looks like A and B are not the same, but in memory they are. The underlying array which holds the data is unaffected by the reshape operation, nothing has to be moved: all(A(:)==B(:)). Everything MATLAB has to do when calling reshape is to create a new reference and annotate it with the new dimensions of the matrix. Reshaping a matrix is nothing more than creating a new reference to the input data, which is annotated with the new dimensions. The runtime of reshape is less than 1µs or roughly the time two simple assignments like B=A require. For all practical applications a zero time operation.

>> tic;for i=1:1000;B=reshape(A,1024,128,1024);end;toc
Elapsed time is 0.000724 seconds.
>> tic;for i=1:1000;B=A;end;toc
Elapsed time is 0.000307 seconds.

It is unknown how large such a reference really is, but we can assume it to be within a few bytes.

Other zero cost operations

Functions known to have practically zero cost (both runtime and memory):

  • B=reshape(A,sz)
  • B=A(:)
  • B=A.' - only for Vectors
  • B=A' - only for Vectors of real numbers, without the attribute complex. Use .' instead.
  • B=permute(A,p) - only for the cases where all(A(:)==B(:)).1
  • B=ipermute(A,p) - only for the cases where all(A(:)==B(:)).1
  • B=squeeze(A) 1
  • shiftdim - only for the cases where all(A(:)==B(:)), which are:1
    • used to remove leading singleton dimensions.
    • used with negative second input
    • used without second input argument.

Functions which are "expensive", regardless of the fact that they don't touch the representation in memory (all(A(:)==B(:)) is true)

  • Left sided indexing: B(1:numel(A))=A; 2
  • Right sided indexing other than (:), including B=A(1:end); and B=A(:,:,:); 2

1 Significantly slower runtime than reshape between 1µs and 1ms. Probably because of some constant computation overhead. Memory consumption is practically zero and the runtime is independent from the input size. Operations without this annotation have a runtime below 1µs and roughly equivalent to reshape.

2 Zero cost in OCTAVE

Originally used MATLAB 2013b when writing this post. Confirmed the numbers with MATLAB 2019b.

Daniel
  • 36,610
  • 3
  • 36
  • 69
  • Hi very nice answer! Left-right side indexing is one the most powerful features of MATLAB/OCTAVE, why is it "expensive"? Taking more time than necessary ? [I'm not referring to whole data-copy such as `B(1:numel(A)) = A` which can cost zero-time with B=A, (or variants) as you have explained..] Then how can I reduce the cost of say `B(1:lenx:end, j) = A(i, 1:leny:end).'`? Do you know a (lot) faster method ? – Fat32 Dec 10 '20 at 16:49
  • I don't think you are correct about the zero cost of permute. If you use a true memory profiler with a large array that you permute, you will see that permute actually makes a copy first. In my case, I have a 6GB array being permuted which increases memory usage to 12GB (for just this array) briefly. This is a huge issue for large arrays compared to reshape. – alwaysmvp45 Jul 07 '22 at 20:55
  • @alwaysmvp45: I would be surprised if today's MATLAB-Versions are worse than what I observed in 2013b, but could be. What size of matrix do you use and what is your permute operation? Did you verify that `all(A(:)==B(:))` is in fact true? – Daniel Jul 08 '22 at 21:15
  • I guess I had misunderstood what was meant there but now I see how exclusive that condition is. In what case could you permute a matrix (or N-dimensional array) such that that condition holds? You essentially need extremely high amounts of repetition in extremely structured settings or extremely simplified examples like identity matrices with transposition, in which case you gain nothing by permutation. I can't think of a non-crafted case where you could fit this condition and gain anything from permuting. – alwaysmvp45 Jul 10 '22 at 18:26
  • Why is it not like python's implementation where transpose is only a view of the data? Or even if it isn't the default, it should be an option considering how useful a memory-efficient transposition is. – alwaysmvp45 Jul 10 '22 at 18:30
  • If you want similar stuff in python, use numpy. Check the documentation for strides. – Daniel Sep 02 '22 at 10:08