
I am reading about numpy's reshape method. Here is the quote:

Note that for this (i.e. reshaping) to work, the size of the initial array must match the size of the reshaped array. Where possible, the reshape method will use a no-copy view of the initial array, but with noncontiguous memory buffers this is not always the case.

I did a few simple tests and it seems reshape indeed does not create a copy; the memory is shared.

So what does that part mean, "but with noncontiguous memory buffers this is not always the case"? What is an example where reshape does create a copy of the data? And what are the rules really, i.e. when exactly does it create a copy and when not?

peter.petrov
  • Does this answer your question? [When will numpy copy the array when using reshape()](https://stackoverflow.com/questions/36995289/when-will-numpy-copy-the-array-when-using-reshape) – Jan Christoph Terasa Oct 05 '21 at 08:54
  • @JanChristophTerasa Seems that question/answer does talk about the same thing but I don't really understand it. What are they talking about there? I need some simple explanation when is it doing a copy and when not. I will reread that answer anyway. – peter.petrov Oct 05 '21 at 09:02
  • What is this order C and order F? And then it says "It will do a copy if the initial order is so 'messed up' that it can't return values like this." What does that mean?! Sounds totally informal to me. – peter.petrov Oct 05 '21 at 09:04
  • C order and F(ortran) order refer to [row-major and column-major order](https://en.wikipedia.org/wiki/Row-_and_column-major_order) of multidimensional arrays. Contiguous means the array is stored as one "block" in memory. Special indexing (e.g. transposition) gives you non-contiguous views of an array. See also: https://stackoverflow.com/questions/26998223/what-is-the-difference-between-contiguous-and-non-contiguous-arrays – Jan Christoph Terasa Oct 05 '21 at 09:17

2 Answers


Example of a view vs copy

Here is an example of a copy being created by a reshape operation. We can check whether two arrays share memory with np.shares_memory. If it returns True, one of them is a view of the other; if False, one is a copy of the other, stored in separate memory. Meaning, any operations on one don't reflect on the other.

import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
b = a.T

arr1 = a.reshape((-1,1))
print('Reshape of original is a view:', np.shares_memory(a, arr1))

print('Transpose sharing memory:', np.shares_memory(a,b))

arr2 = b.reshape((-1,1))
print('Reshape of transpose is a view:', np.shares_memory(b, arr2))
Reshape of original is a view: True    #<- a, a.reshape share memory
Transpose sharing memory: True         #<- a, a.T share memory
Reshape of transpose is a view: False  #<- a.T, a.T.reshape DON'T share memory
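
What sharing memory means in practice (a small follow-up sketch continuing the example above): writes through a view land in the original buffer, while writes to a copy do not.

arr1[0, 0] = 99     # arr1 is a view, so this writes into a's buffer
print(a[0, 0])      # 99 -> the change shows up in a

arr2[0, 0] = 0      # arr2 is a copy, so this writes to separate memory
print(b[0, 0])      # 99 -> b (and a) are unaffected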

EXPLANATION:

How does NumPy store arrays?

NumPy stores an ndarray as a contiguous block of memory. The elements are laid out sequentially, each one starting a fixed number of bytes after the previous.

(images referenced from this excellent SO post)

So if your 3D array looks like this -

np.arange(0,16).reshape(2,2,4)

#array([[[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7]],
#
#       [[ 8,  9, 10, 11],
#        [12, 13, 14, 15]]])

[figure: the 2x2x4 array visualized as a 3-D grid of cells]

Then in memory it is stored as -

[figure: the same 16 elements laid out as one flat, contiguous block in memory]

When retrieving an element (or a block of elements), NumPy computes how many bytes (the stride for that axis) it must jump to reach the next element along a given direction/axis. For the above example with 8-byte elements, it traverses 8 bytes along axis=2, 8*4 = 32 bytes along axis=1, and 8*8 = 64 bytes along axis=0.
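
These step sizes are exposed directly as the strides attribute (a quick check; the 8-byte figures assume the default integer dtype is 8 bytes, as on most 64-bit platforms):

arr = np.arange(0, 16).reshape(2, 2, 4)
print(arr.itemsize)   # 8 bytes per element
print(arr.strides)    # (64, 32, 8): bytes to step along axis 0, 1 and 2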

Almost all NumPy operations rely on this storage layout. So, when it has to work with elements that do not sit in one contiguous block of memory, NumPy is sometimes forced to create a copy instead of a view.

Why does reshape sometimes create a copy?

Coming back to the example I showed above with a and a.T, let's look at the first case. We have an array a, which is stored as a contiguous block of memory as below -

[figure: a's eight elements stored contiguously in row-major order]

A reshape has to hand back the elements in the requested (C) order. For a, that order is exactly how the bytes already sit in memory, so NumPy can return a view. For a.T, the C-order sequence of elements (1, 5, 2, 6, ...) jumps back and forth through the original buffer and cannot be described by a fixed stride per axis, so NumPy is forced to create a copy. This is why reshaping a.T produces a copy in this case.
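
You can see this in the array metadata (a small check, again assuming 8-byte integers): transposing only swaps the strides, and the result is no longer C-contiguous, so a C-order reshape cannot be expressed as a view.

print(a.strides)                # (32, 8) for the 2x4 array
print(b.strides)                # (8, 32): a.T just swaps the strides
print(a.flags['C_CONTIGUOUS'])  # True  -> reshape can return a view
print(b.flags['C_CONTIGUOUS'])  # False -> reshape is forced to copy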

Hopefully this explains your query. I am not that great at articulating, so do let me know which part is confusing and I can edit my answer for a clearer explanation.

Akshay Sehgal

I was going to repeat the core of my linked answer, and say that transpose is the most likely case where reshape will produce a copy.

But then I thought of a case where basic indexing produces a non-contiguous view:

In [181]: x = np.arange(16).reshape(4,4)
In [182]: x1 = x[::2,::2]
In [183]: x
Out[183]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
In [184]: x1
Out[184]: 
array([[ 0,  2],
       [ 8, 10]])
In [185]: x1[0,0]=100
In [186]: x
Out[186]: 
array([[100,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11],
       [ 12,  13,  14,  15]])

x1, created with basic indexing, is a view. But a reshape of it is a copy:

In [187]: x2 = x1.reshape(-1)
In [188]: x2
Out[188]: array([100,   2,   8,  10])
In [189]: x2[0] = 200
In [190]: x1
Out[190]: 
array([[100,   2],
       [  8,  10]])
In [191]: x
Out[191]: 
array([[100,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11],
       [ 12,  13,  14,  15]])

A simple test for a copy is to look at the ravel (ravel is just a 1-d reshape): if the raveled data isn't the same buffer as the original's, or a subset of it, it's a copy.

In [192]: x.ravel()
Out[192]: 
array([100,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15])
In [193]: x1.ravel()
Out[193]: array([100,   2,   8,  10])    # x1 is a view, but a reshape is a copy
In [194]: xt = x.T
In [195]: xt
Out[195]: 
array([[100,   4,   8,  12],
       [  1,   5,   9,  13],
       [  2,   6,  10,  14],
       [  3,   7,  11,  15]])
In [196]: xt.ravel()   # xt is a view, but its reshape is a copy
Out[196]: 
array([100,   4,   8,  12,   1,   5,   9,  13,   2,   6,  10,  14,   3,
         7,  11,  15])
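
An equivalent programmatic check (a sketch using np.shares_memory) compares the underlying buffers instead of eyeballing the raveled values:

print(np.shares_memory(x, x.ravel()))    # True: x is contiguous, its ravel is a view
print(np.shares_memory(x1, x1.ravel()))  # False: x1's ravel had to copy
print(np.shares_memory(xt, xt.ravel()))  # False: the transpose's ravel copies too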

Selecting two rows gives a view, and its reshape is one too:

In [197]: x[:2].ravel()
Out[197]: array([100,   1,   2,   3,   4,   5,   6,   7])
In [198]: x[:2].ravel()[0]=200
In [199]: x
Out[199]: 
array([[200,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11],
       [ 12,  13,  14,  15]])

but not when selecting two columns:

In [200]: x[:,:2].ravel()
Out[200]: array([200,   1,   4,   5,   8,   9,  12,  13])
In [201]: x[:,:2].ravel()[0]=150
In [202]: x
Out[202]: 
array([[200,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11],
       [ 12,  13,  14,  15]])
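
For these examples the contiguity flags predict the outcome before raveling (a quick check):

print(x[:2].flags['C_CONTIGUOUS'])     # True: a row slice stays contiguous, so its ravel is a view
print(x[:, :2].flags['C_CONTIGUOUS'])  # False: a column slice isn't, so its ravel must copy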

As for predicting a copy without actually doing it - I'd depend more on experience than on some code (which I attempted in the linked answer). I have a good idea of how the data is laid out, or can easily test for it.

Keep in mind when this copy/no-copy distinction matters. As I showed, the copy case keeps us from modifying the original array through the result. Whether that's good or not depends on the situation. In other cases we don't really care, and the performance cost of a copy isn't that great.

hpaulj