I'm trying to understand when a reshape in numpy returns a copy and when it returns a view. I tried to work this out by inspecting the base attribute. I expected it to be None when the array is a copy, and the original array when it is a view. However, with the following code:

import numpy

A = numpy.array([[1,2,20],[3,4,40],[5,6,60],[7,8,80],[9,10,100]])
print('A:\n',A)
print('A base:\n', A.base)
print('A initial shape:', A.shape)
B = A.reshape(3,5)
print('B:\n', B)
print('B base:\n', B.base)


C = A[1:3,0:2]
print('C:\n', C)
print('C base:\n', C.base)

D = C.reshape(4,1)
print('D:\n', D)
print('D base:\n', D.base)

I have the following output:

A:
 [[  1   2  20]
 [  3   4  40]
 [  5   6  60]
 [  7   8  80]
 [  9  10 100]]
A base:
 None
A initial shape: (5, 3)
B:
 [[  1   2  20   3   4]
 [ 40   5   6  60   7]
 [  8  80   9  10 100]]
B base:
 [[  1   2  20]
 [  3   4  40]
 [  5   6  60]
 [  7   8  80]
 [  9  10 100]]
C:
 [[3 4]
 [5 6]]
C base:
 [[  1   2  20]
 [  3   4  40]
 [  5   6  60]
 [  7   8  80]
 [  9  10 100]]
D:
 [[3]
 [4]
 [5]
 [6]]
D base:
 [[3 4]
 [5 6]]

I agree that A is a raw array whose base attribute is None, and that B and C are views of A, so their base attribute points to the original A array. However, I don't understand the base attribute of D. I expected D to be a new array rather than a view, but its base attribute points to a matrix [[3 4][5 6]] (which is not C, since C is a view of A, as shown by its base attribute) instead of None. Why is this? Is D a view of a new array that was never defined? Why is D not simply the desired [[3] [4] [5] [6]] array with None as its base?

volperossa
  • [View of a view of a numpy array is a copy?](https://stackoverflow.com/questions/38768815/view-of-a-view-of-a-numpy-array-is-a-copy) – keepAlive Aug 30 '22 at 10:17
  • It does not seem the same question – volperossa Aug 30 '22 at 10:55
  • Does this answer your question? [When will numpy copy the array when using reshape()](https://stackoverflow.com/questions/36995289/when-will-numpy-copy-the-array-when-using-reshape) – Mechanic Pig Aug 30 '22 at 16:14

1 Answer

For B, the base is the A array; the id's match:

In [111]: id(A)
Out[111]: 2579202242096

In [112]: id(B.base)
Out[112]: 2579202242096

For D, the base is a copy of C, same values but different id:

In [113]: id(C)
Out[113]: 2579189968112

In [114]: id(D.base)
Out[114]: 2579204460048

I like to use `__array_interface__` to check the memory buffer of arrays:

In [115]: A.__array_interface__
Out[115]: 
{'data': (2581261858000, False),
 'strides': None,
 'descr': [('', '<i4')],
 'typestr': '<i4',
 'shape': (5, 3),
 'version': 3}

In [116]: B.__array_interface__['data']   # same as A's
Out[116]: (2581261858000, False)

In [117]: C.__array_interface__['data']   # 12 bytes further in
Out[117]: (2581261858012, False)

In [118]: D.__array_interface__['data']   # totally different
Out[118]: (2581264982160, False)
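
If reading raw addresses feels error-prone, the same check can be sketched with `np.shares_memory`. This is only an illustration of the comparison above; the helper `data_pointer` is a name I made up for it, not a numpy function.

```python
import numpy as np

A = np.array([[1, 2, 20], [3, 4, 40], [5, 6, 60], [7, 8, 80], [9, 10, 100]])
B = A.reshape(3, 5)
C = A[1:3, 0:2]
D = C.reshape(4, 1)

def data_pointer(arr):
    # Hypothetical helper: address of the first element of arr.
    return arr.__array_interface__['data'][0]

# True means the array overlaps A's buffer, i.e. it is a view of A.
shares = {name: np.shares_memory(A, arr)
          for name, arr in [('B', B), ('C', C), ('D', D)]}
print(shares)

# C starts at A[1, 0], so its pointer is exactly one row stride past A's.
offset_of_C = data_pointer(C) - data_pointer(A)
print(offset_of_C, A.strides[0])
```

The offset is compared against `A.strides[0]` rather than a literal 12 because the default integer size depends on the platform.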

Looking at strides as well as shape may help.

The inner loop, across the columns of a row, steps by 4 bytes, the size of an int32 (yours might be 8 bytes); going from row to row requires a step of 3*4=12:

In [123]: A.shape, A.strides
Out[123]: ((5, 3), (12, 4))

In [124]: B.shape, B.strides     # 20 is 5*4
Out[124]: ((3, 5), (20, 4))

In [125]: C.shape, C.strides      # same as for A, but just 2 columns
Out[125]: ((2, 2), (12, 4))

In [126]: D.shape, D.strides      # from row to row is 4 bytes
Out[126]: ((4, 1), (4, 4))
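
As a rough sketch of what those numbers mean, the strides a freshly allocated C-ordered array would have can be computed from its shape and item size, and compared with what each array actually reports. The function `contiguous_strides` is a hypothetical helper written for this illustration.

```python
import numpy as np

A = np.array([[1, 2, 20], [3, 4, 40], [5, 6, 60], [7, 8, 80], [9, 10, 100]])
C = A[1:3, 0:2]
D = C.reshape(4, 1)

def contiguous_strides(shape, itemsize):
    # Hypothetical helper: strides of a C-contiguous array of this shape.
    strides = []
    step = itemsize
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return tuple(reversed(strides))

# A owns a packed buffer, so its strides match the computed ones.
print(A.strides, contiguous_strides(A.shape, A.itemsize))

# C keeps A's row stride even though it has only 2 columns,
# so it is not packed: each row is followed by a gap.
print(C.strides, contiguous_strides(C.shape, C.itemsize))
print(C.flags['C_CONTIGUOUS'], D.flags['C_CONTIGUOUS'])
```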

Look at the values ravelled:

In [132]: A.ravel()
Out[132]: 
array([  1,   2,  20,   3,   4,  40,   5,   6,  60,   7,   8,  80,   9,
        10, 100])

In [133]: C.ravel()
Out[133]: array([3, 4, 5, 6])

While it's possible to select C from A using 2d slicing, we can't do the same with the ravelled array: [3, 4, _, 5, 6]. The selection is take 2, skip 1, take 2. That's not a regular pattern. A view is possible only when the selection can be expressed as a regular 1d slice (start, stop, step).
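
To make the "take 2, skip 1, take 2" pattern concrete, here is a small sketch that computes where C's elements sit in the ravelled A. A single column is included as a contrasting case that I added: its positions are evenly spaced, and reshaping it does give a view.

```python
import numpy as np

A = np.array([[1, 2, 20], [3, 4, 40], [5, 6, 60], [7, 8, 80], [9, 10, 100]])

# Row/column indices covered by C = A[1:3, 0:2].
rows, cols = np.meshgrid(np.arange(1, 3), np.arange(0, 2), indexing='ij')
flat = np.ravel_multi_index((rows.ravel(), cols.ravel()), A.shape)
gaps = np.diff(flat)
print(flat, gaps)   # positions 3, 4, 6, 7: the steps between them are uneven

# Contrast (my own example): one column has evenly spaced positions,
# so a reshape of it can still be a view of A.
col = A[:, 0]
col_view = col.reshape(5, 1)
print(np.shares_memory(A, col_view))
```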

The reshape documentation says it can't always return a view. A reshape after a transpose is one well-known case of this. This is another: a reshape after a subsetting slice.
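
A minimal sketch of the transpose case, using `np.shares_memory` as the test for "is this still a view of A". The variable names are just for illustration.

```python
import numpy as np

A = np.array([[1, 2, 20], [3, 4, 40], [5, 6, 60], [7, 8, 80], [9, 10, 100]])

# The transpose itself is a view: same buffer, swapped strides.
T = A.T
print(np.shares_memory(A, T))

# Flattening the transpose in C order needs the elements in an order
# that no single stride over A's buffer can produce, so numpy copies.
flat_T = T.reshape(-1)
print(np.shares_memory(A, flat_T))

# Flattening A itself is fine, because A is already packed in that order.
flat_A = A.reshape(-1)
print(np.shares_memory(A, flat_A))
```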

hpaulj
  • ok, D is a copy of C, but why not an independent array, i.e. why D base is not None? – volperossa Aug 30 '22 at 16:49
  • I don't know the details, but it looks like the `reshape` makes a copy of `C`, and then applies the reshape. Hence the base is that copy. In effect it is `C.copy().reshape(...)`. – hpaulj Aug 30 '22 at 17:30
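
The behaviour described in that last comment can be checked with a short sketch. It only inspects what `base` reports and makes no claim about numpy's internals; `hidden` and `E` are names chosen for this example.

```python
import numpy as np

A = np.array([[1, 2, 20], [3, 4, 40], [5, 6, 60], [7, 8, 80], [9, 10, 100]])
C = A[1:3, 0:2]
D = C.reshape(4, 1)

# D's base is an intermediate array that is neither A nor C.
hidden = D.base
print(hidden is C, hidden is A)

# That intermediate array owns its data and has no base of its own,
# so D is a view of a private copy, not a view of A.
print(hidden.base is None, hidden.flags['OWNDATA'], D.flags['OWNDATA'])

# If an array with base None is wanted, an explicit copy gives one.
E = D.copy()
print(E.base is None)
```

So `base is None` tells you whether an array owns its buffer, not whether a reshape copied; comparing against the original with `np.shares_memory` answers the copy-or-view question more directly.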