The link that @mgillson found appears to address the question of 'how do I tell if it made a copy', but not 'how do I predict it' or understand why it made the copy. As for the test, I like to use A.__array_interface__.
A copy is most likely to be a problem if you assign values to the reshaped array, expecting to also change the original. And I'd be hard pressed to find an SO case where that was the issue.
A copying reshape will be a bit slower than a noncopying one, but again I can't think of a case where that slowed down the whole code. A copy could also be an issue if you are working with arrays so big that the simplest operation produces a memory error.
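The copy test mentioned at the top can be wrapped up as a small sketch. The helper name data_pointer is made up for this illustration; np.shares_memory is NumPy's own check and is the more general of the two, since comparing pointers only catches views that start at the same element.

```python
import numpy as np

def data_pointer(a):
    """Address of the first element, as reported by __array_interface__."""
    return a.__array_interface__['data'][0]

x = np.arange(12).reshape(3, 4)
w = x.reshape(4, 3)      # reshape of a C-contiguous array
z = x.T.reshape(3, 4)    # reshape of its transpose

# Same buffer address means no copy was made.
same_pointer_w = data_pointer(w) == data_pointer(x)
same_pointer_z = data_pointer(z) == data_pointer(x)

# np.shares_memory also detects views that start elsewhere in the buffer.
shares_w = np.shares_memory(w, x)
shares_z = np.shares_memory(z, x)

print("w is a view:", same_pointer_w, shares_w)
print("z is a view:", same_pointer_z, shares_z)
```

Here w should report as a view and z as a copy, matching the session output further down.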
After reshaping, the values in the data buffer need to be in a contiguous order, either 'C' or 'F'. For example:
In [403]: np.arange(12).reshape(3,4,order='C')
Out[403]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [404]: np.arange(12).reshape(3,4,order='F')
Out[404]:
array([[ 0,  3,  6,  9],
       [ 1,  4,  7, 10],
       [ 2,  5,  8, 11]])
It will do a copy if the initial order is so 'messed up' that it can't return values like this. Reshape after transpose may do this (see my example below). So might games with stride_tricks.as_strided. Offhand, those are the only cases I can think of.
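As a rough illustration of the as_strided case, here is a sketch that builds overlapping windows over a small array and then flattens them. The array size and window shape are arbitrary choices for this example, not something taken from the linked questions.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(6)
step = x.itemsize

# Three overlapping 4-element windows: rows start at elements 0, 1 and 2.
# Both strides are one element, so the rows reuse the same memory.
windows = as_strided(x, shape=(3, 4), strides=(step, step))

# Flattening needs 12 distinct slots, but the buffer only has 6,
# so no single set of strides can describe the result as a view.
flat = windows.reshape(-1)

windows_is_view = np.shares_memory(windows, x)
flat_is_view = np.shares_memory(flat, x)
print(windows_is_view, flat_is_view)
print(flat)
```

The windows array is still a view of x, but the reshape has to copy.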
In [405]: x=np.arange(12).reshape(3,4,order='C')
In [406]: y=x.T
In [407]: x.__array_interface__
Out[407]:
{'version': 3,
'descr': [('', '<i4')],
'strides': None,
'typestr': '<i4',
'shape': (3, 4),
'data': (175066576, False)}
In [408]: y.__array_interface__
Out[408]:
{'version': 3,
'descr': [('', '<i4')],
'strides': (4, 16),
'typestr': '<i4',
'shape': (4, 3),
'data': (175066576, False)}
y, the transpose, has the same 'data' pointer. The transpose was performed without changing or copying the data; it just created a new object with new shape, strides, and flags.
In [409]: y.flags
Out[409]:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
...
In [410]: x.flags
Out[410]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
...
y is order 'F'. Now try reshaping it:
In [411]: y.shape
Out[411]: (4, 3)
In [412]: z=y.reshape(3,4)
In [413]: z.__array_interface__
Out[413]:
{...
'shape': (3, 4),
'data': (176079064, False)}
In [414]: z
Out[414]:
array([[ 0,  4,  8,  1],
       [ 5,  9,  2,  6],
       [10,  3,  7, 11]])
z is a copy; its data buffer pointer is different. Its values are not arranged in any way that resembles x or y (no 0,1,2,... sequence).
But simply reshaping x does not produce a copy:
In [416]: w=x.reshape(4,3)
In [417]: w
Out[417]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
In [418]: w.__array_interface__
Out[418]:
{...
'shape': (4, 3),
'data': (175066576, False)}
Raveling y is the same as y.reshape(-1); it produces a copy:
In [425]: y.reshape(-1)
Out[425]: array([ 0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11])
In [426]: y.ravel().__array_interface__['data']
Out[426]: (175352024, False)
Assigning values to a raveled array like this may be the most likely case where a copy will produce an error. For example, x.ravel()[::2]=99 changes every other value of x and y (every other column of x, and so every other row of y). But y.ravel()[::2]=0 does nothing because of this copying.
So reshape after transpose is the most likely copy scenario. I'd be happy to explore other possibilities.
edit: y.reshape(-1,order='F')[::2]=0 does change the values of y. With a compatible order, reshape does not produce a copy.
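The three assignments discussed above can be checked in one short script. This is only a sketch that reuses the x and y from the session; the names after_x_ravel and y_ravel_had_no_effect are introduced here for the comparison.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
y = x.T

# x is C-contiguous, so ravel() returns a view and the assignment reaches x.
x.ravel()[::2] = 99
after_x_ravel = x.copy()

# y is F-ordered, so ravel() (default order 'C') returns a copy;
# the assignment lands in a temporary array and is lost.
y.ravel()[::2] = 0
y_ravel_had_no_effect = np.array_equal(x, after_x_ravel)

# Flattening in an order compatible with y's layout gives a view again.
y.reshape(-1, order='F')[::2] = 0

print(y_ravel_had_no_effect)
print(x)
```

After the last line, the 99s written through x.ravel() have been overwritten with 0, both in x and in y, since they share one buffer.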
One answer in @mgillson's link, https://stackoverflow.com/a/14271298/901925, points out that the A.shape=... syntax prevents copying. If it can't change the shape without copying, it will raise an error:
In [441]: y.shape=(3,4)
...
AttributeError: incompatible shape for a non-contiguous array
This is also mentioned in the reshape documentation:
If you want an error to be raise if the data is copied,
you should assign the new shape to the shape attribute of the array::
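A hedged sketch of using shape assignment as a "fail instead of copy" guard. It assumes the behaviour shown in the session above, where the failure surfaces as an AttributeError; the exact message, and whether assigning to shape also emits a warning, may differ between NumPy versions.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
y = x.T

try:
    # y is not C-contiguous, so this shape cannot be set without copying.
    y.shape = (3, 4)
    y_reshaped_in_place = True
except AttributeError as err:
    y_reshaped_in_place = False
    print("refused:", err)

# x is contiguous, so the same kind of assignment succeeds in place.
x.shape = (4, 3)
print(y_reshaped_in_place, x.shape, y.shape)
```

The failed assignment leaves y untouched, so it is safe to fall back to an explicit reshape (and accept the copy) in the except branch.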
SO questions about reshape following as_strided: "reshaping a view of a n-dimensional array without using reshape" and "Numpy View Reshape Without Copy (2d Moving/Sliding Window, Strides, Masked Memory Structures)".
==========================
Here's my first cut at translating shape.c/_attempt_nocopy_reshape into Python. It can be run with something like:
newstrides = attempt_reshape(numpy.zeros((3,4)), (4,3), False)
import numpy   # not aliased to np: the code below uses np as a local variable

def attempt_reshape(self, newdims, is_f_order):
    """Return the strides of a no-copy reshape, or 0 if a copy is needed."""
    newnd = len(newdims)
    newstrides = [0] * newnd

    self = numpy.squeeze(self)
    olddims = self.shape
    oldnd = self.ndim
    oldstrides = self.strides

    # oi to oj and ni to nj give the axis ranges currently worked with
    oi, oj = 0, 1
    ni, nj = 0, 1
    while (ni < newnd) and (oi < oldnd):
        print(oi, ni)
        np = newdims[ni]
        op = olddims[oi]

        # Grow one group or the other until both cover the same element count
        while np != op:
            if np < op:
                # Misses trailing 1s, these are handled later
                np *= newdims[nj]
                nj += 1
            else:
                op *= olddims[oj]
                oj += 1
        print(ni, oi, np, op, nj, oj)

        # Check whether the original axes can be combined
        for ok in range(oi, oj - 1):
            if is_f_order:
                if oldstrides[ok + 1] != olddims[ok] * oldstrides[ok]:
                    # not contiguous enough
                    return 0
            else:
                # C order
                if oldstrides[ok] != olddims[ok + 1] * oldstrides[ok + 1]:
                    # not contiguous enough
                    return 0

        # Calculate new strides for all axes currently worked with
        if is_f_order:
            newstrides[ni] = oldstrides[oi]
            for nk in range(ni + 1, nj):
                newstrides[nk] = newstrides[nk - 1] * newdims[nk - 1]
        else:
            # C order
            newstrides[nj - 1] = oldstrides[oj - 1]
            # for (nk = nj - 1; nk > ni; nk--) {
            for nk in range(nj - 1, ni, -1):
                newstrides[nk - 1] = newstrides[nk] * newdims[nk]

        # The C source has "ni = nj++; oi = oj++;": assign first, then increment.
        ni = nj; nj += 1
        oi = oj; oj += 1
        print(olddims, newdims)
        print(oldstrides, newstrides)

    # Set strides corresponding to trailing 1s of the new shape.
    if ni >= 1:
        print(newstrides, ni)
        last_stride = newstrides[ni - 1]
    else:
        last_stride = self.itemsize   # PyArray_ITEMSIZE(self)

    if is_f_order:
        last_stride *= newdims[ni - 1]

    for nk in range(ni, newnd):
        newstrides[nk] = last_stride
    return newstrides