I don't see evidence of much difference between the two. You could time them on very large arrays. Basically both adjust the shape, and possibly the strides. __array_interface__ is a convenient way of inspecting this information. For example:
In [94]: b.__array_interface__
Out[94]:
{'data': (162400368, False),
'descr': [('', '<f8')],
'shape': (5,),
'strides': None,
'typestr': '<f8',
'version': 3}
In [95]: b[None,:].__array_interface__
Out[95]:
{'data': (162400368, False),
'descr': [('', '<f8')],
'shape': (1, 5),
'strides': (0, 8),
'typestr': '<f8',
'version': 3}
In [96]: b.reshape(1,5).__array_interface__
Out[96]:
{'data': (162400368, False),
'descr': [('', '<f8')],
'shape': (1, 5),
'strides': None,
'typestr': '<f8',
'version': 3}
Both create a view, using the same data buffer as the original. The shape is the same, but reshape doesn't change the strides. reshape also lets you specify the order. And, in this session, .flags shows a difference in the C_CONTIGUOUS flag.
reshape may be faster because it makes fewer changes. But either way the operation shouldn't noticeably affect the time of larger calculations.
For example, with a large b:
In [123]: timeit np.outer(b.reshape(1,-1),b)
1 loops, best of 3: 288 ms per loop
In [124]: timeit np.outer(b[None,:],b)
1 loops, best of 3: 287 ms per loop
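A quick way to confirm that both expressions return views on the same buffer is np.shares_memory. The following is a minimal sketch; the array b and its size are assumptions chosen for illustration, and the strides entry reported by __array_interface__ may differ between NumPy versions, so it is not checked here.

```python
import numpy as np

b = np.ones(5)               # illustrative array, not the one from the session above

v_index = b[None, :]         # add a leading axis by indexing
v_reshape = b.reshape(1, 5)  # add a leading axis by reshaping

# Both results have the same shape ...
same_shape = v_index.shape == v_reshape.shape == (1, 5)

# ... and both are views: they share memory with b rather than copying it.
index_is_view = np.shares_memory(b, v_index)
reshape_is_view = np.shares_memory(b, v_reshape)

# Writing through either view changes the original array.
v_index[0, 0] = 10.0
v_reshape[0, 1] = 20.0
print(b[:2])
```

Since neither expression copies the data, any timing difference comes only from the bookkeeping around shape and strides.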
Interesting observation: b.reshape(1,4).strides -> (32, 8)
Here's my guess. .__array_interface__ is displaying an underlying attribute, and .strides is more like a property (though it may all be buried in C code). The default underlying value is None, and when it is needed for a calculation (or for display with .strides) it is computed from the shape and item size. 32 is the distance in bytes from the start of one row to the next (4 items x 8 bytes). np.ones((2,4)).strides has the same (32,8) (and None in __array_interface__).
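The "calculated from the shape and item size" step can be sketched as follows. The helper c_order_strides is hypothetical (it is not a NumPy function); it only reproduces the arithmetic for a C-ordered array so the result can be compared with what .strides reports.

```python
import numpy as np

def c_order_strides(shape, itemsize):
    """Byte strides of a C-contiguous array (hypothetical helper, not part of NumPy)."""
    strides = []
    step = itemsize
    # Walk the axes from last to first: the last axis steps by one item,
    # each earlier axis steps over everything inside it.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

a = np.ones((2, 4))  # float64, so itemsize is 8
computed = c_order_strides(a.shape, a.itemsize)

print(computed, a.strides)                # both are (32, 8)
print(a.__array_interface__['strides'])   # None for a C-contiguous array
```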
b[None,:], on the other hand, is preparing the array for broadcasting. When an array is broadcast, existing values are used repeatedly. That's what the 0 in (0,8) does.
In [147]: b1=np.broadcast_arrays(b,np.zeros((2,1)))[0]
In [148]: b1.shape
Out[148]: (2, 5000)
In [149]: b1.strides
Out[149]: (0, 8)
In [150]: b1.__array_interface__
Out[150]:
{'data': (3023336880L, False),
'descr': [('', '<f8')],
'shape': (2, 5),
'strides': (0, 8),
'typestr': '<f8',
'version': 3}
b1 displays the same as np.ones((2,5)) but has only 5 items.
np.broadcast_arrays is a function in /numpy/lib/stride_tricks.py. It uses as_strided from the same file. These functions directly play with the shape and strides attributes.
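To see the zero stride at work, the same kind of view can be built by hand with as_strided. This is only a sketch: the array b below is an assumed 5-element float64 array, and it illustrates the stride trick itself rather than how any particular NumPy version implements broadcast_arrays internally.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

b = np.arange(5, dtype=np.float64)  # illustrative 5-item array

# Let NumPy broadcast b against a (2, 1) array, as in the session above.
b1 = np.broadcast_arrays(b, np.zeros((2, 1)))[0]

# Build an equivalent view by hand: 2 rows, 5 columns, row stride of 0 bytes,
# so every "row" reads the same 5 items. Read-only to avoid accidental writes.
b2 = as_strided(b, shape=(2, 5), strides=(0, b.itemsize), writeable=False)

print(b1.shape, b1.strides)
print(np.array_equal(b1, b2))
print(np.shares_memory(b, b2))
```

as_strided performs no bounds checking, so a wrong shape or strides value can read outside the buffer; np.broadcast_to or np.broadcast_arrays is the safer choice when plain broadcasting is all that is needed.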