I have been reading the source code of an open source project recently. When the programmer wanted to convert a row vector like array([0, 1, 2]) to a column vector like array([[0], [1], [2]]), np.reshape(x, (-1, 1)) was used. A comment says that reshape is necessary to preserve data contiguity, whereas [:, np.newaxis] does not preserve it.

I tried both ways, and they seem to return the same results. So what does preserving data contiguity mean here?

CT Zhu
HannaMao
  • Some part of this story is getting lost in the retelling. (For example, the parts about a "row vector" seem incorrect; an actual 1-by-whatever row vector would respond differently to these operations.) – user2357112 Sep 21 '17 at 01:35

1 Answer

Both approaches return views of the exact same data, so 'data contiguity' is likely a non-issue: the underlying data is not changed, only the view of it. See Numpy: use reshape or newaxis to add dimensions.
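One way to check the view claim yourself is sketched below. It assumes a small contiguous 1-D input; the variable names are only illustrative. The flags and strides are printed rather than stated here because the reported contiguity flags may differ between NumPy versions.

```python
import numpy as np

a = np.array([0, 1, 2])

col_reshape = a.reshape(-1, 1)    # column via reshape
col_newaxis = a[:, np.newaxis]    # column via newaxis indexing

# Both results are views onto the buffer owned by `a`, not copies.
print(np.shares_memory(a, col_reshape))
print(np.shares_memory(a, col_newaxis))

# Inspect memory layout; the exact values can depend on the NumPy version.
print(col_reshape.flags['C_CONTIGUOUS'], col_reshape.strides)
print(col_newaxis.flags['C_CONTIGUOUS'], col_newaxis.strides)
```

If both shares_memory calls print True, neither operation copied the data, which is why the results look identical.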

However, .reshape((-1, 1)) may have a practical advantage: it always produces a 2-D array with a single column, regardless of the original shape. With [:, np.newaxis], the result depends on the original shape of the array. Consider the following:

In [1]: import numpy as np

In [3]: a1 = np.array([0, 1, 2])

In [4]: a2 = np.array([[0, 1, 2]])

In [5]: a1.reshape((-1, 1))
Out[5]: 
array([[0],
       [1],
       [2]])

In [6]: a2.reshape((-1, 1))
Out[6]: 
array([[0],
       [1],
       [2]])

In [7]: a1[:, np.newaxis]
Out[7]: 
array([[0],
       [1],
       [2]])

In [8]: a2[:, np.newaxis]
Out[8]: array([[[0, 1, 2]]])
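The same comparison can be summarised by looking only at the resulting shapes. This is a small sketch that reuses a1 and a2 from the session above; the shapes dictionary is just an illustrative helper.

```python
import numpy as np

a1 = np.array([0, 1, 2])      # 1-D, shape (3,)
a2 = np.array([[0, 1, 2]])    # 2-D row, shape (1, 3)

shapes = {
    "a1.reshape((-1, 1))": a1.reshape((-1, 1)).shape,
    "a2.reshape((-1, 1))": a2.reshape((-1, 1)).shape,
    "a1[:, np.newaxis]": a1[:, np.newaxis].shape,
    "a2[:, np.newaxis]": a2[:, np.newaxis].shape,
}

for expr, shape in shapes.items():
    print(expr, "->", shape)
# a1.reshape((-1, 1)) -> (3, 1)
# a2.reshape((-1, 1)) -> (3, 1)
# a1[:, np.newaxis] -> (3, 1)
# a2[:, np.newaxis] -> (1, 1, 3)
```

Only the reshape form gives a (3, 1) column for both inputs; newaxis inserts an axis at a fixed position, so the result follows the input's dimensionality.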
CT Zhu