Consider the following code:

>>import numpy as np
>>x=np.array([1,3]).reshape(2,1)
>>x
array([[1],
       [3]])
>>M=np.array([[1,2],[3,4]])
>>M
array([[1, 2],
       [3, 4]])
>>y=M[:,0]
>>x-y
array([[ 0, -2],
       [ 2,  0]])

I would intuitively feel this should give a (2,1) vector of zeros.

I am not saying, however, that this is how it should be done and everything else is stupid. I would simply love it if someone could offer some logic that I can remember, so that things like this don't keep producing bugs in my code.

Note that I am not asking how I can achieve what I want (I could reshape y); I am hoping to get some deeper understanding of why Python/NumPy works as it does. Maybe I am doing something conceptually wrong?

Bananach
  • Look at [this](http://stackoverflow.com/questions/3551242/numpy-index-slice-without-losing-dimension-information) post: using `y = M[:, 0, None]` will give you the desired result (shown in [this answer](http://stackoverflow.com/a/3551859/6779606)). Alternatively, `y = M[:, [0]]` also works (as per [this answer](http://stackoverflow.com/a/18183182/6779606)). So to answer your question, it seems like dimension information is lost at some point when slicing, but I don't know why NumPy developers decided to do that. – Stephen B Sep 26 '16 at 15:13

2 Answers

Look at the shape of y: it is (2,), i.e. 1-D. The source array is (2,2), but you are selecting one column. M[:,0] not only selects the column, but also removes that singleton dimension.

So for the two operations we have these changes in shape:

M[:,0]: (2,2) => (2,)
x - y: (2,1), (2,) => (2,1), (1,2) => (2,2)
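
As a minimal sketch of that second step, the snippet below reuses x, M and y from the question and asks NumPy for the broadcast shape directly. The names result_shape, explicit and same are only illustrative.

```python
import numpy as np

x = np.array([1, 3]).reshape(2, 1)   # shape (2, 1)
M = np.array([[1, 2], [3, 4]])
y = M[:, 0]                          # shape (2,)

# Broadcasting lines shapes up from the trailing dimension,
# so the 1-D y is treated as if it had shape (1, 2).
result_shape = np.broadcast(x, y).shape

# Writing the implied reshape out by hand gives the same array.
explicit = x - y.reshape(1, 2)
same = np.array_equal(x - y, explicit)

print(y.shape, result_shape, same)   # (2,) (2, 2) True
```

Each entry of the result is x[i, 0] - y[j], which is why the output is a full 2x2 table of differences rather than a column of zeros.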

There are various ways of ensuring that y has the shape (2,1): index with a list/vector, M[:,[0]]; index with a slice, M[:,:1]; or add a dimension, M[:,0,None].
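
A quick way to check these is to run each one against the x from the question. This is only a sketch; the dictionary name candidates and the loop are illustrative.

```python
import numpy as np

x = np.array([1, 3]).reshape(2, 1)
M = np.array([[1, 2], [3, 4]])

# Three indexing expressions that keep the column as a (2, 1) array.
candidates = {
    "M[:, [0]]": M[:, [0]],          # index with a list
    "M[:, :1]": M[:, :1],            # index with a one-element slice
    "M[:, 0, None]": M[:, 0, None],  # drop the axis, then add a new one
}

for label, y in candidates.items():
    # With matching (2, 1) shapes the subtraction is elementwise.
    print(label, y.shape, (x - y).ravel().tolist())
```

Each line should report shape (2, 1) and a difference of [0, 0].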

Think also about what happens with M[0,:] or M[0,0].
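
For instance, a small sketch with the same M (the names row, col and elem are illustrative):

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])

row = M[0, :]    # integer index on axis 0 drops that axis
col = M[:, 0]    # integer index on axis 1 drops that axis
elem = M[0, 0]   # two integer indices drop both axes

print(row.shape, col.shape, np.shape(elem))   # (2,) (2,) ()
```

A selected row and a selected column both come back as plain 1-D arrays of shape (2,), so nothing in the result records which axis it came from.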

hpaulj
  • There is one important part that I do not understand: Why is an array of size (2,) converted to an array of size (1,2) (in the subtraction of x and y)? – Bananach Sep 29 '16 at 07:52
  • To match the number of dimensions of `x`. It's part of the `broadcasting` rules. – hpaulj Sep 29 '16 at 10:08
  • But why isn't it converted to (2,1)? Converting it to (1,2) seems just as arbitrary as converting it to (1,1,1,1,1,1,1,2) – Bananach Sep 29 '16 at 10:41
  • That way it is predictable. It also fits better with the default `C` order. Matlab is `F` order and expands dimensions on the right. – hpaulj Sep 29 '16 at 13:05
  • Right. Rebuttal to myself: it's not that "(2,) gets converted to (1,2)" but that "(2,), when used together in an operation with (2,1), gets converted to (1,2) by aligning the dimensions tail first". – Bananach Sep 13 '17 at 17:47

numpy.array indexes such that a single value in any position collapses that dimension, while slicing retains it, even if the slice is only one element wide. This is completely consistent, for any number of dimensions:

>> import numpy
>> A = numpy.arange(27).reshape(3, 3, 3)
>> A[0, 0, 0].shape
()

>> A[:, 0, 0].shape
(3,)

>> A[:, :, 0].shape
(3, 3)

>> A[:1, :1, :1].shape
(1, 1, 1)

Notice that every time a single number is used, that dimension is dropped.

You can obtain the semantics you expect by using numpy.matrix, where two single indexes return an order-0 value (a scalar) and all other types of indexing return matrices:

>> M = numpy.asmatrix(numpy.arange(9).reshape(3, 3))

>> M[0, 0].shape
()

>> M[:, 0].shape   # This is different from the array
(3, 1)

>> M[:1, :1].shape
(1, 1)

Your example works as you expect when you use numpy.matrix:

>> x = numpy.matrix([[1],[3]])
>> M = numpy.matrix([[1,2],[3,4]])
>> y = M[:, 0]
>> x - y
matrix([[0],
        [0]])
chthonicdaemon