132

I'm using numpy and want to index a row without losing the dimension information.

import numpy as np
X = np.zeros((100,10))
X.shape        # >> (100, 10)
xslice = X[10,:]
xslice.shape   # >> (10,)  

In this example xslice is now 1-dimensional, but I want it to have shape (1,10). In R, I would use X[10,,drop=F]. Is there something similar in numpy? I couldn't find it in the documentation and didn't see a similar question asked.

Thanks!

Joe Kington
mindmatters

7 Answers

123

Another solution is to do

X[[10],:]

or

I = np.array([10])
X[I,:]

The dimensionality of an array is preserved when indexing is performed by a list (or an array) of indexes. This is nice because it leaves you with the choice between keeping the dimension and squeezing.
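
A minimal sketch of the trade-off, using the question's X. Indexing with a list keeps the axis but goes through advanced indexing, which returns a copy, whereas a length-1 slice returns a view. The variable names and the use of np.shares_memory are only there to make that difference visible.

```python
import numpy as np

X = np.zeros((100, 10))

by_list = X[[10], :]    # advanced (list) indexing: axis kept
by_slice = X[10:11, :]  # basic slicing: axis kept

print(by_list.shape)    # (1, 10)
print(by_slice.shape)   # (1, 10)

# Advanced indexing copies the data; basic slicing returns a view.
list_is_view = np.shares_memory(X, by_list)    # False
slice_is_view = np.shares_memory(X, by_slice)  # True
```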

gnebehay
  • 4
    This copies the array data – Per Mar 17 '14 at 12:46
  • 1
    This is not always the case. See: `x = np.array([[1,2,3,4]])` if you then slice it with `x[[0],[1,2]]` you get the one dimensional `array([2, 3])` My opinion is when selecting column or row vectors it's best to make the slice simple and then use `np.reshape`, So in my example it would be `np.reshape(x[0,[1,2]],[1,2])` – Alexander Sep 14 '15 at 01:43
  • 1
    others, be aware of the colon at the end - it is important, `X[[10]]` would be interpreted as `X[10]` and shape will be smaller; similarly, `X[[10, 20]] == X[10, 20]` and shape is even smaller – Ben Usman Jun 15 '18 at 00:34
  • 6
    **Warning**: do not mix this way of indexing with just integer indexing! If you had `a` of shape `(10, 20, 30)`, then `a[0, :, [0]]` will have shape `(1, 20)`, not `(20, 1)`, because in the latter indexes are broadcasted to `a[[0], :, [0]]` which is often not quite what you expect! Whereas `a[0, :, :1]` will give you `(20, 1)` as expected. Moreover, see the above comment for weird edge case with single index. Overall, it seems like this method has too many edge cases. – Ben Usman Jul 26 '18 at 19:14
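
The edge cases raised in these comments can be reproduced with a short sketch. The array shapes come from the comments; the variable names are only illustrative.

```python
import numpy as np

a = np.zeros((10, 20, 30))

mixed = a[0, :, [0]]     # integer and list separated by a slice
sliced = a[0, :, :1]     # integer plus a length-1 slice

print(mixed.shape)       # (1, 20): the broadcast index axis comes first
print(sliced.shape)      # (20, 1)

x = np.array([[1, 2, 3, 4]])
paired = x[[0], [1, 2]]  # two index lists are broadcast together
kept = x[0:1, [1, 2]]    # slice for the row, list for the columns

print(paired.shape)      # (2,)
print(kept.shape)        # (1, 2)
```

Using a slice for every axis that should survive, and a list only where several positions are really needed, avoids the surprising axis order.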
69

It's probably easiest to do x[None, 10, :] or equivalently (but more readable) x[np.newaxis, 10, :]. None or np.newaxis increases the dimension of the array by 1, so that you're back to the original after the slicing eliminates a dimension.
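
As a small illustration (assuming the 100 x 10 array from the question), the position of None or np.newaxis in the index decides where the length-1 axis ends up:

```python
import numpy as np

X = np.zeros((100, 10))

row = X[np.newaxis, 10, :]  # new axis first, then row 10
row_short = X[10, None]     # same result with the trailing ':' omitted
col = X[:, 3, np.newaxis]   # column 3 kept as a column vector

print(row.shape)        # (1, 10)
print(row_short.shape)  # (1, 10)
print(col.shape)        # (100, 1)
```

All three results are views of X, since only basic indexing is involved.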

As far as why it's not the default, personally, I find that constantly having arrays with singleton dimensions gets annoying very quickly. I'd guess the numpy devs felt the same way.

Also, numpy handles broadcasting arrays very well, so there's usually little reason to retain the dimension of the array the slice came from. If you did, then things like:

a = np.zeros((100,100,10))
b = np.zeros((100,10))
a[0,:,:] = b

either wouldn't work or would be much more difficult to implement.

(Or at least that's my guess at the numpy devs' reasoning behind dropping dimension info when slicing.)

Princy
Joe Kington
  • 7
    @Lisa: `x[None, 10]` will do what you want. – naught101 Jun 17 '16 at 01:02
  • Yup. Put your `None`s next to the dims you are chopping. – Mad Physicist Jul 14 '16 at 19:36
  • 1
    The example is missing extra brackets for the tuple in the assignment to `b`; it should be `b = np.zeros((100,10))`. – Jerzy Mar 17 '17 at 20:18
  • What is the reason for using 3 indices in total instead of just two? I mean `X[10,None]` (using your code as example). – greenoldman Sep 02 '17 at 07:57
  • 18
    "*there's usually little reason to retain the dimension of the array*" ... Well it'll certainly, utterly, and completely screw up matrix multiplication ([`np.matmul()` or `@`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html)). Just got burned by this. – Jean-François Corbett Mar 27 '19 at 12:02
  • Agreed. So for `matmul` I should probably always put `None` there. There is no other way around, right? – Dr_Zaszuś Apr 10 '19 at 13:39
  • This is also the way to go in TensorFlow (using `None` or `tf.newaxis`). @gnebehay's solution does not work there. – bers Jan 10 '20 at 12:46
  • Interestingly... when using `numpy.delete()` to remove data, the singleton dimensions are preserved in the new array copy. Are there any intuitive rules as to what situations numpy does/doesn't drop singleton dimensions? Or is this dimension dropping only seen in simple indexing? – RTbecard May 19 '20 at 12:20
  • @RTbecard, squeezing out singleton dimensions is not automatic. There is a separate `np.squeeze` function if you want to do that. The dimension dropping that's being discussed here is unique to indexing. – hpaulj Dec 16 '21 at 17:56
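
The matrix-multiplication concern raised in the comments can be sketched as follows. The weight matrix W and the variable names are hypothetical; only X comes from the question.

```python
import numpy as np

X = np.zeros((100, 10))
W = np.ones((10, 3))   # hypothetical matrix to multiply with

row_1d = X[10, :]      # shape (10,): the row axis is dropped
row_2d = X[10:11, :]   # shape (1, 10): the row axis is kept

print((row_1d @ W).shape)          # (3,)
print((row_2d @ W).shape)          # (1, 3)

# Transposing a 1-D array does nothing, so an outer product needs 2-D input.
print(row_1d.T.shape)              # (10,)
print((row_2d.T @ row_2d).shape)   # (10, 10)
print(np.ndim(row_1d.T @ row_1d))  # 0: an inner product instead
```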
32

I found a few reasonable solutions.

1) use numpy.take(X,[10],0)

2) use this strange indexing X[10:11:, :]
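
A quick check of both options, as a sketch. X is filled with distinct values here (an assumption, unlike the zeros in the question) so that the selected row can be verified:

```python
import numpy as np

X = np.arange(1000).reshape(100, 10)

taken = np.take(X, [10], axis=0)  # option 1: list of indices keeps the axis
sliced = X[10:11, :]              # option 2: length-1 slice keeps the axis

print(taken.shape)                   # (1, 10)
print(sliced.shape)                  # (1, 10)
print(np.array_equal(taken, sliced)) # True
```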

Ideally, this should be the default. I never understood why dimensions are ever dropped. But that's a discussion for numpy...

mindmatters
  • 4
    'dimensions' are dropped when indexing Python lists, `alist[0]` and kept when slicing them. – hpaulj May 13 '18 at 23:04
  • 7
    Option 2 (which can be written as `slice(n, n+1)` for extracting index `n`) should be the accepted answer, as it is the only one that extends naturally to the n-dimensional case. – norok2 Aug 20 '18 at 11:02
  • 1
    Option 2 seems to be able to be written as `X[10:11, :]` in Python 3.7.5 (i.e. without the extra colon after the 11) – Joe Jul 08 '20 at 20:31
18

Here's an alternative I like better. Instead of indexing with a single number, index with a range. That is, use X[10:11,:]. (Note that 10:11 does not include 11).

import numpy as np
X = np.zeros((100,10))
X.shape        # >> (100, 10)
xslice = X[10:11,:]
xslice.shape   # >> (1, 10)

This stays easy to read with more dimensions too: there is no None juggling and no need to work out which index goes with which axis. There is also no extra bookkeeping regarding array size; just use i:i+1 for any non-negative i that you would have used in regular indexing. (For i = -1, write -1: instead, because -1:0 selects nothing.)

b = np.ones((2, 3, 4))
b.shape # >> (2, 3, 4)
b[1:2,:,:].shape  # >> (1, 3, 4)
b[:, 2:3, :].shape  # >> (2, 1, 4)
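
The i:i+1 pattern can be wrapped in a small helper for an arbitrary axis. This is only a sketch: index_keepdims is a hypothetical name, not a NumPy function, and it special-cases i = -1 because the slice -1:0 would be empty.

```python
import numpy as np

def index_keepdims(arr, i, axis=0):
    """Select index i along axis, keeping that axis with length 1."""
    stop = i + 1 or None  # i == -1 would otherwise give the empty slice -1:0
    index = [slice(None)] * arr.ndim
    index[axis] = slice(i, stop)
    return arr[tuple(index)]

b = np.ones((2, 3, 4))
print(index_keepdims(b, 1, axis=0).shape)   # (1, 3, 4)
print(index_keepdims(b, 2, axis=1).shape)   # (2, 1, 4)
print(index_keepdims(b, -1, axis=2).shape)  # (2, 3, 1)
```

Unlike integer indexing, a slice does not raise IndexError for an out-of-range i; it silently returns an empty axis, so bounds may need to be checked separately.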
Andrew Schwartz
  • This is great. I just discovered this way of keeping dimensions myself and was going to suggest it, when I see you already posted it. I think this should be the top reply, rather than the ones above which don't really work. – Daniel Morris Apr 16 '23 at 13:22
5

To add to the solution involving indexing by lists or arrays by gnebehay, it is also possible to use tuples:

X[(10,),:]
leilu
1

This is especially annoying if you're indexing by an array that might be length 1 at runtime. For that case, there's np.ix_:

some_array[np.ix_(row_index,column_index)]
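
A sketch of the difference, with made-up index arrays: passing two index arrays directly pairs them up element by element, while np.ix_ builds an open mesh so every row is combined with every column, even when one array has length 1.

```python
import numpy as np

some_array = np.arange(1000).reshape(100, 10)
row_index = np.array([10])         # happens to have length 1 at runtime
column_index = np.array([0, 1, 2])

paired = some_array[row_index, column_index]        # indices broadcast together
grid = some_array[np.ix_(row_index, column_index)]  # rows x columns block

print(paired.shape)  # (3,)
print(grid.shape)    # (1, 3)
```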
Jthorpe
0

I've been using np.reshape to achieve the same as shown below

import numpy as np
X = np.zeros((100,10))
X.shape        # >> (100, 10)
xslice = X[10,:].reshape(1, -1)
xslice.shape   # >> (1, 10)
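
One detail worth knowing, shown as a sketch: for a contiguous row like this one, reshape returns a view rather than a copy, so writing to the reshaped row also changes X. np.expand_dims is included as an assumption about what an equivalent spelling might look like.

```python
import numpy as np

X = np.zeros((100, 10))

xslice = X[10, :].reshape(1, -1)
expanded = np.expand_dims(X[10, :], axis=0)  # same shape, different spelling

xslice[0, 0] = 5.0  # writes through to X because xslice is a view here

print(xslice.shape)    # (1, 10)
print(expanded.shape)  # (1, 10)
print(X[10, 0])        # 5.0
```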