Confusion with Fancy indexing (for non-fancy people)

Question

Let's assume a multi-dimensional array

import numpy as np
foo = np.random.rand(102,43,35,51)

I know that the last two dimensions represent a 2D space (35, 51), from which I would like to index a range of rows of one column. Let's say I want rows 8 to 30 of column 0. From my understanding of indexing, I should call

foo[0][0][8::30][0]

Knowing my data, though (unlike the random data used here), this is not what I expected.

I could try the following, which does work but looks ridiculous:

foo[0][0][[8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30],0]

Now from what I can find in this documentation I can also use something like:

foo[0][0][[8,30],0]

which only gives me the values of rows 8 and 30 while this:

foo[0][0][[8::30],0]

gives an error

File "<ipython-input-568-cc49fe1424d1>", line 1
foo[0][0][[8::30],0]
                ^
SyntaxError: invalid syntax

I don't understand why the :: argument cannot be passed here. What is then a way to indicate a range in your indexing syntax?

So I guess my overall question is what would be the proper pythonic equivalent of this syntax:

foo[0][0][[8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30],0]

It's not clear what you're trying to accomplish with `foo[0][0][[8::30],0]`. How 'bout `foo[0][0][8::30][0]`? — Matt Ball, Aug 31 '16 at 14:32
Also note that `::` is part of the extended slice syntax, no need to keep calling it "fancy indexing." `:)` http://stackoverflow.com/questions/509211/explain-pythons-slice-notation — Matt Ball, Aug 31 '16 at 14:35
@MattBall tried foo[0][0][8::30][0] but what i get seems to not be what i expect. From the values of my dataset the result seems like a row slicing — Red Sparrow, Aug 31 '16 at 14:48
@MattBall you are right about slice syntax. Text is now updated. I still don't understand why foo[0][0][8::30][0] doesn't return what I expect — Red Sparrow, Aug 31 '16 at 15:00
Slice syntax is `start:stop:step`, so according to your description you need `8:30`, not `8::30`, which means every 30-th item, starting at the 8-th, all the way to the end. — Jaime, Aug 31 '16 at 15:08
@Divakar that worked indeed as mentioned also by Ophir Carmi before — Red Sparrow, Sep 01 '16 at 08:04

hpaulj · Accepted Answer · 2016-08-31T16:54:03.660

Instead of

foo[0][0][8::30][0]

try

foo[0, 0, 8:30, 0]

The foo[0][0] part is the same as foo[0, 0, :, :], selecting a 2d array (35 x 51). But foo[0][0][8::30] selects a subset of those rows.

Consider what happens when I use 0::30 on a 2d array:

In [490]: np.zeros((35,51))[0::30].shape
Out[490]: (2, 51)

In [491]: np.arange(35)[0::30]
Out[491]: array([ 0, 30])

The 30 is the step, not the stop value of the slice.

The last [0] then picks the first of those rows. With 8::30 the only selected row is row 8 (the next would be 38, past the end), so the end result is the same as foo[0,0,8,:].

It is better, in most cases, to index multiple dimensions with the comma syntax. And if you want the first 30 rows use 0:30, not 0::30 (that's basic slicing notation, applicable to lists as well as arrays).
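As a minimal sketch of the difference, using the random array from the question: the chained form with 8::30 ends up returning a whole row, while the comma form with a start:stop slice returns the column values. Note that the stop value is exclusive, so this sketch assumes 8:31 is what is wanted if row 30 must be included, as in the explicit list in the question.

```python
import numpy as np

foo = np.random.rand(102, 43, 35, 51)

# Chained indexing: 8::30 means "start at 8, step 30", so only row 8 is kept;
# the trailing [0] then returns that whole row (51 values).
chained = foo[0][0][8::30][0]
print(chained.shape)  # (51,)

# Comma syntax with start:stop. The stop is exclusive, so 8:31 covers rows 8..30.
wanted = foo[0, 0, 8:31, 0]
print(wanted.shape)  # (23,)

# Same values as spelling out every row index by hand.
explicit = foo[0][0][list(range(8, 31)), 0]
print(np.array_equal(wanted, explicit))  # True
```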

As for:

foo[0][0][[8::30],0]

simplify it to x[[8::30], 0]. The Python interpreter accepts x[1:2:3, 0], translating the index to the tuple (slice(1,2,3), 0) and passing it to a __getitem__ method. But the colon syntax is accepted only in a very specific context: directly inside the indexing brackets. The interpreter treats that inner set of brackets as a list, and colons are not accepted there.

foo[0,0,[1,2,3],0]

is ok, because the inner brackets are a list, and the numpy getitem can handle those.
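To see what the interpreter actually hands to __getitem__, a small sketch can help. ShowKey below is a hypothetical helper class written only for this illustration (it is not part of numpy); it just returns whatever key it receives.

```python
class ShowKey:
    """Hypothetical helper: returns the key that the indexing brackets produce."""

    def __getitem__(self, key):
        return key


x = ShowKey()

# Colons directly inside the indexing brackets become slice objects.
print(x[1:2:3, 0])      # (slice(1, 2, 3), 0)
print(x[8::30])         # slice(8, None, 30)

# Inner brackets are an ordinary list literal, passed through unchanged.
print(x[[1, 2, 3], 0])  # ([1, 2, 3], 0)

# x[[8::30], 0] cannot be written at all: a list literal may not contain a
# colon, so Python raises SyntaxError before any __getitem__ is called.
```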

numpy has a tool for converting a slice notation into a list of numbers. Play with that if it is still confusing:

In [495]: np.r_[8:30]
Out[495]: 
array([ 8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
       25, 26, 27, 28, 29])
In [496]: np.r_[8::30]
Out[496]: array([0])
In [497]: np.r_[8:30:2]
Out[497]: array([ 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])
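If an explicit index array is really wanted, the np.r_ result can be used inside the index itself. This is only a sketch on the random array from the question, again assuming 8:31 so that row 30 is included; it gives the same values as the plain slice.

```python
import numpy as np

foo = np.random.rand(102, 43, 35, 51)

# np.r_ expands the slice notation into an explicit array of row numbers.
rows = np.r_[8:31]
print(rows[0], rows[-1], len(rows))  # 8 30 23

# Indexing with that array matches indexing with the slice directly.
via_array = foo[0, 0, rows, 0]
via_slice = foo[0, 0, 8:31, 0]
print(np.array_equal(via_array, via_slice))  # True
```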

Thanks for the detailed explanation! Really helped to understand this, especially with the `np.r_` function — Red Sparrow, Sep 01 '16 at 08:20
