In the docs, it says (emphasis mine):

Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean.

<snip>

Also recognize that x[[1,2,3]] will trigger advanced indexing, whereas x[[1,2,slice(None)]] will trigger basic slicing.

I know why x[(1, 2, slice(None))] triggers basic slicing. But why does x[[1,2,slice(None)]] trigger basic slicing, when [1,2,slice(None)] meets the condition of being a non-tuple sequence?


On a related note, why does the following occur?

>>> a = np.eye(4)
>>> a[(1, 2)]  # basic indexing, as expected
0.0
>>> a[(1, np.array(2))] # basic indexing, as expected
0.0

>>> a[[1, 2]]  # advanced indexing, as expected
array([[ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.]])
>>> a[[1, np.array(2)]]  # basic indexing!!??
0.0
Eric
  • No doubt the secret lies [somewhere in the source](https://github.com/numpy/numpy/blob/7ccf0e08917d27bc0eba34013c1822b00a66ca6d/numpy/core/src/multiarray/mapping.c#L166-L721) – Eric Nov 14 '16 at 22:46
  • In the current documentation, that last quoted line has been modified to reference legacy code that's in the process of deprecation. https://numpy.org/doc/stable/user/basics.indexing.html#slicing-and-striding. – hpaulj Mar 16 '22 at 20:28

3 Answers

There's an exception to that rule. The Advanced Indexing documentation section doesn't mention it, but up above, near the start of the Basic Slicing and Indexing section, you'll see the following text:

In order to remain backward compatible with a common usage in Numeric, basic slicing is also initiated if the selection object is any non-ndarray sequence (such as a list) containing slice objects, the Ellipsis object, or the newaxis object, but not for integer arrays or other embedded sequences.


a[[1, np.array(2)]] doesn't quite trigger basic indexing. It triggers an undocumented part of the backward compatibility logic, as described in a comment in the source code:

    /*
     * Sequences < NPY_MAXDIMS with any slice objects
     * or newaxis, Ellipsis or other arrays or sequences
     * embedded, are considered equivalent to an indexing
     * tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
     */

The np.array(2) inside the list causes the list to be treated as if it were a tuple, but the result, a[(1, np.array(2))], is still an advanced indexing operation. It ends up applying the 1 and the 2 to separate axes, unlike a[[1, 2]], and the result ends up looking identical to a[1, 2], but if you try it with a 3D a, it produces a copy instead of a view.
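A minimal sketch of that view-versus-copy difference, assuming a 3D array built with np.arange (the array and the variable names are mine, not from the question). It only uses explicit tuple indices, so it should not depend on how a given NumPy version treats a bare list:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

basic = a[1, 2]               # integers only: basic indexing
advanced = a[1, np.array(2)]  # 0-d integer array in the tuple: advanced indexing

# The values are the same either way...
print(np.array_equal(basic, advanced))

# ...but only the basic result shares memory with `a`.
print(np.shares_memory(a, basic))
print(np.shares_memory(a, advanced))
```

Checking with np.shares_memory is more reliable than comparing printed output, since the two results look identical.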

user2357112
  • This documentation is still incomplete though, because `a[[1, np.array(2)]]` triggers basic indexing – Eric Nov 15 '16 at 00:14
  • @Eric: No it doesn't; it triggers advanced indexing. The results happen to look the same if `a` is 2D, but try it with a 3D `a`, and you'll find that the result is a copy instead of a view. – user2357112 Nov 15 '16 at 00:17
  • What I mean is that `a[[1, np.array(2)]]` is different to `a[[1, np.array(2)],]` (for `a` 2d) - the former is the same as `a[(1, 2)]`, the latter is `a[[1, 2]]`. So isn't the former basic indexing? Or is there some undocumented middle ground? – Eric Nov 15 '16 at 00:20
  • No seriously: https://ideone.com/3jXIjz. Just because you don't think it should be different doesn't mean that it is! – Eric Nov 15 '16 at 00:25
  • wait, you're right, that is a weird interaction. I was thinking of the `1` and the `np.array(2)` as advanced indexing on separate axes, but they should be along the same axis. – user2357112 Nov 15 '16 at 00:32
  • I don't think passing indices in this format is officially supported, what with the array inside a list, but I'm not sure exactly what's happening. It seems to be treating the outer list as a tuple, as in the basic indexing backward compatibility special case, and then performing advanced indexing anyway. – user2357112 Nov 15 '16 at 00:37
  • The comment in the source pointed out by hpaulj seems to indicate that this is indeed the case. The array inside the list causes the list to be treated as a tuple, but the result is still advanced indexing due to the array. This particular special case does not seem to be documented. – user2357112 Nov 15 '16 at 00:41
  • _"it produces a copy instead of a view"_ - I wonder whether this could be considered a bug? It's normally desirable to produce a view wherever possible, right? – Eric Nov 15 '16 at 01:08
  • @Eric: I don't think so. The comment makes it clear that the list->tuple conversion was supposed to occur even in cases where the result would be advanced indexing instead of basic indexing; there's no interpretation where `a[[1, 2], [3, 4]]` could be basic indexing. – user2357112 Nov 15 '16 at 01:11
  • It's generally desirable to produce a view instead of a copy for operations that can do so consistently, but even though some advanced indexing operations could be implemented to return a view, they always make a copy for consistency's sake. – user2357112 Nov 15 '16 at 01:12
With a dummy class I can determine how the interpreter translates [...] into calls to __getitem__.

In [1073]: class Foo():
      ...:     def __getitem__(self, idx):
      ...:         print(idx)
In [1080]: Foo()[1,2,slice(None)]
(1, 2, slice(None, None, None))
In [1081]: Foo()[(1,2,slice(None))]
(1, 2, slice(None, None, None))
In [1082]: Foo()[[1,2,slice(None)]]
[1, 2, slice(None, None, None)]

So wrapping multiple terms with () makes no difference - it gets a tuple in both cases. And a list is passed as a list.
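The same experiment can be written so that the index is returned instead of printed, which makes the comparison checkable. This is just a sketch; `Recorder` is a hypothetical name of mine, not something from NumPy:

```python
class Recorder:
    """Hypothetical helper: hands back whatever the interpreter passes in."""

    def __getitem__(self, idx):
        return idx


r = Recorder()

bare = r[1, 2, :]                   # commas inside [] build a tuple
wrapped = r[(1, 2, slice(None))]    # explicit parentheses: the same tuple
as_list = r[[1, 2, slice(None)]]    # a list stays a list

print(type(bare).__name__, type(wrapped).__name__, type(as_list).__name__)
print(bare == wrapped)
```

So whatever distinction exists between the tuple and list forms is made by the object's own `__getitem__`, not by the interpreter.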

So the distinction between tuple and list (or not) must be coded in numpy source code - which is compiled. So I can't readily study it.

With a 1d array, indexing with a list produces advanced indexing, picking specific values:

In [1085]: arr[[1,2,3]]
Out[1085]: array([ 0.73703368,  0.        ,  0.        ])

but replacing one of those values with a tuple, or a slice:

In [1086]: arr[[1,2,(2,3)]]
IndexError: too many indices for array

In [1088]: arr[[1,2,slice(None)]] 
IndexError: too many indices for array

and the list is treated as a tuple - it tries matching values with dimensions.

So at the top level a list and a tuple are treated the same, if the list can't be interpreted as an advanced indexing list.

Notice also a difference with single-item lists:

In [1089]: arr[[1]]
Out[1089]: array([ 0.73703368])
In [1090]: arr[(1,)]
Out[1090]: 0.73703367969998546
In [1091]: arr[1]
Out[1091]: 0.73703367969998546
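To make the single-item difference explicit, here is a small sketch with a stand-in 1d array (the values are arbitrary, not the `arr` from my session):

```python
import numpy as np

arr = np.array([10.0, 20.0, 30.0, 40.0])  # stand-in data, chosen arbitrarily

from_list = arr[[1]]    # list index: advanced indexing, keeps a dimension
from_tuple = arr[(1,)]  # one-element tuple: same as arr[1], a scalar

print(from_list.shape)
print(np.ndim(from_tuple))
```

The list form always returns an array, while the tuple form behaves exactly like the plain integer index.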

Some functions like np.apply_along_axis and np.apply_over_axes build an index as a list or array, and then apply it. They work with a list or array because it is mutable. Some then wrap it in a tuple before using it as an index; others don't bother. That difference sort of bothered me, but these test cases indicate that the tuple wrapper is often optional.

In [1092]: idx=[1,2,slice(None)]
In [1093]: np.ones((2,3,4))[idx]
Out[1093]: array([ 1.,  1.,  1.,  1.])
In [1094]: np.ones((2,3,4))[tuple(idx)]
Out[1094]: array([ 1.,  1.,  1.,  1.])

Looks like the tuple wrapper is still needed if I build the index as an object array:

In [1096]: np.ones((2,3,4))[np.array(idx)]
...
IndexError: arrays used as indices must be of integer (or boolean) type
In [1097]: np.ones((2,3,4))[tuple(np.array(idx))]
Out[1097]: array([ 1.,  1.,  1.,  1.])
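The pattern I'd suggest, sketched under the assumption that you want to fix some axes and slice the rest, is to build the index as a mutable list and convert it with tuple() at the point of use. The array here is a stand-in built with np.arange:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Start with a full slice on every axis, then fix the axes we care about.
idx = [slice(None)] * arr.ndim
idx[0] = 1
idx[1] = 2

# Convert to a tuple only when indexing, so each entry maps to one axis.
result = arr[tuple(idx)]
print(result)
```

With the explicit tuple there is no ambiguity about whether the list is meant as one index per axis or as an array of indices along the first axis.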

===================

Comment from the function @Eric linked

    /*
     * Sequences < NPY_MAXDIMS with any slice objects
     * or newaxis, Ellipsis or other arrays or sequences
     * embedded, are considered equivalent to an indexing
     * tuple. (`a[[[1,2], [3,4]]] == a[[1,2], [3,4]]`)
     */

===================

This function wraps object arrays and lists in tuple for indexing:

def apply_along_axis(func1d, axis, arr, *args, **kwargs):
     ....
     ind = [0]*(nd-1)
     i = zeros(nd, 'O')
     ....
     res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
     outarr[tuple(ind)] = res

update

Now this list indexing produces a FutureWarning:

In [113]: arr.shape
Out[113]: (2, 3, 4)
In [114]: arr[[1, 2, slice(None)]]
<ipython-input-114-f30c20184e42>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  arr[[1, 2, slice(None)]]
Out[114]: array([20, 21, 22, 23])

Changing the list to a tuple produces the same thing, without the warning:

In [115]: arr[(1, 2, slice(None))]
Out[115]: array([20, 21, 22, 23])

This is the same thing as:

In [116]: arr[1, 2, :]
Out[116]: array([20, 21, 22, 23])

Indexing with commas creates a tuple which is passed to the __getitem__ method.

The Warning says that in the future it will try to turn the list into an array instead of a tuple:

In [117]: arr[np.array([1, 2, slice(None)])]
Traceback (most recent call last):
  Input In [117] in <module>
    arr[np.array([1, 2, slice(None)])]
IndexError: arrays used as indices must be of integer (or boolean) type

But with the slice object this raises an error. In that sense the arr[tuple([....])] interpretation is the only thing that makes sense. But it's a legacy case, left over from the earlier Numeric package.

Fortunately it's unlikely that a novice programmer will try this. They may try arr[[1,2,:]], but that will give a syntax error. : is only allowed in indexing brackets, not in list brackets (or tuple () either).
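If you do need to store an index that contains `:` in a variable, np.s_ builds the tuple for you. A short sketch, again with a stand-in array; the compile() call is only there to show the syntax error without stopping the script:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# `:` is only valid directly inside indexing brackets, not inside a list.
try:
    compile("arr[[1, 2, :]]", "<example>", "eval")
    list_colon_ok = True
except SyntaxError:
    list_colon_ok = False
print(list_colon_ok)

# np.s_ lets you write slice syntax and get back an ordinary index tuple.
idx = np.s_[1, 2, :]
print(idx)
print(arr[idx])
```

The result of np.s_ is a plain tuple, so it takes the same path as writing arr[1, 2, :] directly.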

This current round of comments was triggered by a different case that produces the FutureWarning:

In [123]: arr[[[0, 1], [1, 0]]]
<ipython-input-123-4fa43c8569dd>:1: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  arr[[[0, 1], [1, 0]]]
Out[123]: 
array([[ 4,  5,  6,  7],
       [12, 13, 14, 15]])

Here the nested list is interpreted as a tuple of lists, or even:

In [124]: arr[[0, 1], [1, 0]]
Out[124]: 
array([[ 4,  5,  6,  7],
       [12, 13, 14, 15]])
In [126]: arr[np.array([[0, 1], [1, 0]])].shape
Out[126]: (2, 2, 3, 4)

Same warning, but it isn't quite as obvious why the legacy code chose to take the tuple interpretation. I don't see it documented.
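Since the bare nested list can mean different things depending on the NumPy version, it seems safest to spell out which interpretation you want. A sketch of the two explicit forms named in the warning text, using the same shape as above:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)
seq = [[0, 1], [1, 0]]

# Tuple interpretation: one index list per axis, i.e. arr[[0, 1], [1, 0]].
as_tuple = arr[tuple(seq)]

# Array interpretation: a 2x2 integer array indexing the first axis only.
as_array = arr[np.array(seq)]

print(as_tuple.shape)
print(as_array.shape)
```

Neither form goes through the legacy list handling, so neither should produce the warning.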

hpaulj
  • _"must coded in numpy source code - which is compiled. So I can't readily study it."_ - sure you can - I linked to the relevant function in a comment under my question ;) – Eric Nov 15 '16 at 00:16
  • I did qualify it with `readily`. This function isn't as hard to decipher as some of the numpy C, but still not as testable as a Python function. – hpaulj Nov 15 '16 at 00:36
  • Fair. I'm trying to expose this in a python function in [numpy/numpy#8276](https://github.com/numpy/numpy/pull/8276), since parsing these weird corner cases is vital for any kind of `__getitem__` forwarding – Eric Nov 15 '16 at 00:38
  • Please correct me if [my answer](https://stackoverflow.com/a/71500679/5290519) on this same page is incorrect. – NeoZoom.lua Mar 16 '22 at 16:21
So this is my conclusion:

  1. The [1,2] is a 1d list, and in this case advanced indexing is triggered. So a[[1,2]] has the same result as a[[1,2],].
  2. The [1, np.array(2)] is (treated as) a 2d list, even though np.array(2) is zero-dimensional. So a[[1, np.array(2)]] has the same result as a[tuple([1, np.array(2)])] and thus a[1, 2], which gives the result 0.0.
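A sketch checking both points on the np.eye(4) array from the question. For the second point I only use the explicit tuple form, because how the bare list a[[1, np.array(2)]] is handled may depend on the NumPy version:

```python
import numpy as np

a = np.eye(4)

# Point 1: a flat list of integers selects rows, with or without the
# trailing comma that wraps it in a one-element tuple.
rows = a[[1, 2]]
rows_tuple = a[[1, 2],]
print(np.array_equal(rows, rows_tuple))

# Point 2: the explicit tuple applies one index per axis, like a[1, 2].
element = a[tuple([1, np.array(2)])]
print(element == a[1, 2])
```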
NeoZoom.lua
  • I don't think we need to explain why `x[[1,2,slice(None)]]` is/was treated as `x[tuple([1,2,slice(None)])]`. We just need to be aware of the issue, and avoid it where possible. – hpaulj Mar 16 '22 at 20:29