5

We noticed that mixing fancy indexing and slicing on multi-dimensional arrays is confusing and appears to be undocumented, for example:

In [114]: x = np.arange(720).reshape((2,3,4,5,6))

In [115]: x[:,:,:,0,[0,1,2,4,5]].shape
Out[115]: (2, 3, 4, 5)

In [116]: x[:,:,0,:,[0,1,2,4,5]].shape
Out[116]: (5, 2, 3, 5)

I have read the section on fancy indexing at https://numpy.org/doc/stable/user/basics.indexing.html and I understand that x[:,0,:,[1,2]] = [x[:,0,:,1], x[:,0,:,2]]. However, I cannot understand why the results of Input [115] and Input [116] above differ in the first dimension. Can someone point to where such broadcasting rules are documented?

Thanks!

I have tried searching the documentation for fancy indexing as well as posting issues to the numpy repo on Github.

ShadowCrafter_01
sighingnow
  • Because of improvements in documentation we should resist efforts to mark this question as a duplicate. – hpaulj Jun 14 '23 at 09:56

2 Answers

3

There are two parts to the indexing operation, the subspace defined by the basic indexing (excluding integers) and the subspace from the advanced indexing part. Two cases of index combination need to be distinguished:

  • The advanced indices are separated by a slice, Ellipsis or newaxis. For example x[arr1, :, arr2].
  • The advanced indices are all next to each other. For example x[..., arr1, arr2, :] but not x[arr1, :, 1] since 1 is an advanced index in this regard.

In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that. In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).

From https://numpy.org/doc/stable/user/basics.indexing.html#combining-advanced-and-basic-indexing

In your first example, the advanced indices (the integer 0 and the list) sit in the fourth and fifth positions, so they are next to each other. In your second example, a slice (basic indexing) separates the two advanced indices.
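As a minimal sketch of the two cases, using the array from the question (the names `idx`, `adjacent` and `separated` are only illustrative), the shapes and the element layout can be checked like this:

```python
import numpy as np

x = np.arange(720).reshape((2, 3, 4, 5, 6))
idx = [0, 1, 2, 4, 5]

# Advanced indices (0 and idx) are adjacent: the broadcast dimension
# takes the place of axes 3 and 4.
adjacent = x[:, :, :, 0, idx]
print(adjacent.shape)   # (2, 3, 4, 5)

# A slice sits between 0 and idx: the broadcast dimension goes first,
# followed by the sliced dimensions.
separated = x[:, :, 0, :, idx]
print(separated.shape)  # (5, 2, 3, 5)

# Element-level view of both layouts (a, b, c, k are arbitrary positions).
a, b, c, k = 1, 2, 3, 4
assert adjacent[a, b, c, k] == x[a, b, c, 0, idx[k]]
assert separated[k, a, b, c] == x[a, b, 0, c, idx[k]]
```

In both results the axis of length 5 that comes from `idx` is the same data; only its position in the result differs.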

rochard4u
  • 1
    Thanks for pointing to the documentation. By the way, do you know why NumPy has this somewhat counter-intuitive behavior? I have tried PyTorch, and it behaves differently from NumPy (and is easier to understand). – sighingnow Jun 14 '23 at 09:34
  • The documentation on mixed advanced and basic indexing is new since I last tried to answer this. In the past it just said that, due to ambiguity, the slice dimensions are placed last. – hpaulj Jun 14 '23 at 09:50
  • "because there is no unambiguous place to drop in the indexing subspace, thus it is tacked-on to the beginning. It is always possible to use .transpose() to move the subspace anywhere desired.". There is apparently an ambiguity in the process, but I cannot locate it. – rochard4u Jun 14 '23 at 12:28
  • @rochard4u, the ambiguity is more apparent when the slice separates advanced indexing arrays. – hpaulj Jun 14 '23 at 13:44
2

Some additional insight into why there is ambiguity:

In the latter case in the question, the 3rd and 5th axes are indexed with advanced indices, and thus disappear from the new array. A new axis (with shape equal to the broadcast shape of the indices) has to be added somewhere. If I were numpy, and had to insert a shape (5,) axis into the array with "shape" (2, 3, -, 5, -), would I place it in place of the first missing dimension? Or the second?
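To make the "broadcasting of the indices" concrete, here is a small sketch. The 2-D index `col` is a hypothetical addition for illustration and is not part of the question:

```python
import numpy as np

x = np.arange(720).reshape((2, 3, 4, 5, 6))
idx = np.array([0, 1, 2, 4, 5])

# The integer 0 and idx broadcast together like ordinary arrays.
print(np.broadcast(0, idx).shape)    # (5,)

# Hypothetical 2-D index for axis 2, shape (2, 1).
col = np.array([[0], [2]])
print(np.broadcast(col, idx).shape)  # (2, 5)

# Separated by a slice, so the whole broadcast shape (2, 5) is placed
# in front of the sliced dimensions (2, 3, 5).
print(x[:, :, col, :, idx].shape)    # (2, 5, 2, 3, 5)
```

With more than one broadcast dimension, the question of where to put them between the two removed axes becomes even less obvious.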

Exactly because a slice separates the two advanced indices, numpy cannot replace a consecutive set of axes, and thus does not know whether to insert the new axis before or after the separating slice(s). As a result, the new axis is inserted at the front:

(5, 2, 3, 5)
 ^  ^^^^^^^--- old dimensions
 |
new dimension

Only in the first case, where the disappearing axes are all adjacent ("shape" (2, 3, 4, -, -)), can the new axes be unambiguously inserted at the end.

Note: Behind the scenes numpy always inserts the new axes at the start. It just (mostly for convenience I believe) transposes the array to move the new axes into place when unambiguous.
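If the front position is not what you want, the axis can be moved afterwards. A minimal sketch, assuming the goal is the layout (2, 3, 5, 5) with the indexed axis last (the variable names are illustrative):

```python
import numpy as np

x = np.arange(720).reshape((2, 3, 4, 5, 6))
idx = [0, 1, 2, 4, 5]

front = x[:, :, 0, :, idx]            # (5, 2, 3, 5), new axis first

# Option 1: move the new axis to the end after indexing.
moved = np.moveaxis(front, 0, -1)     # (2, 3, 5, 5)

# Option 2: the same reordering with an explicit transpose.
transposed = front.transpose(1, 2, 3, 0)

# Option 3: index in two steps, so each step has only one advanced index
# and nothing is separated by a slice.
two_step = x[:, :, 0][..., idx]       # (2, 3, 5, 5)

print(np.array_equal(moved, transposed))  # True
print(np.array_equal(moved, two_step))    # True
```

The two-step form is often the easiest to read, since the integer index is applied on its own first and the list index then behaves like a plain slice replacement.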

Also interesting is NumPy Enhancement Proposal 21 (NEP 21).

Chrysophylaxs