Behavior of np.take with Boolean arrays

Question

np.take takes elements from an array along an axis. According to the documentation, when an axis is specified it does the same thing as "fancy" indexing (indexing arrays using arrays); without an axis argument, the array a is flattened and the indices are taken from the flattened array. However, the documentation says nothing about boolean index arrays, so the behavior with them is unspecified: https://numpy.org/doc/stable/reference/generated/numpy.take.html

The code in question is the following:

Input:

a = np.array([2, 3])
b = np.array([[False,  True],
              [ True, False],
              [False,  True],
              [ True, False]])

a.take(b)

Output:

array([[2, 3],
       [3, 2],
       [2, 3],
       [3, 2]])

In this code, why are the two values swapped for a row of [True, False] but left in order for a row of [False, True]? When I instead use b directly as a boolean index, I get an error:

Input:

a = np.array([2, 3])
b = np.array([[False,  True],
              [ True, False],
              [False,  True],
              [ True, False]])

a[b]

Output:

IndexError                                Traceback (most recent call last)
<ipython-input-6-8b64b196a893> in <cell line: 7>()
      5               [ True, False]])
      6 
----> 7 a[b]

IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

This error makes sense to me because the index array (b) has two dimensions while the array a has only one. So what is np.take doing so that it just switches the columns in each row?

I believe `take` doesn't implement boolean indexing. The first example is treated as integer array indexing, along the lines of `a[b.astype(int)]` — Brian61354270, Aug 21 '23 at 16:23
Hello Brian. Oh, so you are saying that the b array would basically be treated as b = np.array([[0, 1], [1, 0], [0, 1], [1, 0]]) and take would use those values as indices? — Karl Gardner, Aug 21 '23 at 18:08
It sure looks that way, doesn't it? The returned values are consistent with alternating 0/1 indexing. — hpaulj, Aug 21 '23 at 18:53
`take` is most useful when the array has many (>2) dimensions and you want to index without an awkward mix of slices and ellipses, or when you want to take advantage of the wrap or clip modes. In sum, it suits code-generated indices. — hpaulj, Aug 21 '23 at 18:57
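
To illustrate the last comment, here is a minimal sketch of selecting along one axis of a 3-D array with np.take versus the equivalent slice-based indexing. The array x and the index list idx are made up for illustration and do not come from the question.

```python
import numpy as np

# Hypothetical 3-D array, used only for illustration
x = np.arange(24).reshape(2, 3, 4)
idx = [0, 2]

# Pick positions 0 and 2 along the last axis
via_take = np.take(x, idx, axis=2)
via_index = x[:, :, idx]  # same selection written with slices

print(np.array_equal(via_take, via_index))
print(via_take.shape)
```

With take, the axis is an ordinary argument, so it can be chosen at run time without building a tuple of slices.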

score 1 · Answer 1 · answered Aug 22 '23 at 08:13

The documentation mentions that "If indices is not one dimensional, the output also has these dimensions." So the output has the same shape as indices.
The True and False values in array b are treated as 1 and 0 respectively, i.e. as the indices 1 and 0. If I replace them with these integers, I get the same output.

import numpy as np
a = np.array([2, 3])
#b = np.array([[False,  True],
#              [ True, False],
#              [False,  True],
#              [ True, False]])
b = np.array([[0, 1],
              [1, 0],
              [0, 1],
              [1, 0]])

c = a.take(b)
print(c)

output

[[2 3]
 [3 2]
 [2 3]
 [3 2]]
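
As a quick check of this explanation, the sketch below compares a.take(b) with integer-array indexing after an explicit cast, and contrasts it with a genuine boolean mask. The 1-D mask and the use of np.flatnonzero are additions for illustration and are not part of the original question.

```python
import numpy as np

a = np.array([2, 3])
b = np.array([[False, True],
              [True, False],
              [False, True],
              [True, False]])

# take() reads the booleans as the integer indices 0 and 1
taken = a.take(b)
as_int = a[b.astype(int)]
print(np.array_equal(taken, as_int))  # True
print(taken.shape == b.shape)         # True

# A genuine boolean mask has the same shape as `a` (illustrative mask)
mask = np.array([True, False])
print(a[mask])                        # [2]   keeps elements where mask is True
print(a.take(mask))                   # [3 2] reads the mask as indices 1, 0
print(a.take(np.flatnonzero(mask)))   # [2]   mask converted to indices first
```

So a[mask] filters, while a.take(mask) looks up positions 1 and 0. To get mask-like behavior from take, convert the mask to integer positions first.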

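If I add an out-of-range value to b, say 3, I get 'IndexError: index 3 is out of bounds for axis 0 with size 2', because the size of a is 2. Mixing booleans with an integer makes NumPy build an integer array, so b below is [[1, 1], [0, 3]].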

import numpy as np
a = np.array([2, 3])
b = np.array([[True, True],
              [False, 3]])

c = a.take(b)
print(c)

output

    c = a.take(b)
IndexError: index 3 is out of bounds for axis 0 with size 2
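
If out-of-range indices are expected, the mode parameter of take changes how they are handled instead of raising. The sketch below uses an illustrative index array idx with the out-of-bounds value 4 (not taken from the question), chosen so that the two modes give different results.

```python
import numpy as np

a = np.array([2, 3])
idx = np.array([[1, 1],
                [0, 4]])  # 4 is out of bounds for an array of size 2

wrapped = a.take(idx, mode="wrap")  # 4 wraps around to 4 % 2 == 0
clipped = a.take(idx, mode="clip")  # 4 is clipped to the last index, 1

print(wrapped)
print(clipped)
```

The default, mode="raise", produces the IndexError shown above.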
