How to explain the logic behind this NumPy output?

Question

I am learning NumPy from the quickstart page: https://numpy.org/devdocs/user/quickstart.html. One part confuses me and I am stuck on it.

>>> a = np.arange(12).reshape(3, 4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> b1 = np.array([False, True, True])         # first dim selection
>>> b2 = np.array([True, False, True, False])  # second dim selection
>>> a[b1, b2]
array([ 4, 10])

Could you please provide any hints or explanations to help me understand this logic? The output that I expect is

array([[ 4,  6],
       [ 8, 10]])

Same as `a[[1,2], [0,2]]`. It uses `np.nonzero(b1)[0]` and `np.nonzero(b2)[0]` as the indices. — hpaulj, Jun 17 '23 at 00:50
To get the block, use `a[[[1],[2]], [0,2]]`. Think about how the arrays/lists `broadcast` together. — hpaulj, Jun 17 '23 at 00:53

Thomas Wagenaar · Accepted Answer · 2023-06-17T19:09:40.927

An easy way to achieve what you want is to apply your masks sequentially. In other words,

a[b1,:][:,b2]

# [[ 4  6]
# [ 8 10]]
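As a cross-check on the sequential approach above, here is a minimal sketch that compares it with `np.ix_`, which builds an open mesh from the two masks. The variable names `sequential` and `open_mesh` are only illustrative and do not come from the question.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])         # row mask
b2 = np.array([True, False, True, False])  # column mask

# Apply the masks one axis at a time: rows first, then columns.
sequential = a[b1, :][:, b2]

# np.ix_ turns the two masks into index arrays shaped (2, 1) and (1, 2),
# which broadcast to select the full 2x2 block in a single indexing step.
open_mesh = a[np.ix_(b1, b2)]

print(sequential)
print(np.array_equal(sequential, open_mesh))
```

Both forms should select the same 2x2 block. The `np.ix_` version does it in one indexing operation instead of two.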

It seems like you are assuming that b1 determines which rows are selected and b2 determines which columns are selected, independently of each other. This is a fair assumption, but NumPy does not combine boolean masks that way. When you pass both masks in the same index, each one is first converted to the integer indices of its True entries, and those index arrays are then paired element by element. So

b1 = np.array([False, True, True]) -> [1,2]
b2 = np.array([True, False, True, False]) -> [0,2]

Which indeed gives the "weird" behavior:

b1 = [1,2]
b2 = [0,2]

a[b1,b2] # [ 4 10], i.e. a[1,0] and a[2,2]
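The conversion and pairing described above can be checked with a short sketch. It assumes the `a`, `b1` and `b2` from the question. The names `rows`, `cols`, `paired`, `by_hand` and `block` are introduced here for illustration only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

# Each boolean mask becomes the integer indices of its True entries.
rows = np.nonzero(b1)[0]   # array([1, 2])
cols = np.nonzero(b2)[0]   # array([0, 2])

# Two index arrays of equal shape are paired element by element,
# so this picks a[1, 0] and a[2, 2].
paired = a[rows, cols]
by_hand = np.array([a[r, c] for r, c in zip(rows, cols)])

# Making the row indices a column vector, shape (2, 1), lets them
# broadcast against cols, shape (2,), which yields the 2x2 block.
block = a[rows[:, None], cols]

print(paired)   # [ 4 10]
print(block)    # [[ 4  6]
                #  [ 8 10]]
```

This is the same idea as the `a[[[1],[2]], [0,2]]` form from the comments: the result has the shape that the index arrays broadcast to.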
