How exactly does numpy.where() select the elements in this example?

Question

From numpy docs

>>> np.where([[True, False], [True, True]],
...          [[1, 2], [3, 4]],
...          [[9, 8], [7, 6]])
array([[1, 8],
       [3, 4]])

Am I right in assuming that [[True, False], [True, True]] is the condition, and that [[1, 2], [3, 4]] and [[9, 8], [7, 6]] are x and y respectively, as described under the parameters in the docs?

Then how exactly is the function choosing the elements in the following examples?

Also, why is the element type in these examples a list?

>>> np.where([[True, False,True], [False, True]], [[1, 2,56], [3, 4]], [[9, 8,79], [7, 6]])
array([list([1, 2, 56]), list([3, 4])], dtype=object)
>>> np.where([[False, False,True,True], [False, True]], [[1, 2,56,69], [3, 4]], [[9, 8,90,100], [7, 6]])
array([list([1, 2, 56, 69]), list([3, 4])], dtype=object)

Your next question suggests that these answers don't satisfy you. You may need to explain what is confusing. For now, stay away from that 2nd example; it will only confuse you. — hpaulj, Feb 03 '19 at 04:30

hpaulj · Answer 1 · 2019-02-03T04:42:08.457

In the first case, each argument is a (2,2) array (or rather a list that can be made into such an array). For each True in the condition, it returns the corresponding element of x, that is [[1, -], [3, 4]], and for each False, the corresponding element of y, that is [[-, 8], [-, -]].
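As a rough illustration of that element-by-element rule, here is a sketch in plain Python. It is not how NumPy implements where internally, and the name picked is only illustrative.

```python
import numpy as np

condition = [[True, False], [True, True]]
x = [[1, 2], [3, 4]]
y = [[9, 8], [7, 6]]

# Pick x[i][j] where condition[i][j] is True, otherwise y[i][j].
picked = [
    [xv if c else yv for c, xv, yv in zip(c_row, x_row, y_row)]
    for c_row, x_row, y_row in zip(condition, x, y)
]

print(picked)                      # [[1, 8], [3, 4]]
print(np.where(condition, x, y))   # same values, as a (2, 2) array
```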

In the second case, the lists are ragged:

In [1]: [[True, False,True], [False, True]]
Out[1]: [[True, False, True], [False, True]]
In [2]: np.array([[True, False,True], [False, True]])
Out[2]: array([list([True, False, True]), list([False, True])], dtype=object)

The array has shape (2,) and holds 2 lists. When cast to boolean it becomes a 2-element array with both elements True, because a non-empty list is truthy. Only an empty list would produce False.

In [3]: _.astype(bool)
Out[3]: array([ True,  True])

The where then returns just the x values.

This second case is understandable, but pathological.
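The second case can be reproduced step by step with the sketch below. It assumes a recent NumPy, where ragged nested lists have to be given dtype=object explicitly instead of being turned into object arrays implicitly as in the session above. The variable names are illustrative.

```python
import numpy as np

# Ragged inputs: each becomes a 1-D object array holding 2 Python lists.
cond = np.array([[True, False, True], [False, True]], dtype=object)
x = np.array([[1, 2, 56], [3, 4]], dtype=object)
y = np.array([[9, 8, 79], [7, 6]], dtype=object)

print(cond.shape)           # (2,)

# Each element is a non-empty list, so each one is truthy.
mask = cond.astype(bool)
print(mask)

# Both mask entries are True, so both elements come from x.
result = np.where(mask, x, y)
print(result)
```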

more details

Let's demonstrate where in more detail, with a simpler case. Same condition array:

In [57]: condition = np.array([[True, False], [True, True]])
In [58]: condition
Out[58]: 
array([[ True, False],
       [ True,  True]])

The single-argument version, which is equivalent to condition.nonzero():

In [59]: np.where(condition)
Out[59]: (array([0, 1, 1]), array([0, 0, 1]))

Some find it easier to visualize the transpose of that tuple - the 3 pairs of coordinates where condition is True:

In [60]: np.argwhere(condition)
Out[60]: 
array([[0, 0],
       [1, 0],
       [1, 1]])
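The relationship between these three calls can be checked directly. This is a small sketch; rows, cols and coords are illustrative names.

```python
import numpy as np

condition = np.array([[True, False], [True, True]])

# One index array per dimension, listing where condition is True.
rows, cols = np.where(condition)

# The single-argument form gives the same arrays as nonzero().
nz_rows, nz_cols = condition.nonzero()
print(np.array_equal(rows, nz_rows), np.array_equal(cols, nz_cols))

# Transposing the tuple gives one (row, col) pair per True element.
coords = np.transpose((rows, cols))
print(coords)

# Indexing with the tuple picks out exactly the True elements.
print(condition[rows, cols])
```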

Now the simplest version with 3 arguments, with scalar values.

In [61]: np.where(condition, True, False)   # same as condition
Out[61]: 
array([[ True, False],
       [ True,  True]])
In [62]: np.where(condition, 100, 200)
Out[62]: 
array([[100, 200],
       [100, 100]])

A good way of visualizing this action is with two masked assignments.

In [63]: res = np.zeros(condition.shape, int)
In [64]: res[condition] = 100
In [65]: res[~condition] = 200
In [66]: res
Out[66]: 
array([[100, 200],
       [100, 100]])

Another way to do this is to initialize an array with the y value(s), and then use the nonzero form of where to fill in the x value.

In [69]: res = np.full(condition.shape, 200)
In [70]: res
Out[70]: 
array([[200, 200],
       [200, 200]])
In [71]: res[np.where(condition)] = 100
In [72]: res
Out[72]: 
array([[100, 200],
       [100, 100]])

If x and y are arrays, not scalars, this masked assignment will require refinements, but hopefully for a start this will help.
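One possible refinement for array-valued x and y, sketched here under the assumption that all three arrays already share the same shape, is to mask both sides of the assignment:

```python
import numpy as np

condition = np.array([[True, False], [True, True]])
x = np.array([[1, 2], [3, 4]])
y = np.array([[9, 8], [7, 6]])

# Start from a copy of y, then overwrite the True positions with
# the matching elements of x.
res = y.copy()
res[condition] = x[condition]

print(res)
print(np.array_equal(res, np.where(condition, x, y)))
```

If the shapes differ, the arrays would first have to be broadcast to a common shape, for example with np.broadcast_arrays.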

Many thanks, I think I understand it completely, even the complex examples. — usr48, Feb 03 '19 at 06:36

score 0 · Answer 2 · answered Feb 02 '19 at 06:49

np.where(condition, x, y) checks the condition element by element: where it is True it returns the element from x, otherwise it returns the element from y.

np.where([[True, False], [True, True]], [[1, 2], [3, 4]], [[9, 8], [7, 6]])

Here your condition is [[True, False], [True, True]], x = [[1, 2], [3, 4]] and y = [[9, 8], [7, 6]].

The first condition is True, so it returns 1 instead of 9.

The second condition is False, so it returns 8 instead of 2.

score 0 · Answer 3 · answered Feb 03 '19 at 06:33

After reading about broadcasting, as @hpaulj suggested, I think I know how the function works. It will try to broadcast the 3 arrays; if the broadcast succeeds, it will use the True and False values to pick elements from either x or y. In the example

>>>np.where([[True, False,True], [False, True]], [[1, 2,56], [3, 4]], [[9, 8,79], [7, 6]])

We have

cnd=np.array([[True, False,True], [False, True]])
x=np.array([[1, 2,56], [3, 4]])
y=np.array([[9, 8,79], [7, 6]])

Now

>>>x.shape
Out[7]: (2,)
>>>y.shape
Out[8]: (2,)
>>>cnd.shape
Out[9]: (2,)

So all three, even the condition (cnd), are just arrays with 2 elements (of type list). Both [True, False, True] and [False, True] will therefore be evaluated as True, and both elements will be selected from x.

>>>np.where([[True, False,True], [False, True]], [[1, 2,56], [3, 4]], [[9, 8,79], [7, 6]])
Out[10]: array([list([1, 2, 56]), list([3, 4])], dtype=object)

I also tried it with a more complex example (a 2x2x2 broadcast), and the same reasoning still explains it.

np.where([[[True,False],[True,True]], [[False,False],[True,False]]],
          [[[12,45],[10,50]], [[100,10],[17,81]]],
          [[[90,93],[85,13]], [[12,345], [190,56,34]]])

Where

cnd=np.array([[[True,False],[True,True]], [[False,False],[True,False]]])
x=np.array([[[12,45],[10,50]], [[100,10],[17,81]]])
y=np.array( [[[90,93],[85,13]], [[12,345], [190,56,34]]])

Here cnd and x have the shape (2,2,2), and y has the shape (2,2) because its last sublist, [190,56,34], is ragged.

>>>cnd.shape
Out[14]: (2, 2, 2)
>>>x.shape
Out[15]: (2, 2, 2)
>>>y.shape
Out[16]: (2, 2)

Now, as @hpaulj commented, y will be broadcast to (2,2,2), and the broadcast version looks like this:

>>>cnd
Out[6]: 
array([[[ True, False],
        [ True,  True]],
       [[False, False],
        [ True, False]]]) 
>>>x
Out[7]: 
array([[[ 12,  45],
        [ 10,  50]],
       [[100,  10],
        [ 17,  81]]])
>>>np.broadcast_to(y,(2,2,2))
Out[8]: 
array([[[list([90, 93]), list([85, 13])],
        [list([12, 345]), list([190, 56, 34])]],
       [[list([90, 93]), list([85, 13])],
        [list([12, 345]), list([190, 56, 34])]]], dtype=object)

The result can then be easily predicted:

>>>np.where([[[True,False],[True,True]], [[False,False],[True,False]]], [[[12,45],[10,50]], [[100,10],[17,81]]],[[[90,93],[85,13]], [[12,345], [190,56,34]]])
Out[9]: 
array([[[12, list([85, 13])],
        [10, 50]],
       [[list([90, 93]), list([85, 13])],
        [17, list([190, 56, 34])]]], dtype=object)
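The same broadcast-then-pick behaviour is easier to see without ragged lists. The sketch below uses made-up values (a (2,2) condition, a 1-D x and a scalar y) and np.broadcast_arrays to show the shapes that where effectively works with.

```python
import numpy as np

cond = np.array([[True, False], [False, True]])
x = np.array([10, 20])   # shape (2,), repeated along the rows
y = -1                   # scalar, repeated everywhere

# Make the implicit broadcast explicit: all three become (2, 2).
cond_b, x_b, y_b = np.broadcast_arrays(cond, x, y)
print(x_b)   # [[10 20], [10 20]]
print(y_b)   # [[-1 -1], [-1 -1]]

# where picks from the broadcast x where cond is True, else from y.
result = np.where(cond, x, y)
print(result)   # [[10 -1], [-1 20]]
```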
