185

I tried reading the documentation for numpy.where(), but I'm still confused.

What should I pass for the condition, x and y values? When I pass only condition, what does the result mean and how can I use it? What about when I pass all three?

I found "How does python numpy.where() work?" but it didn't answer my question because it seems to be about the implementation rather than about how to use it. "Numpy where() on a 2D matrix" also didn't explain things for me; I'm looking for a step-by-step explanation, rather than a how-to guide for a specific case.

Please include examples with both 1D and 2D source data.

Karl Knechtel
Alexandre Holden Daly

2 Answers

300

After fiddling around for a while, I figured things out, and am posting them here hoping it will help others.

Intuitively, np.where is like asking "tell me where in this array, entries satisfy a given condition".

>>> import numpy as np
>>> a = np.arange(5,10)
>>> np.where(a < 8)       # tell me where in a, entries are < 8
(array([0, 1, 2]),)       # answer: entries indexed by 0, 1, 2

The result can also be used as an index to get the entries of the array that satisfy the condition:

>>> a[np.where(a < 8)] 
array([5, 6, 7])          # selects from a entries 0, 1, 2
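
As a minimal sketch of the 1D case (the names result, idx, via_where and via_mask are mine, for illustration only): the return value is a tuple holding a single index array, and indexing with it should give the same entries as indexing with the boolean condition directly.

```python
import numpy as np

a = np.arange(5, 10)            # array([5, 6, 7, 8, 9])

result = np.where(a < 8)        # a tuple holding one index array (a is 1D)
(idx,) = result                 # unpack the single array of positions

via_where = a[idx]              # index with integer positions
via_mask = a[a < 8]             # index with the boolean mask directly

print(idx.tolist())        # [0, 1, 2]
print(via_where.tolist())  # [5, 6, 7]
print(via_mask.tolist())   # [5, 6, 7]
```

If you only need the values, the mask form a[a < 8] is usually enough; np.where is the one to reach for when you need the positions themselves.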

When a is a 2D array, np.where() returns a tuple of two arrays: the row indices and the column indices of the matching entries:

>>> a = np.arange(4,10).reshape(2,3)
>>> a
array([[4, 5, 6],
       [7, 8, 9]])
>>> np.where(a > 8)
(array([1]), array([2]))  # answer: the entry at row 1, column 2

As in the 1d case, we can use np.where() to get entries in the 2d array that satisfy the condition:

>>> a[np.where(a > 8)]    # selects from a the entry at row 1, column 2
array([9])
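
To make the row/column pairing concrete, here is a small sketch with a condition that matches more than one entry (a > 6 rather than a > 8; the names rows, cols and coords are just for illustration). The i-th element of each returned array together gives the coordinates of the i-th match.

```python
import numpy as np

a = np.arange(4, 10).reshape(2, 3)   # [[4, 5, 6], [7, 8, 9]]

rows, cols = np.where(a > 6)         # one index array per dimension

# Pair them up to get the (row, col) coordinates of each match.
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coords)                        # [(1, 0), (1, 1), (1, 2)]

# np.argwhere returns the same coordinates as the rows of a 2D array.
pairs = np.argwhere(a > 6)
print(pairs.tolist())                # [[1, 0], [1, 1], [1, 2]]
```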


Note that np.where() always returns a tuple with one index array per dimension of a. When a is 1D, that tuple contains a single array, which is why the first result above is printed with a trailing comma: (array([0, 1, 2]),).
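
The question also asks about passing all three arguments. As a rough sketch (variable names clipped, b and labels are mine): np.where(condition, x, y) builds a new array that takes its value from x where the condition is True and from y where it is False. Scalars are broadcast, so x and y don't have to be full arrays.

```python
import numpy as np

# 1D: keep values below 8, replace the rest with -1
a = np.arange(5, 10)                      # [5, 6, 7, 8, 9]
clipped = np.where(a < 8, a, -1)
print(clipped.tolist())                   # [5, 6, 7, -1, -1]

# 2D: the result has the same shape as the condition
b = np.arange(4, 10).reshape(2, 3)        # [[4, 5, 6], [7, 8, 9]]
labels = np.where(b > 6, b * 10, 0)
print(labels.tolist())                    # [[0, 0, 0], [70, 80, 90]]
```

Unlike the one-argument form, this returns values rather than indices.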

Trenton McKinney
Alexandre Holden Daly
  • 14
    I was struggling on understanding np.where when used on 2d until I found your answer "When a is a 2d array, np.where() returns an array of row idx's, and an array of col idx's:". Thanks for that. – bencampbell_14 May 06 '17 at 01:56
  • 1
    I was feeling pretty stupid after reading the doc three times and still not solving the puzzle `np.where(2d_array)`, thanks for clearing this up! You should accept your own answer. e: Oh, it's closed. Well, it shouldn't be – smcs Mar 07 '18 at 09:59
  • 5
    It's a shame this was closed. However I would like to add another feature of `np.where` to this otherwise complete answer. The function can also select elements from the x and y array depending on the condition. Limited space in this comment but see: `np.where(np.array([[False,False,True], [True,False,False]]), np.array([[8,2,6], [9,5,0]]), np.array([[4,8,7], [3,2,1]]))` will return `array([[4, 8, 6], [9, 2, 1]])`. Notice which elements of x and y get chosen depending on True/False – piccolo Aug 04 '18 at 12:06
  • The explanation given in this answer is only a special case of np.where. According to the documentation, When only `condition` is provided, this function is a shorthand for ``np.asarray(condition).nonzero()``. – Lenny Jul 12 '20 at 22:03
21

Here is a little more fun. I've found that very often NumPy does exactly what I wish it would do - sometimes it's faster for me to just try things than it is to read the docs. Actually a mixture of both is best.

I think your answer is fine (and it's OK to accept it if you like). This is just "extra".

import numpy as np

a = np.arange(4,10).reshape(2,3)

wh = np.where(a>7)
gt = a>7
x  = np.where(gt)

print("wh: ", wh)
print("gt: ", gt)
print("x:  ", x)

gives:

wh:  (array([1, 1]), array([1, 2]))
gt:  [[False False False]
      [False  True  True]]
x:   (array([1, 1]), array([1, 2]))

... but all three select the same entries when used as an index:

print("a[wh]: ", a[wh])
print("a[gt]  ", a[gt])
print("a[x]:  ", a[x])

gives:

a[wh]:  [8 9]
a[gt]   [8 9]
a[x]:   [8 9]
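
If you'd rather check this than eyeball the printout, here is a short sketch (the names nz, same_indices and same_values are only for illustration). It also compares against np.nonzero, which the documentation gives as the equivalent of one-argument np.where.

```python
import numpy as np

a = np.arange(4, 10).reshape(2, 3)
gt = a > 7                      # boolean mask

wh = np.where(gt)
nz = np.nonzero(gt)             # documented equivalent of one-argument np.where

# Both give the same (rows, cols) tuple.
same_indices = all(np.array_equal(w, n) for w, n in zip(wh, nz))

# Indexing with the mask or with the index tuple selects the same entries.
same_values = np.array_equal(a[gt], a[wh])

print(same_indices, same_values)   # True True
```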
uhoh