
What's the fastest way of returning the index of the FIRST match between a variable and an element within an ndarray? I see numpy.where used a lot, but that returns all indices.

import numpy as np

match = 5000
zArray = np.array([[0,1200,200],[1320,24,5000],[5000,234,5230]])

>array([[   0, 1200,  200],
>       [1320,   24, 5000],
>       [5000,  234, 5230]])

np.where(zArray==match)
>(array([1, 2], dtype=int64), array([2, 0], dtype=int64))

I'd like the first index returned, i.e. just [1,2]. but numpy.where returns both [1,2] and [2,0]
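For reference, numpy.where returns one index array per axis rather than one array per match, so the (row, column) pairs have to be read across the arrays. A minimal sketch of that, using an illustrative array (an assumption, not the one above) in which the two readings differ:

```python
import numpy as np

match = 5000
# Illustrative array (assumed for this sketch) with three matches
zArray = np.array([[0, 1200, 5000],
                   [1320, 24, 5000],
                   [5000, 234, 5230]])

# np.where returns a tuple with one array per axis:
# all row indices first, then all column indices
rows, cols = np.where(zArray == match)

# Pair them up to get one (row, col) tuple per match, in row-major order
pairs = list(zip(rows.tolist(), cols.tolist()))

# The first match, or None when nothing matched
first = pairs[0] if pairs else None
print(pairs, first)
```

Here `rows` holds the row index of every match and `cols` the column index of every match, so the first match is the pair formed by the first entry of each.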

kabammi
  • How would you define the *first* match ? Row major or column major ? – ZdaR Oct 25 '17 at 03:57
  • thanks guys, I need to clarify this a bit.. – kabammi Oct 25 '17 at 04:02
  • Be aware that your example array happens to be set up such that the two arrays in the result appear to be the two x, y index pairs you're looking for. Instead, these are the [x1, x2] and [y1, y2] indices of the matches. Try e.g. `[[0,5000,200],[1320,24,1200],[234,5000,5230]]` instead to see. –  Oct 25 '17 at 04:11
  • Just to reiterate what @Evert has mentioned, `"I'd like the first index returned, i.e. just [1,2]. but numpy.where returns both [1,2] and [2,0]"` needs edits, I believe. – Divakar Oct 25 '17 at 04:29

1 Answer


You can use np.argwhere to get the matching indices as a 2D array, with one row of indices per match, and then take the first row, like so -

np.argwhere(zArray==match)[0]

Alternatively, a faster option is to use argmax to get the index of the first match in the flattened (row-major) boolean mask, and then np.unravel_index to convert it to a per-dimension index tuple -

np.unravel_index((zArray==match).argmax(), zArray.shape)

Sample run -

In [100]: zArray
Out[100]: 
array([[   0, 1200, 5000], # different from sample for a generic one
       [1320,   24, 5000],
       [5000,  234, 5230]])

In [101]: match
Out[101]: 5000

In [102]: np.argwhere(zArray==match)[0]
Out[102]: array([0, 2])

In [103]: np.unravel_index((zArray==match).argmax(), zArray.shape)
Out[103]: (0, 2)
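
One caveat with the argmax approach: when nothing matches, the mask is all False and argmax still returns 0, which is indistinguishable from a match at the first element. A minimal sketch of a guard for that case follows; the helper name `first_match_index` is hypothetical, not a NumPy function -

```python
import numpy as np

def first_match_index(arr, value):
    """Return the row-major index tuple of the first element equal to
    `value`, or None if there is no match. Hypothetical helper."""
    mask = (arr == value)
    if mask.size == 0:
        # argmax raises on empty arrays, so handle that case up front
        return None
    flat_idx = mask.argmax()  # first True in C order; 0 if all False
    if not mask.flat[flat_idx]:
        # argmax returned 0 only because nothing matched
        return None
    return tuple(int(i) for i in np.unravel_index(flat_idx, arr.shape))

zArray = np.array([[0, 1200, 200], [1320, 24, 5000], [5000, 234, 5230]])
print(first_match_index(zArray, 5000))  # first match in row-major order
print(first_match_index(zArray, -1))    # no match
```

By comparison, np.argwhere(zArray==match)[0] raises an IndexError when there is no match, so either approach needs some handling if a miss is possible.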

Runtime test -

In [104]: a = np.random.randint(0,100,(1000,1000))

In [105]: %timeit np.argwhere(a==50)[0]
100 loops, best of 3: 2.41 ms per loop

In [106]: %timeit np.unravel_index((a==50).argmax(), a.shape)
1000 loops, best of 3: 493 µs per loop
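
To reproduce the comparison outside IPython, here is a rough sketch using the standard timeit module. The function names, seed, and repeat counts are arbitrary choices for illustration, and the absolute numbers will vary by machine and NumPy version -

```python
import timeit
import numpy as np

# Same shape and value range as the test above; the seed is arbitrary
rng = np.random.default_rng(0)
a = rng.integers(0, 100, size=(1000, 1000))

def via_argwhere():
    # Builds the full list of matches, then takes the first row
    return np.argwhere(a == 50)[0]

def via_argmax():
    # Finds the first True in the flattened mask, then unravels it
    return np.unravel_index((a == 50).argmax(), a.shape)

for fn in (via_argwhere, via_argmax):
    # Best of 3 repeats, 20 calls each, reported per call
    best = min(timeit.repeat(fn, number=20, repeat=3)) / 20
    print(f"{fn.__name__}: {best * 1e6:.0f} us per call")
```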
Divakar
  • `argwhere` is just `transpose(where(...))`. It's handy for getting the first occurrence, but doesn't do any sort of short-circuiting. `argmax` might short-circuit the boolean match (I know the `nan` test does). – hpaulj Oct 25 '17 at 05:13
  • thanks Divakar. This works well, and is perfect for my needs. – kabammi Oct 30 '17 at 03:07