
How can I find the index of the first occurrence of a number in a Numpy array? Speed is important to me. I am not interested in the following answers because they scan the whole array and don't stop when they find the first occurrence:

itemindex = numpy.where(array==item)[0][0]
nonzero(array == item)[0][0]

Note 1: none of the answers from the question "Is there a Numpy function to return the first index of something in an array?" seem relevant.

Note 2: using a C-compiled method is preferred to a Python loop.

cyborg

16 Answers


There is a feature request for this scheduled for Numpy 2.0.0: https://github.com/numpy/numpy/issues/2269

Jacques Kvam

Although it is way too late for you, for future reference: using numba is the easiest way until numpy implements this. If you use the Anaconda Python distribution, it should already be installed. The code will be compiled, so it will be fast.

@jit(nopython=True)
def find_first(item, vec):
    """return the index of the first occurence of item in vec"""
    for i in xrange(len(vec)):
        if item == vec[i]:
            return i
    return -1

and then:

>>> a = array([1,7,8,32])
>>> find_first(8,a)
2
tal
  • For Python 3, `xrange` needs to be changed to `range`. –  Mar 26 '16 at 01:02
  • Slight code improvement in Python 3+: use `enumerate`, as in `for i, v in enumerate(vec):`; `if v == item: return i`. (This is *not* a good idea in Python <=2.7, where `enumerate` creates a list rather than a basic iterator.) – acdr Jan 23 '20 at 16:20
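Putting the two comments together, a Python 3 version of the function might look like this (a sketch without the numba decorator so it runs on plain Python; add `@jit(nopython=True)` back for speed):

```python
import numpy as np

def find_first(item, vec):
    """Return the index of the first occurrence of item in vec, or -1."""
    for i, v in enumerate(vec):  # enumerate is a lazy iterator in Python 3
        if v == item:
            return i
    return -1

a = np.array([1, 7, 8, 32])
print(find_first(8, a))   # → 2
print(find_first(5, a))   # → -1
```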

I've made a benchmark for several methods:

  • argwhere
  • nonzero as in the question
  • .tostring() as in @Rob Reilink's answer
  • python loop
  • Fortran loop

The Python and Fortran code are available. I skipped the unpromising ones like converting to a list.

The results are on a log scale. The x-axis is the position of the needle (it takes longer to find if it's further down the array); the last value is a needle that's not in the array. The y-axis is the time to find it.

[figure: benchmark results]

The array had 1 million elements and tests were run 100 times. Results still fluctuate a bit, but the qualitative trend is clear: Python and f2py quit at the first element so they scale differently. Python gets too slow if the needle is not in the first 1%, whereas f2py is fast (but you need to compile it).

To summarize, f2py is the fastest solution, especially if the needle appears fairly early.

It's not built in which is annoying, but it's really just 2 minutes of work. Add this to a file called search.f90:

subroutine find_first(needle, haystack, haystack_length, index)
    implicit none
    integer, intent(in) :: needle
    integer, intent(in) :: haystack_length
    integer, intent(in), dimension(haystack_length) :: haystack
!f2py intent(inplace) haystack
    integer, intent(out) :: index
    integer :: k
    index = -1
    do k = 1, haystack_length
        if (haystack(k)==needle) then
            index = k - 1
            exit
        endif
    enddo
end

If you're looking for something other than integer, just change the type. Then compile using:

f2py -c -m search search.f90

after which you can do (from Python):

import search
print(search.find_first.__doc__)
a = search.find_first(your_int_needle, your_int_array)
Mark
  • Why is `f2py` slower for 1 item than 10? – Eric Oct 17 '16 at 20:51
  • @Eric, my guess would be that at those scales (10e-6), that's just noise in the data, and the actual per-item speed is so fast it does not meaningfully contribute to the overall time at those n < 100 or so – Brendan Jul 21 '18 at 05:27

If you are looking for the first non-zero element, you can use the following hack:

idx = x.view(bool).argmax() // x.itemsize
idx = idx if x[idx] else -1

It is a very fast "numpy-pure" solution but it fails for some cases discussed below.

The solution takes advantage of the fact that pretty much every representation of zero for numeric types consists of 0 bytes. It applies to numpy's bool as well. In recent versions of numpy, the argmax() function uses short-circuit logic when processing the bool type. The size of bool is 1 byte.

So one needs to:

  • create a view of the array as bool. No copy is created
  • use argmax() to find the first non-zero byte using short-circuit logic
  • recalculate the offset of this byte to the index of the first non-zero element by integer division (operator //) of the offset by a size of a single element expressed in bytes (x.itemsize)
  • check if x[idx] is actually non-zero to identify the case when no non-zero is present
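The steps above can be sketched on a small array (a minimal illustration, not from the original answer):

```python
import numpy as np

x = np.array([0, 0, 3, 0], dtype=np.int64)

b = x.view(bool)              # reinterpret as one bool per byte; no copy (32 bools here)
byte_idx = b.argmax()         # first non-zero byte; short-circuited for bool
idx = byte_idx // x.itemsize  # byte offset → element index (itemsize == 8 for int64)
idx = idx if x[idx] else -1   # guard against the all-zero case

print(idx)  # → 2
```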

I've benchmarked it against the numba solution and the built-in np.nonzero.

import numpy as np
from numba import jit
from timeit import timeit

def find_first(x):
    idx = x.view(bool).argmax() // x.itemsize
    return idx if x[idx] else -1

@jit(nopython=True)
def find_first_numba(vec):
    """return the index of the first occurence of item in vec"""
    for i in range(len(vec)):
        if vec[i]:
            return i
    return -1


SIZE = 10_000_000
# First only
x = np.empty(SIZE)

find_first_numba(x[:10])

print('---- FIRST ----')
x[:] = 0
x[0] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=1000), 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=1000), 'ms')

print('---- LAST ----')
x[:] = 0
x[-1] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

print('---- NONE ----')
x[:] = 0
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

print('---- ALL ----')
x[:] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

The results on my machine are:

---- FIRST ----
ndarray.nonzero 57.63976670001284 ms
find_first 0.0010841979965334758 ms
find_first_numba 0.0002308919938514009 ms
---- LAST ----
ndarray.nonzero 58.96685277999495 ms
find_first 5.923203580023255 ms
find_first_numba 8.762269750004634 ms
---- NONE ----
ndarray.nonzero 25.13398071998381 ms
find_first 5.924289370013867 ms
find_first_numba 8.810063839919167 ms
---- ALL ----
ndarray.nonzero 55.181210660084616 ms
find_first 0.001246920000994578 ms
find_first_numba 0.00028766007744707167 ms

The solution is 33% faster than numba and it is "numpy-pure".

The disadvantages:

  • does not work for some types accepted by numpy, like object
  • fails for negative zero that occasionally appears in float or double computations
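The negative-zero failure mode is easy to reproduce (a small demonstration, using the `find_first` defined above): `-0.0` has only its sign bit set, so the byte view sees a non-zero byte inside an element whose value is falsy, and the guard then wrongly reports "not found".

```python
import numpy as np

def find_first(x):
    idx = x.view(bool).argmax() // x.itemsize
    return idx if x[idx] else -1

x = np.array([-0.0, 1.0])
print(find_first(x))        # → -1, although index 1 is non-zero
print(np.nonzero(x)[0][0])  # → 1, the correct answer (-0.0 compares equal to 0)
```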
tstanisl
  • this is the best pure numpy solution I've tried. Should be the accepted answer. @tstanisl I've been trying to get a similarly fast solution to find the first zero element in an array, but it always ends up slower than converting to bool then running argmin(). Any ideas? – Ta946 Aug 03 '20 at 10:13
  • @Ta946. The trick cannot be used when looking for zero entries. E.g. a non-zero double may contain a zero byte in it. If you look for a numpy-pure solution, try to modify my *other* answer. See https://stackoverflow.com/a/58294774/4989451. Just negate a slice of `x` before calling `nonzero()`. It will likely be slower than numba, but it **will not** search through the whole array while looking for the first zero entry, so it may be fast enough for your needs. – tstanisl Aug 03 '20 at 10:46
  • Great answer. Might be useful to show how you can go from any array to an array of all zeroes and ones suitable for this solution (specifically for the more common types: ints, floats, strings, etc.). – Chapo Jun 10 '21 at 12:42

For sorted arrays, np.searchsorted works.
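For example (a sketch; note that `searchsorted` returns an insertion point, so a membership check is still needed):

```python
import numpy as np

a = np.array([1, 3, 3, 7, 9])  # must be sorted
i = np.searchsorted(a, 3)      # leftmost insertion point, O(log n)
first = i if i < len(a) and a[i] == 3 else -1
print(first)  # → 1
```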

bubu

You can convert a boolean array to a Python byte string using array.tostring() (tobytes() in newer numpy versions) and then use the find() method:

(array==item).tostring().find('\x01')

This does involve copying the data, though, since Python strings need to be immutable. An advantage is that you can also search for e.g. a rising edge by finding \x00\x01
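In Python 3, `tostring()` returns bytes (and is deprecated in favor of `tobytes()`), so the needle must be a bytes literal — a sketch:

```python
import numpy as np

a = np.array([4, 2, 8, 2])
idx = (a == 2).tobytes().find(b'\x01')   # bool array: one byte per element
print(idx)  # → 1 (first match; -1 if absent)

# rising edge: first 0→1 transition
edge = (a == 2).tobytes().find(b'\x00\x01')
print(edge + 1 if edge != -1 else -1)    # index of the 1 after the 0
```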

Rob Reilink
  • This is interesting, but barely faster, if at all, since you still need to deal with all the data (see my answer for a benchmark). – Mark Apr 25 '16 at 12:45

@tal already presented a numba function to find the first index, but that only works for 1D arrays. With np.ndenumerate you can also find the first index in an arbitrarily dimensional array:

from numba import njit
import numpy as np

@njit
def index(array, item):
    for idx, val in np.ndenumerate(array):
        if val == item:
            return idx
    return None

Sample case:

>>> arr = np.arange(9).reshape(3,3)
>>> index(arr, 3)
(1, 0)

Timings show that it's similar in performance to tal's solution:

arr = np.arange(100000)
%timeit index(arr, 5)           # 1000000 loops, best of 3: 1.88 µs per loop
%timeit find_first(5, arr)      # 1000000 loops, best of 3: 1.7 µs per loop

%timeit index(arr, 99999)       # 10000 loops, best of 3: 118 µs per loop
%timeit find_first(99999, arr)  # 10000 loops, best of 3: 96 µs per loop
MSeifert
  • If you are furthermore interested in searching along a given axis first: transpose `array` before feeding it into `np.ndenumerate`, such that your axis of interest comes first. – CheshireCat Jul 17 '19 at 13:13
  • Thanks, this is indeed orders of magnitude faster: from ~171ms (`np.argwhere`) to 717ns (your solution), both for an array of shape `(3000000, 12)`. – Arthur Colombini Gusmão Jan 27 '20 at 02:07

I think you have hit a problem where a different method and some a priori knowledge of the array would really help: the kind of thing where you have probability X of finding your answer in the first Y percent of the data. Then you can split up the problem in the hope of getting lucky, and do that in Python with a nested list comprehension or something similar.

Writing a C function to do this brute force isn't too hard using ctypes either.

The C code I hacked together (index.c):

long index(long val, long *data, long length){
    long ans, i;
    for(i=0;i<length;i++){
        if (data[i] == val)
            return(i);
    }
    return(-999);
}

and the python:

# to compile (mac)
# gcc -shared index.c -o index.dylib
import ctypes
lib = ctypes.CDLL('index.dylib')
lib.index.restype = ctypes.c_long
lib.index.argtypes = (ctypes.c_long, ctypes.POINTER(ctypes.c_long), ctypes.c_long)

import numpy as np
np.random.seed(8675309)
a = np.random.random_integers(0, 100, 10000)  # deprecated in modern numpy; use np.random.randint(0, 101, 10000)
print(lib.index(57, a.ctypes.data_as(ctypes.POINTER(ctypes.c_long)), len(a)))

and I get 92.

Wrap up the python into a proper function and there you go.

The C version is a lot (~20x) faster for this seed (warning I am not good with timeit)

import timeit
t = timeit.Timer('np.where(a==57)[0][0]', 'import numpy as np; np.random.seed(1); a = np.random.random_integers(0, 1000000, 10000000)')
t.timeit(100)/100
# 0.09761879920959472
t2 = timeit.Timer('lib.index(57, a.ctypes.data_as(ctypes.POINTER(ctypes.c_long)), len(a))', 'import numpy as np; np.random.seed(1); a = np.random.random_integers(0, 1000000, 10000000); import ctypes; lib = ctypes.CDLL("index.dylib"); lib.index.restype = ctypes.c_long; lib.index.argtypes = (ctypes.c_long, ctypes.POINTER(ctypes.c_long), ctypes.c_long) ')
t2.timeit(100)/100
# 0.005288000106811523
Brian Larsen
  • If the array is doubles (remember Python floats are C doubles by default), then you have to think a bit harder, as == is not really safe or what you want for floating-point values. Also don't forget that it is a really good idea when using ctypes to type your numpy arrays. – Brian Larsen Oct 06 '11 at 17:02
  • Thanks @Brian Larsen . I might give it a try. I think it's a trivial feature request for the next numpy revision. – cyborg Oct 12 '11 at 07:08

This problem can be efficiently solved in pure numpy by processing the array in chunks:

def find_first(x):
    idx, step = 0, 32
    while idx < x.size:
        nz, = x[idx: idx + step].nonzero()
        if len(nz): # found non-zero, return it
            return nz[0] + idx
        # move to the next chunk, increase step
        idx += step
        step = min(9600, step + step // 2)
    return -1

The array is processed in chunks of size `step`. The longer the step, the faster the processing of an all-zero array (the worst case); the smaller it is, the faster an array with a non-zero near the start is processed. The trick is to start with a small step and increase it exponentially. Moreover, there is no need to increase it above some threshold, due to limited benefits.

I've compared the solution with pure ndarray.nonzero and the numba solution against an array of 10 million floats.

import numpy as np
from numba import jit
from timeit import timeit

def find_first(x):
    idx, step = 0, 32
    while idx < x.size:
        nz, = x[idx: idx + step].nonzero()
        if len(nz):
            return nz[0] + idx
        idx += step
        step = min(9600, step + step // 2)
    return -1

@jit(nopython=True)
def find_first_numba(vec):
    """return the index of the first occurence of item in vec"""
    for i in range(len(vec)):
        if vec[i]:
            return i
    return -1


SIZE = 10_000_000
# First only
x = np.empty(SIZE)

find_first_numba(x[:10])

print('---- FIRST ----')
x[:] = 0
x[0] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=1000), 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=1000), 'ms')

print('---- LAST ----')
x[:] = 0
x[-1] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

print('---- NONE ----')
x[:] = 0
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

print('---- ALL ----')
x[:] = 1
print('ndarray.nonzero', timeit(lambda: x.nonzero()[0][0], number=100)*10, 'ms')
print('find_first', timeit(lambda: find_first(x), number=100)*10, 'ms')
print('find_first_numba', timeit(lambda: find_first_numba(x), number=100)*10, 'ms')

And results on my machine:

---- FIRST ----
ndarray.nonzero 54.733994480002366 ms
find_first 0.0013148509997336078 ms
find_first_numba 0.0002839310000126716 ms
---- LAST ----
ndarray.nonzero 54.56336712999928 ms
find_first 25.38929685000312 ms
find_first_numba 8.022820680002951 ms
---- NONE ----
ndarray.nonzero 24.13432420999925 ms
find_first 25.345200140000088 ms
find_first_numba 8.154927100003988 ms
---- ALL ----
ndarray.nonzero 55.753537260002304 ms
find_first 0.0014760300018679118 ms
find_first_numba 0.0004358099977253005 ms

Pure ndarray.nonzero is the definite loser. The numba solution is circa 5 times faster for the best case, and circa 3 times faster in the worst case.

tstanisl

If your list is sorted, you can achieve a very quick index search with the bisect package. It's O(log(n)) instead of O(n).

bisect.bisect(a, x)

finds x in the array a, and is definitely quicker in the sorted case than any C routine going through all the first elements (for long enough lists).

It's good to know sometimes.
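One caveat: `bisect.bisect` is an alias for `bisect_right`, which returns the insertion point *after* any equal entries; for the index of the first occurrence, `bisect_left` plus a membership check is what you want — a sketch:

```python
import bisect

a = [1, 3, 3, 7]              # must be sorted
i = bisect.bisect_left(a, 3)  # leftmost position for 3, O(log n)
first = i if i < len(a) and a[i] == 3 else -1
print(first)                  # → 1
print(bisect.bisect(a, 3))    # → 3: past the last 3, not the first
```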

ngrislain
  • ```>>> cond = "import numpy as np;a = np.arange(40)"``` `timeit("np.searchsorted(a, 39)", cond)` works for 3.47867107391 seconds. `timeit("bisect.bisect(a, 39)", cond2)` works for 7.0661458969116 seconds. It looks like `numpy.searchsorted` is better for sorted arrays (at least for ints). – Boris Tsema Mar 03 '14 at 12:17

I needed this for my job, so I taught myself Python and Numpy's C interface and wrote my own: http://pastebin.com/GtcXuLyd It's only for 1-D arrays, but works for most data types (int, float, or strings), and testing has shown it is again about 20 times faster than the expected approach in pure Python/numpy.

dpitch40

As far as I know only np.any and np.all on boolean arrays are short-circuited.

In your case, numpy has to go through the entire array twice: once to create the boolean condition and a second time to find the indices.

My recommendation in this case would be to use Cython. I think it should be easy to adjust an example for this case, especially if you don't need much flexibility for different dtypes and shapes.

Josef

As a longtime Matlab user, I have been searching for an efficient solution to this problem for quite a while. Finally, motivated by discussions and propositions in this thread, I have tried to come up with a solution that implements an API similar to what was suggested here, supporting for the moment only 1D arrays.

You would use it like this:

import numpy as np
import utils_find_1st as utf1st
array = np.arange(100000)
item = 1000
ind = utf1st.find_1st(array, item, utf1st.cmp_larger_eq)

The supported condition operators are: cmp_equal, cmp_not_equal, cmp_larger, cmp_smaller, cmp_larger_eq, cmp_smaller_eq. For efficiency the extension is written in C.

You find the source, benchmarks and other details here:

https://pypi.python.org/pypi?name=py_find_1st&:action=display

For use in our team (anaconda on linux and macos), I have made an anaconda installer that simplifies installation; you may use it as described here:

https://anaconda.org/roebel/py_find_1st

A Roebel

Just a note: if you are doing a sequence of searches, the performance gain from doing something clever like converting to a string might be lost in the outer loop if the search dimension isn't big enough. See how the performance of iterating find1, which uses the string-conversion trick proposed above, compares to find2, which uses argmax along the inner axis (plus an adjustment to ensure a non-match returns -1):

import numpy,time

def find1(arr,value):
    return (arr==value).tostring().find(b'\x01')  # bytes needle (Python 3); tostring() is tobytes() in newer numpy

def find2(arr,value): # find value over the innermost axis, and return an array of indices to the match
    b = arr==value
    return b.argmax(axis=-1) - ~b.any(axis=-1)  # rows with no match get -1


for size in [(1,100000000),(10000,10000),(1000000,100),(10000000,10)]:
    print(size)
    values = numpy.random.choice([0,0,0,0,0,0,0,1],size=size)
    v = values>0

    t=time.time()
    numpy.apply_along_axis(find1,-1,v,1)
    print('find1',time.time()-t)

    t=time.time()
    find2(v,1)
    print('find2',time.time()-t)

outputs

(1, 100000000)
('find1', 0.25300002098083496)
('find2', 0.2780001163482666)
(10000, 10000)
('find1', 0.46200013160705566)
('find2', 0.27300000190734863)
(1000000, 100)
('find1', 20.98099994659424)
('find2', 0.3040001392364502)
(10000000, 10)
('find1', 206.7590000629425)
('find2', 0.4830000400543213)

That said, a find written in C would be at least a little faster than either of these approaches.

dlm

How about this:

import numpy as np
np.amin(np.where(array==item))
nkvnkv
  • While this code may answer the question, providing additional context regarding _why_ and/or _how_ it answers the question would significantly improve its long-term value. Please [edit] your answer to add some explanation. – Toby Speight Apr 01 '16 at 14:14
  • I'm pretty sure this is even slower than `where(array==item)[0][0]` from the question... – Mark Apr 25 '16 at 12:52

You can convert your array into a list and use its index() method:

i = list(array).index(item)

As far as I'm aware, this is a C-compiled method.

drevicko
  • this is likely to be many times slower than just taking the first result from np.where – cwa Dec 24 '12 at 17:43
  • very true.. I used `timeit()` on an array of 10000 integers -- converting to a list was about 100 times slower! I had forgotten that the underlying data structure for a numpy array is very different from a list.. – drevicko Jan 02 '13 at 00:58