Retrieve indexes of multiple values with NumPy in a vectorized way

Question

To get the index corresponding to the value 99 in a NumPy array, we do:

import numpy as np

mynumpy = np.array([5,6,9,2,99,3,88,4,7])
np.where(mynumpy == 99)

What if I want to get the indexes corresponding to the values 99, 55, 6, 3 and 7? Obviously, this can be done with a simple loop, but I'm looking for a vectorized solution. NumPy is very powerful, so I suspect something like this exists.

Desired output:

searched_values=np.array([99,55,6,3,7])
np.where(searched_values in mynumpy)
[(4),(),(1),(5),(8)]
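
For reference, the loop-based baseline mentioned in the question might look like the sketch below. It assumes the sample arrays from the question; the name `loop_result` is introduced here purely for illustration. It yields one list of positions per searched value, which is empty when the value is absent.

```python
import numpy as np

mynumpy = np.array([5, 6, 9, 2, 99, 3, 88, 4, 7])
searched_values = np.array([99, 55, 6, 3, 7])

# One full comparison pass over mynumpy per searched value:
# simple, but the outer loop runs in Python rather than in NumPy.
loop_result = [np.flatnonzero(mynumpy == v).tolist() for v in searched_values]
print(loop_result)  # [[4], [], [1], [5], [8]]
```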

check this. https://stackoverflow.com/questions/32191029/getting-the-indices-of-several-elements-in-a-numpy-array-at-once — Equinox, Mar 02 '18 at 10:28
@Kasramvd Re-opening because this question additionally needs the masking for the elements that are not present in `searched_values`. Hope that looks fair. — Divakar, Mar 02 '18 at 10:32
Yeah Divakar's solution takes into account the special case where values are not present — hans glick, Mar 02 '18 at 10:45
A similar question involving multiple values, [How to get a list of indexes selected by a specific value efficiently with numpy arrays?](https://stackoverflow.com/questions/48686381/how-to-get-a-list-of-indexes-selected-by-a-specific-value-efficiently-with-numpy). The accepted answer used a default dictionary. — hpaulj, Mar 02 '18 at 18:43

Divakar · Accepted Answer · 2018-03-02T17:58:35.890

Here's one approach with np.searchsorted -

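def find_indexes(ar, searched_values, invalid_val=-1):
    # Indices that would sort ar
    sidx = ar.argsort()
    # Insertion position of each searched value within the sorted order
    pidx = np.searchsorted(ar, searched_values, sorter=sidx)
    # Values greater than ar.max() land at len(ar); reset them to a valid position
    pidx[pidx==len(ar)] = 0
    # Map sorted positions back to indices in the original array
    idx = sidx[pidx]
    # Flag searched values that are not actually present in ar
    idx[ar[idx] != searched_values] = invalid_val
    return idx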

Sample run -

In [29]: find_indexes(mynumpy, searched_values, invalid_val=-1)
Out[29]: array([ 4, -1,  1,  5,  8])
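
To see how the steps fit together, the intermediate arrays for the sample inputs can be traced as in the sketch below. It re-runs the same operations outside the function; the name `found` and the extra probe value 100 are introduced here for illustration only.

```python
import numpy as np

ar = np.array([5, 6, 9, 2, 99, 3, 88, 4, 7])
searched_values = np.array([99, 55, 6, 3, 7])

sidx = ar.argsort()
print(sidx)   # [3 5 7 0 1 8 2 6 4]  -> ar[sidx] is [2 3 4 5 6 7 9 88 99]

pidx = np.searchsorted(ar, searched_values, sorter=sidx)
print(pidx)   # [8 7 4 1 5]  -> 55 would be inserted just before 88

pidx[pidx == len(ar)] = 0   # no-op here: every position is below len(ar)

idx = sidx[pidx]
print(idx)    # [4 6 1 5 8]  -> candidate indices in the original array

found = ar[idx] == searched_values
print(found)  # [ True False  True  True  True]  -> ar[6] is 88, not 55

idx[~found] = -1
print(idx)    # [ 4 -1  1  5  8]

# A value above ar.max() gets insertion position len(ar), which would be
# out of range for sidx; this is why such positions are reset to 0.
print(np.searchsorted(ar, [100], sorter=sidx))  # [9]
```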

For a generic invalid value specifier, we could use np.where -

def find_indexes_v2(ar, searched_values, invalid_val=-1):
    sidx = ar.argsort()
    pidx = np.searchsorted(ar, searched_values, sorter=sidx)
    pidx[pidx==len(ar)] = 0
    idx = sidx[pidx]
    return np.where(ar[idx] == searched_values, idx, invalid_val)

Sample run -

In [35]: find_indexes_v2(mynumpy, searched_values, invalid_val=None)
Out[35]: array([4, None, 1, 5, 8], dtype=object)

# For list output
In [36]: find_indexes_v2(mynumpy, searched_values, invalid_val=None).tolist()
Out[36]: [4, None, 1, 5, 8]

edited Mar 02 '18 at 17:58

answered Mar 02 '18 at 10:28


Wow, thanks a lot. Indeed, it looks like another question, but you take the special cases into account, so I do not know what to do. – hans glick Mar 02 '18 at 10:45
@hansglick Not sure what you meant by - "do not know what to do" thing :) – Divakar Mar 02 '18 at 10:47
I just saw your Stack Overflow profile. How can I send you a private message? – hans glick Mar 02 '18 at 10:53
What if the element is present several times? I would like to be able to retrieve all the indexes, e.g. find_indexes(np.array([5,11,5]),np.array([5])) with desired output [(0,2)]. – hans glick Mar 02 '18 at 11:11
@hansglick, you should have included some duplicates in the original question. – hpaulj Mar 02 '18 at 17:47
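
The duplicate case raised in the comments is not handled by the accepted answer. One possible sketch, not taken from the answer, uses a stable sort and calls `np.searchsorted` with both `side="left"` and `side="right"` to bracket each run of equal values. The function name `find_all_indexes` is hypothetical, and the final step that builds the tuples is still a Python-level loop, so only the search itself is vectorized.

```python
import numpy as np

def find_all_indexes(ar, searched_values):
    # Hypothetical helper: returns one tuple of indices per searched value.
    # A stable sort keeps equal values in their original index order.
    sidx = ar.argsort(kind="stable")
    # left/right bracket the run of entries equal to each searched value
    left = np.searchsorted(ar, searched_values, side="left", sorter=sidx)
    right = np.searchsorted(ar, searched_values, side="right", sorter=sidx)
    # An absent value gives left == right, hence an empty tuple
    return [tuple(sidx[l:r].tolist()) for l, r in zip(left, right)]

print(find_all_indexes(np.array([5, 11, 5]), np.array([5])))      # [(0, 2)]
print(find_all_indexes(np.array([5, 11, 5]), np.array([11, 7])))  # [(1,), ()]
```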
