1

I have two numpy arrays such as

import numpy as np
x = np.array([3, 1, 4])
y = np.array([4, 3, 2, 1, 0])

each containing unique values. The values in x are guaranteed to be a subset of those in y.

I would like to find the index of each element of x in the array y.

For the arrays above, this would be

[1, 3, 0]

So far I have been finding the indices one at a time in a loop:

idxs = []
for val in x:
    idxs.append(np.argwhere(y == val)[0, 0])

But this is slow when my arrays are large.

Is there a more efficient way to do this?

Moormanly

2 Answers

4

Converting y to a list and using the list.index() method is noticeably faster on the small example:

y = y.tolist()
indexes = [y.index(i) for i in x]

Here are some quick timing results:

import numpy as np
import timeit

x = np.array([3, 1, 4])
y = np.array([4, 3, 2, 1])

# Time the original approach: one np.argwhere scan of y per element of x.
total_time = timeit.timeit('[np.argwhere(y == i)[0, 0] for i in x]',
                           globals=globals(), number=10000)
print("using argwhere = ", total_time)

# Time the list-based approach. Note that y.tolist() is re-evaluated
# for every element of x in this statement.
total_time = timeit.timeit('[y.tolist().index(i) for i in x]',
                           globals=globals(), number=10000)
print("using list index = ", total_time)

using argwhere = 0.2716948229999616

using list index = 0.05231958099966505
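
Note that list.index() still scans y once per element of x, so the gain above may not carry over to large arrays. A minimal sketch of one alternative, assuming (as the question states) that y holds unique values and every element of x occurs in y: build a value-to-position dictionary once and look each element up in it. The names position and idxs are only illustrative.

```python
import numpy as np

x = np.array([3, 1, 4])
y = np.array([4, 3, 2, 1, 0])

# Build the value -> position table once.
# This relies on y containing unique values (assumption from the question).
position = {value: i for i, value in enumerate(y.tolist())}

# Look up each element of x; the order of x is preserved.
# A KeyError here would mean x contains a value that is not in y.
idxs = [position[value] for value in x.tolist()]
print(idxs)  # [1, 3, 0]
```

Each lookup is a hash-table access rather than a scan, at the cost of building the dictionary up front.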

Maxwell
1

Use np.argwhere inside a list comprehension, which keeps the indices in the order of x:

[np.argwhere(el==y)[0,0] for el in x]
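
The comprehension above still loops in Python and scans y once per element. A possible fully vectorized variant, sketched here under the question's assumptions (unique values in y, x a subset of y), sorts y once with np.argsort and then locates every element of x with np.searchsorted. If an element of x were missing from y, searchsorted would return an insertion position instead of raising an error, so the assumption matters.

```python
import numpy as np

x = np.array([3, 1, 4])
y = np.array([4, 3, 2, 1, 0])

# Indices that would sort y; y[order] is ascending.
order = np.argsort(y)

# Position of each element of x within the sorted view of y.
pos_in_sorted = np.searchsorted(y, x, sorter=order)

# Map positions in the sorted view back to positions in the original y.
idxs = order[pos_in_sorted]
print(idxs)                          # [1 3 0]
print(np.array_equal(y[idxs], x))    # True, so the order of x is preserved
```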
pythonic833
  • I am hoping to preserve the order of the indices so that `all(y[idxs] == x)` should be `True`. Is there any way to modify your solution to achieve this? – Moormanly Aug 21 '19 at 22:46
  • This might be a bit slower since we are using a loop, but it works and might be faster than your first try – pythonic833 Aug 21 '19 at 22:50
  • Some quick timing experiments suggest it is about the same speed – Moormanly Aug 21 '19 at 22:53