Say I have two arrays of the same size, with no duplicates, where each item in arr1 also appears in arr2:

arr1 = np.array([100,200,50,150])
arr2 = np.array([150,200,100,50])

What is the best way to find an index map inds such that arr2[inds] returns arr1?

My current solution works, but I was wondering if there was something more numpyish that would be more efficient on large arrays:

ind21 = np.array([np.abs(x - arr2).argmin() for x in arr1])

In [57]: arr1,arr2[ind21]
Out[57]: (array([100, 200,  50, 150]), array([100, 200,  50, 150]))
Michael

2 Answers

OK, answering my own question, this is very fast:

ind1 = np.argsort(arr1)
indrev1 = np.argsort(ind1)
ind2 = np.argsort(arr2)
ind21 = ind2[indrev1]

In [101]: arr1,arr2[ind21]
Out[101]: (array([100, 200,  50, 150]), array([100, 200,  50, 150]))
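
Why this works, as far as I can tell: the argsort of an argsort gives the rank of each element of arr1, and indexing the argsort of arr2 with those ranks picks out the position in arr2 that holds the element of the same rank. Below is a minimal sketch that checks this on a larger random permutation. The array size and seed are arbitrary choices for illustration, and the scatter-assignment variant is an alternative that is not part of the original answer:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# Two arrays holding the same unique values in different orders.
values = rng.permutation(100_000)
arr1 = values.copy()
arr2 = rng.permutation(values)

ind1 = np.argsort(arr1)
rank1 = np.argsort(ind1)   # rank1[i] is the rank of arr1[i] within arr1
ind2 = np.argsort(arr2)    # ind2[k] is the position in arr2 of its k-th smallest value
ind21 = ind2[rank1]        # position in arr2 of each element of arr1

assert np.array_equal(arr2[ind21], arr1)

# Alternative: invert the permutation ind1 by scatter assignment,
# which avoids the second sort over arr1.
rank1_alt = np.empty_like(ind1)
rank1_alt[ind1] = np.arange(ind1.size)
assert np.array_equal(rank1_alt, rank1)
```

The scatter assignment builds the inverse permutation in linear time, so only the two sorts of arr1 and arr2 remain.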
Michael
  • Lovely find, that one, and it must be quite efficient! Very similar circumstances for a 2D case led me to use `argsort` for another solution - http://stackoverflow.com/a/36536068/3293881 – Divakar Apr 14 '16 at 19:00

The numpy_indexed package (disclaimer: I am its author) provides a simple and fully vectorized solution to this problem:

import numpy_indexed as npi
ind = npi.indices(arr2, arr1)  # finds ind such that arr2[ind] == arr1

This is probably a little slower than your solution, since npi strives to be much more general and does not fully exploit the simple structure of your problem, though the total cost will be dominated by the same argsorts going on behind the scenes.
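
For comparison, here is a plain-NumPy sketch of a sort-based lookup using np.searchsorted with its sorter argument. This is a swapped-in technique for illustration, not a description of what numpy_indexed does internally, and it assumes every value of arr1 is present in arr2:

```python
import numpy as np

arr1 = np.array([100, 200, 50, 150])
arr2 = np.array([150, 200, 100, 50])

order = np.argsort(arr2)                         # the only sort needed
pos = np.searchsorted(arr2, arr1, sorter=order)  # slot of each arr1 value in sorted arr2
inds = order[pos]                                # map slots back to positions in arr2

print(inds)        # [2 1 3 0]
print(arr2[inds])  # [100 200  50 150]
```

If a value of arr1 were missing from arr2, searchsorted would still return an insertion point, so the result would need an explicit check such as comparing arr2[inds] against arr1.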

Eelco Hoogendoorn