What you want is an equivalent of numpy.in1d for 2-dimensional arrays. I wrote such a function a while ago:
import numpy as np

def in2d(arr1, arr2):
    """Generalisation of numpy.in1d to 2D arrays"""
    assert arr1.dtype == arr2.dtype
    # View each row as one opaque (void) element so that whole rows are compared
    arr1_view = np.ascontiguousarray(arr1).view(np.dtype((np.void,
                        arr1.dtype.itemsize * arr1.shape[1])))
    arr2_view = np.ascontiguousarray(arr2).view(np.dtype((np.void,
                        arr2.dtype.itemsize * arr2.shape[1])))
    intersected = np.in1d(arr1_view, arr2_view)
    return intersected.view(bool).reshape(-1)
An explanation of how it works can be found here.
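In short, the trick is to reinterpret each row as a single block of bytes, so that a 1D membership test compares whole rows. The following is a minimal sketch of that step only, assuming a small int64 array made up for illustration; the names arr, row_dtype and rows are not part of the function above.

```python
import numpy as np

# Illustrative 3x2 array (an assumption, not data from the question).
arr = np.array([[1, 2], [3, 4], [1, 2]], dtype=np.int64)

# One void element spans a full row: itemsize * number of columns bytes.
row_dtype = np.dtype((np.void, arr.dtype.itemsize * arr.shape[1]))
rows = np.ascontiguousarray(arr).view(row_dtype)

print(rows.shape)           # (3, 1): one opaque element per row
print(rows.dtype.itemsize)  # 16 bytes = 2 columns * 8 bytes each

# Rows with identical contents have identical raw bytes.
print(rows[0].tobytes() == rows[2].tobytes())
print(rows[0].tobytes() == rows[1].tobytes())
```

The call to np.ascontiguousarray matters because the view reinterprets memory as it is laid out; a non-contiguous slice would not have each row stored as one consecutive run of bytes.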
You can use the function like this:
In [56]: a = np.array([[1,3,4],[2,5,3],[2,4,6],[6,5,3]])
In [57]: b = np.array([[2,4,5],[2,4,6],[1,3,4]])
In [58]: in2d(b,a)
Out[58]: array([False, True, True], dtype=bool)
It returns a boolean array indicating which rows of b are in a. Or vice versa:
In [59]: in2d(a,b)
Out[59]: array([ True, False, True, False], dtype=bool)
Indexing a with this boolean array gives you exactly what you want:
In [60]: a[in2d(a,b),:]
Out[60]:
array([[1, 3, 4],
       [2, 4, 6]])
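As a sanity check, the same mask can be reproduced with plain broadcasting. This is only a sketch, assuming the arrays are small enough that an intermediate of shape (len(arr1), len(arr2), ncols) fits in memory; the helper name in2d_broadcast is hypothetical and not a NumPy function.

```python
import numpy as np

def in2d_broadcast(arr1, arr2):
    """Row-membership mask via broadcasting (illustrative helper, not a NumPy API)."""
    # Compare every row of arr1 with every row of arr2: shape (n1, n2, ncols).
    eq = arr1[:, None, :] == arr2[None, :, :]
    # A pair of rows matches when all columns agree; a row of arr1 is
    # "in" arr2 if it matches at least one row of arr2.
    return eq.all(axis=2).any(axis=1)

a = np.array([[1, 3, 4], [2, 5, 3], [2, 4, 6], [6, 5, 3]])
b = np.array([[2, 4, 5], [2, 4, 6], [1, 3, 4]])

mask = in2d_broadcast(a, b)
print(mask)     # same mask as in2d(a, b) above
print(a[mask])  # the rows of a that also occur in b
```

This version is easier to read but uses memory proportional to the product of the two row counts, so it is better suited to checking results than to large inputs.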
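Note that your solution (posted below) is incorrect and does not do what you think it does: for NumPy arrays, if v in a tests individual elements rather than whole rows, so it is true as soon as any single element of v matches the element in the same column of some row of a. The following comparison is therefore not quite fair; nevertheless, consider: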
def for_loop_and_compare(a,b):
    return np.array([v for v in b if v in a])
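To illustrate why if v in a is not a row-membership test, here is a minimal sketch with a made-up row v that shares only one value with a. The names loose and strict are chosen for this example only.

```python
import numpy as np

a = np.array([[1, 3, 4], [2, 5, 3]])
v = np.array([9, 9, 4])  # not a row of a, but its last entry matches a[0, 2]

# For NumPy arrays, `v in a` behaves like (a == v).any().
loose = v in a

# A strict row test requires all columns to match within a single row.
strict = bool((a == v).all(axis=1).any())

print(loose)   # True: one element matched
print(strict)  # False: no full row matched
```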
And the timings:
In [61]: a=np.random.randint(0,100,(10000,3))
In [62]: b=np.random.randint(0,100,(1000,3))
In [63]: %timeit for_loop_and_compare(a,b)
10 loops, best of 3: 79 ms per loop
In [64]: %timeit in2d(a,b)
100 loops, best of 3: 3.7 ms per loop