Suppose you have two 2D arrays A and B and want to find where each row of A occurs in B. How can this be done most efficiently with NumPy?
For example:
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [9, 10, 11]])
b = np.array([[4, 5, 6],
              [4, 3, 2],
              [1, 2, 3],
              [4, 8, 9]])
map = [[0, 2], [1, 0]]  # desired result: row 0 of a is at row index 2 of b, row 1 of a is at row index 0 of b
I know how to check whether a row of A is in B using in1d (see "test for membership in a 2d numpy array"), but that only tells me which rows match; it does not yield the index map.
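For reference, the membership check mentioned here can be sketched roughly as follows. This is only an illustration under some assumptions: both arrays share the same dtype and number of columns, `rows_as_void` is a hypothetical helper name, and `np.isin` is used as the newer counterpart of `in1d`. Each row is viewed as a single opaque element so that whole rows are compared at once.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [9, 10, 11]])
b = np.array([[4, 5, 6],
              [4, 3, 2],
              [1, 2, 3],
              [4, 8, 9]])

def rows_as_void(arr):
    # Hypothetical helper: view each row as one opaque (void) element.
    # Assumes a 2D array; makes the data contiguous before re-viewing it.
    arr = np.ascontiguousarray(arr)
    row_dtype = np.dtype((np.void, arr.dtype.itemsize * arr.shape[1]))
    return arr.view(row_dtype).ravel()

# Boolean mask over the rows of a: True where the row also occurs in b.
mask = np.isin(rows_as_void(a), rows_as_void(b))
print(mask)
```

The result is a boolean mask over the rows of a, which says that a row matches but not at which row index of b.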
The purpose of this map is to eventually merge the two arrays based on some of their columns.
Of course one could do this row by row, but that becomes very inefficient, since my arrays have shape (50 million, 20).
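To make the row-by-row baseline concrete, a minimal sketch might look like the following. The variable name `pairs` is made up for illustration, and only the first matching row of b is recorded for each row of a. Every iteration scans all of b, so the cost grows with the product of the two row counts.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [9, 10, 11]])
b = np.array([[4, 5, 6],
              [4, 3, 2],
              [1, 2, 3],
              [4, 8, 9]])

pairs = []  # hypothetical name for the [index in a, index in b] map
for i, row in enumerate(a):
    # Compare this row against every row of b; keep rows where all columns match.
    hits = np.flatnonzero((b == row).all(axis=1))
    if hits.size:
        pairs.append([i, int(hits[0])])  # record only the first match

print(pairs)
```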
An alternative would be to use the pandas merge function, but I'd like to do this using numpy only.