I have two numpy arrays 'a' and 'b'.

'a' has shape [30000, 2] and contains pairs of x,y coordinates. 'b' has shape [10000000, 3] and contains x,y,z coordinates.

Each x,y pair from 'a' occurs exactly once in 'b'. I want to efficiently extract the corresponding z coordinates from 'b'.

Here's a simple example...

a = np.array([[1,2], [3,4], [5,6], [8,9]]).T
b = np.array([[1,2,11], [1,3,12], [3,4,13], [4,5,14],[5,6,15], [6,7,16], [7,8,17], [8,9,18]]).T 

This should return the indices [0, 2, 4, 7] of the matching points in 'b', such that z = [11, 13, 15, 18].

Obviously this can be achieved with 2 for loops (YUCK!!!)

I'm sure this is a simple problem but it has me stumped.

What's the most efficient way to achieve this? (especially for larger datasets)

AloneTogether

1 Answer

You can transform your 2D array into a 1D view (see this answer), then use numpy.isin:

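import numpy as np

def view1D(a, b):
    # View each row of a and b as one opaque (void) element so that whole
    # rows can be compared at once. Both arrays must share a dtype and
    # a column count.
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

# The example arrays are transposed, so transpose back to one point per row
A, B = view1D(a.T, b[:2].T)

# Keep the points of b whose (x, y) pair occurs in a, then take the z column
b.T[np.isin(B, A)][:, 2]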
# array([11, 13, 15, 18])
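
Note that np.isin returns the matches in the order they appear in 'b', which happens to coincide with the order of 'a' in this example. If the z values are needed in the order of 'a', one possible variant is sketched below. It is only a sketch under assumptions: the helper name lookup_z is made up for illustration, and it expects one point per row (shapes [N, 2] and [M, 3] as described in the question, not the transposed example arrays). It labels every distinct x,y pair with np.unique(..., axis=0) and then maps each label back to a row of 'b'.

```python
import numpy as np

def lookup_z(a, b):
    """Hypothetical helper: z value from b for each (x, y) row of a, in a's order.

    Assumes a has shape (N, 2), b has shape (M, 3), and every pair in a
    occurs exactly once in b.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    m = len(b)

    # Give every distinct (x, y) pair an integer label shared by a and b.
    pairs = np.vstack((b[:, :2], a))
    uniq, inverse = np.unique(pairs, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # guard against NumPy versions that return a 2D inverse
    labels_b, labels_a = inverse[:m], inverse[m:]

    # Map each label to the row of b that carries it (-1 if the label is absent from b).
    row_of_label = np.full(len(uniq), -1, dtype=np.intp)
    row_of_label[labels_b] = np.arange(m)

    rows = row_of_label[labels_a]
    if (rows < 0).any():
        raise KeyError("some (x, y) pairs of a do not occur in b")
    return b[rows, 2]

a = np.array([[1, 2], [3, 4], [5, 6], [8, 9]])
b = np.array([[1, 2, 11], [1, 3, 12], [3, 4, 13], [4, 5, 14],
              [5, 6, 15], [6, 7, 16], [7, 8, 17], [8, 9, 18]])
print(lookup_z(a, b))  # [11 13 15 18]
```

This sorts all M + N pairs once, so it should scale roughly as O((M + N) log(M + N)). It has not been benchmarked against the void-view approach at the sizes given in the question.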
mozway