Suppose I have an n x 64 array and an m x 64 array. For each of the n rows, I want to find the best-matching row of the m x 64 array, as computed by the following:

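import numpy as np

list_best = []
for nn in range(n):
    ele_n = nx64_array[nn, :]
    ele_best = None
    best_diff = np.inf  # smallest L1 distance found so far for this row
    for mm in range(m):
        ele_m = mx64_array[mm, :]
        diff = np.sum(np.abs(ele_n - ele_m))  # L1 distance between the two rows
        if diff < best_diff:  # compare scalar distances, not a distance against a row
            best_diff = diff
            ele_best = ele_m
    list_best.append(ele_best)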

What I want to know is whether there is a NumPy-like (vectorized) way to accomplish this, since the for loop is quite slow.

Is there any way to do this faster? Thank you so much.

EJ Song
  • @bb1 Oh, so this tells me that I can adapt find_nearest() to each row of nx64_array. Am I understanding correctly? – EJ Song Feb 16 '22 at 01:31
  • @EJSong I am assuming that for each row of `nx64_array` you are trying to find the closest row of `mx64_array` (although this is not quite what your code is doing). If so, then you can iterate over rows of `nx64_array` and use `find_nearest()` at each iteration. Alternatively, you can use broadcasting and vectorization to compute an nxm array with distances of all pairs of rows of the two arrays, and find closest rows based on these results. The second approach requires more memory, so what is better will depend on how large `m` and `n` are. – bb1 Feb 16 '22 at 01:46
  • @bb1 I think that is the right direction for me. Can you tell me how to use broadcasting and vectorization to build the n x m array? All I care about is speed, and both m and n are in the range 200~400. – EJ Song Feb 16 '22 at 01:54
  • @EJSong I posted it as an answer. – bb1 Feb 16 '22 at 04:22

1 Answer

You can try the following. It produces an array of shape n x 64. Each row of this array is the row of mx64 that is closest (in the sum of absolute differences) to the row of nx64 with the same index:

import numpy as np

m = 10
n = 20

nx64 = np.random.rand(n, 64)
mx64 = np.random.rand(m, 64)

mx64[np.argmin(np.abs(nx64[:, None, :] - mx64[None, :, :]).sum(axis=-1), axis=1)]
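
To make the trade-off between the two approaches from the comments concrete, here is a minimal sketch rather than a definitive implementation. The function names `nearest_rows_broadcast` and `nearest_rows_rowwise` are hypothetical, introduced only for illustration, and the sizes (300 and 400 rows) are assumed from the 200~400 range mentioned in the comments. The first function is the one-liner above wrapped so that it returns row indices. The second iterates over the rows of the first array and only ever holds an m x 64 temporary.

```python
import numpy as np


def nearest_rows_broadcast(a, b):
    """For each row of a, return the index of the closest row of b (L1 distance)."""
    # (n, 1, d) - (1, m, d) broadcasts to (n, m, d); summing over the last
    # axis leaves an (n, m) matrix of pairwise L1 distances.
    dist = np.abs(a[:, None, :] - b[None, :, :]).sum(axis=-1)
    return dist.argmin(axis=1)


def nearest_rows_rowwise(a, b):
    """Same indices, computed one row of a at a time to keep memory low."""
    # b - row broadcasts to (m, d), so the largest temporary is (m, d).
    return np.array([np.abs(b - row).sum(axis=1).argmin() for row in a])


# Assumed example sizes, taken from the range mentioned in the comments.
rng = np.random.default_rng(0)
nx64 = rng.random((300, 64))
mx64 = rng.random((400, 64))

idx = nearest_rows_broadcast(nx64, mx64)  # index of the best row of mx64 for each row of nx64
best = mx64[idx]                          # the best-matching rows, shape (300, 64)

# Both approaches should select the same rows.
assert np.array_equal(idx, nearest_rows_rowwise(nx64, mx64))
```

With n = m = 400 and float64 data, the intermediate n x m x 64 array in the broadcast version takes about 82 MB, which is usually acceptable. For much larger inputs the row-wise variant may be the safer choice. If SciPy is available, `scipy.spatial.distance.cdist(nx64, mx64, metric="cityblock")` should give the same n x m distance matrix without the 3-D temporary.
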
bb1