I have a NumPy array holding 10,000 vectors with 3,000 elements each. I want to return the indices of the 10 closest pairs of rows, along with the distance between each pair. For example, if rows 5 and 7 are the closest pair with a Euclidean distance of 0.005, and rows 8 and 10 are the second closest with a distance of 0.0052, then I want to return [(8,10,.0052),(5,7,.005)]. The traditional for-loop method is very slow. Is there a quicker way to find the Euclidean nearest neighbors among large feature vectors (stored as a NumPy array)?
I'm doing the following:
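def all_pairs(M):  # function name is illustrative
    l = []
    for i in range(0, M.shape[0]):
        for j in range(0, M.shape[0]):
            # keep each unordered pair once (i > j already implies i != j)
            if i != j and i > j:
                l.append((i, j, euc(M[i], M[j])))
    return l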
Here euc is a function that calculates the Euclidean distance between two rows of the matrix using scipy. I then sort l and pull out the 10 closest distances.
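For comparison, here is a rough sketch of how the same computation might be vectorized. It is not a definitive solution. It assumes that SciPy's `pdist` is acceptable and that the condensed distance array fits in memory: for 10,000 rows that is 49,995,000 float64 values, roughly 400 MB. The function name `closest_pairs` and the parameter `k` are hypothetical, introduced only for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist


def closest_pairs(M, k=10):
    """Return up to k closest row pairs of M as (i, j, distance), nearest first.

    Hypothetical helper: the name and signature are assumptions, not from the question.
    """
    n = M.shape[0]
    if n < 2:
        return []

    # Condensed pairwise distances, ordered (0,1), (0,2), ..., (n-2,n-1).
    d = pdist(M, metric="euclidean")
    k = min(k, d.size)

    # Select the k smallest distances without sorting all of them,
    # then sort only those k.
    idx = np.argpartition(d, k - 1)[:k]
    idx = idx[np.argsort(d[idx])]

    # Map condensed indices back to (i, j) with i < j using exact integer math.
    # Row i starts at offset n*i - i*(i+1)/2 in the condensed array.
    rows = np.arange(n - 1, dtype=np.int64)
    starts = rows * n - rows * (rows + 1) // 2
    i = np.searchsorted(starts, idx, side="right") - 1
    j = idx - starts[i] + i + 1

    return [(int(a), int(b), float(d[q])) for a, b, q in zip(i, j, idx)]
```

Calling `closest_pairs(M)` on the array from the question would return ten tuples. Note that each pair is reported with `i < j`, whereas the loop above stores pairs with `i > j`; swap the first two elements if that ordering matters. If the condensed array is too large for memory, the same idea could be applied to blocks of rows while keeping a running list of the best candidates.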