I have a matrix of distance between 12000 atoms (pairwise euclidean distance). And I want to convert that to a list of nodes adjacency, with the ith element of the list being a list of nodes that are within a threshold distance. For example if I have three points :
(0,0) (1,0) (1,1)
I will have the matrix :
[[0. 1. 1.41421356]
[1. 0. 1. ]
[1.41421356 1. 0. ]]
Then all pairs that satisfy the condition; distance <= 1; will be:
[[0 0]
[0 1]
[1 0]
[1 1]
[1 2]
[2 1]
[2 2]]
Then finally the adjacency list will be:
[[0,1],[0,1,2],[1,2]]
Here is a code that work:
from scipy.spatial import distance
import numpy as np
def voisinage(xyz):
#xyz is a vector of positions in 3d space
# matrice de distance
dist = distance.cdist(xyz,xyz,'euclidean')
# extract i,j pairs where distance < threshold
paires = np.argwhere(dist<threshold)
# prepare the adjacency list
Vvoisinage = [[] for i in range(len(xyz))]
# fill the adjacency list
for p in paires:
Vvoisinage[p[0]].append(p[1])
This code runs in approximately 4 to 5 seconds for around 12100 points in 3D space. I want to make it as fast as possible because it needs to run for thousands of sets of 12100 points, and there are other calculations for each set. I have tried to use networkX but it's a lot slower than this method.
The section to optimize is the last one because it takes in average 2.7 seconds so half the computation time.
Also maybe there is a faster way of doing all of this.
Thanks