I have 6 lists storing x,y,z coordinates of two sets of positions (3 lists each). I want to calculate the distance between each point in both sets. I have written my own distance function but it is slow. One of my lists has about 1 million entries. I have tried cdist, but it produces a distance matrix and I do not understand what it means. Is there another inbuilt function that can do this?
-
Could you please show some sample input and output? – thefourtheye Nov 25 '13 at 04:54
-
And your distance function as well as exact problem you're solving... Distance function in Cartesian 3D space is quite simple: `sqrt((x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2)`, I'm afraid there's not much to optimize. – Anatoly Scherbakov Nov 25 '13 at 05:24
-
`One of my lists has about 1 million entries.` What about the other lists? If they have similar sizes, that would be about `10^6 * 10^6 = 10^12` pairs of points, and I'm afraid even built-in functions would be slow. – starrify Nov 25 '13 at 05:48
-
And I suggest you try writing a C module and calling it from Python, or just use C or C++ entirely to do this calculation. – starrify Nov 25 '13 at 05:49
-
Check these http://stackoverflow.com/questions/6430091/efficient-distance-calculation-between-n-points-and-a-reference-in-numpy-scipy http://stackoverflow.com/questions/1401712/calculate-euclidean-distance-with-numpy – Spike Nov 25 '13 at 05:52
-
I haven't actually used it, but people often mention `numpy` in similar questions. – XORcist Nov 25 '13 at 08:24
4 Answers
If possible, use the `numpy` module to handle this kind of thing. It is a lot more efficient than using regular Python lists.
I am interpreting your problem like this:
- You have two sets of points.
- Both sets have the same number of points (`N`).
- Point `k` in set 1 is related to point `k` in set 2. If each point is the coordinate of some object, I am interpreting it as set 1 containing the initial point and set 2 the point at some other time t.
- You want to find the distance `d(k) = dist(p1(k), p2(k))`, where `p1(k)` is point number `k` in set 1 and `p2(k)` is point number `k` in set 2.
Assuming that your 6 lists are `x1_coords`, `y1_coords`, `z1_coords` and `x2_coords`, `y2_coords`, `z2_coords` respectively, you can calculate the distances like this:
import numpy as np
p1 = np.array([x1_coords, y1_coords, z1_coords])
p2 = np.array([x2_coords, y2_coords, z2_coords])
squared_dist = np.sum((p1-p2)**2, axis=0)
dist = np.sqrt(squared_dist)
The distance between `p1(k)` and `p2(k)` is now stored in the numpy array as `dist[k]`.
As for speed: on my laptop with an "Intel(R) Core(TM) i7-3517U CPU @ 1.90GHz", the time to calculate the distances between two sets of points with N=1E6 is 45 ms.
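To make this concrete, here is a minimal sketch on three made-up point pairs (the coordinate values are hypothetical, chosen so the distances come out as whole numbers). It also builds the full pairwise matrix with broadcasting, which is the kind of result `cdist` returns: entry `[i, j]` is the distance from point `i` of set 1 to point `j` of set 2, so the paired distances sit on its diagonal.

```python
import numpy as np

# Hypothetical sample data: three points per set.
x1_coords, y1_coords, z1_coords = [0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]
x2_coords, y2_coords, z2_coords = [3.0, 1.0, 2.0], [4.0, 1.0, 2.0], [0.0, 2.0, 4.0]

p1 = np.array([x1_coords, y1_coords, z1_coords])  # shape (3, N): one column per point
p2 = np.array([x2_coords, y2_coords, z2_coords])

# Paired distances: point k of set 1 to point k of set 2.
dist = np.sqrt(np.sum((p1 - p2) ** 2, axis=0))

# Full distance matrix: every point of set 1 against every point of set 2.
diff = p1[:, :, None] - p2[:, None, :]        # shape (3, N, N)
full = np.sqrt(np.sum(diff ** 2, axis=0))     # shape (N, N)

print(dist)           # paired distances
print(np.diag(full))  # the same values, read off the matrix diagonal
```

The full matrix has `N * N` entries, so it is only practical for small sets. With a million points per set it would need about `10^12` values, which is why the paired calculation is preferable when only `d(k)` is wanted.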

-
This doesn't answer the original question. What if you do not have access to numpy? – RattleyCooper May 03 '16 at 02:28
-
If you do not have NumPy, you would do something like this: https://stackoverflow.com/a/51665185/3585557. You certainly would want to use a NumPy array if you could, because lists are not a great data structure for this type of calculation. – Steven C. Howell Mar 07 '19 at 13:53
If you are using `numpy` anyway, `np.linalg.norm` is another option.
Say you have one point `p0 = np.array([1,2,3])` and a second point `p1 = np.array([4,5,6])`. Then a quick way to find the distance between the two would be:
import numpy as np
p0 = np.array([1, 2, 3])
p1 = np.array([4, 5, 6])
dist = np.linalg.norm(p0 - p1)  # Euclidean distance between p0 and p1
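The same function can handle many point pairs at once through its `axis` argument. The following is a minimal sketch, assuming the points are stored column-wise in arrays of shape `(3, N)`; the names `set1` and `set2` and the sample values are hypothetical.

```python
import numpy as np

# Hypothetical data: each column is one point (x, y, z), so shape is (3, N).
set1 = np.array([[0.0, 1.0],
                 [0.0, 1.0],
                 [0.0, 1.0]])
set2 = np.array([[3.0, 1.0],
                 [4.0, 1.0],
                 [0.0, 2.0]])

# axis=0 takes the norm over the x, y, z components of each column,
# giving one distance per point pair.
dists = np.linalg.norm(set1 - set2, axis=0)
print(dists)
```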

# Use the distance function in Cartesian 3D space.
# Example:
import math

def distance(x1, y1, z1, x2, y2, z2):
    # Euclidean distance between (x1, y1, z1) and (x2, y2, z2)
    d = math.sqrt((x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2)
    return d
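For the six-list layout from the question, a function like this could be applied pair by pair with `zip`. This is only a sketch without NumPy; the list names follow the first answer and the sample values are made up. Expect it to be considerably slower than the array-based versions for a million points.

```python
import math

def distance(x1, y1, z1, x2, y2, z2):
    # Euclidean distance between (x1, y1, z1) and (x2, y2, z2)
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2)

# Hypothetical sample data: three points per set.
x1_coords, y1_coords, z1_coords = [0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]
x2_coords, y2_coords, z2_coords = [3.0, 1.0, 2.0], [4.0, 1.0, 2.0], [0.0, 2.0, 4.0]

# zip walks the six lists in step, yielding one (x1, y1, z1, x2, y2, z2) tuple per pair.
dists = [
    distance(*coords)
    for coords in zip(x1_coords, y1_coords, z1_coords,
                      x2_coords, y2_coords, z2_coords)
]
print(dists)
```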

-
You certainly would want to use a NumPy array if you could because lists are not a great data structure for this type of calculation. – karel Oct 31 '21 at 04:46
You can use `math.dist(A, B)` (available since Python 3.8), with `A` and `B` each being a sequence of coordinates for one point.
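A minimal sketch of how this might look for the six coordinate lists from the question, assuming Python 3.8 or later. The list names and sample values are hypothetical.

```python
import math

# Hypothetical sample data: three points per set.
x1_coords, y1_coords, z1_coords = [0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]
x2_coords, y2_coords, z2_coords = [3.0, 1.0, 2.0], [4.0, 1.0, 2.0], [0.0, 2.0, 4.0]

# Regroup the per-axis lists into one (x, y, z) tuple per point.
points_a = list(zip(x1_coords, y1_coords, z1_coords))
points_b = list(zip(x2_coords, y2_coords, z2_coords))

# math.dist expects two points of the same dimension.
dists = [math.dist(a, b) for a, b in zip(points_a, points_b)]
print(dists)
```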
