11

I have 6 lists storing x,y,z coordinates of two sets of positions (3 lists each). I want to calculate the distance between each point in both sets. I have written my own distance function but it is slow. One of my lists has about 1 million entries. I have tried cdist, but it produces a distance matrix and I do not understand what it means. Is there another inbuilt function that can do this?

BenMorel
Abhinav Kumar
  • Could you please show some sample input and output? – thefourtheye Nov 25 '13 at 04:54
  • And your distance function as well as exact problem you're solving... Distance function in Cartesian 3D space is quite simple: `sqrt((x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2)`, I'm afraid there's not much to optimize. – Anatoly Scherbakov Nov 25 '13 at 05:24
  • `One of my lists has about 1 million entries.` What about the other lists? If they have similar sizes, that would be about `10^6 * 10^6 = 10^12` pairs of points, and I'm afraid even built-in functions would be slow. – starrify Nov 25 '13 at 05:48
  • I suggest you try writing a C module and calling it from Python, or just use C or C++ entirely for this calculation. – starrify Nov 25 '13 at 05:49
  • Check these http://stackoverflow.com/questions/6430091/efficient-distance-calculation-between-n-points-and-a-reference-in-numpy-scipy http://stackoverflow.com/questions/1401712/calculate-euclidean-distance-with-numpy – Spike Nov 25 '13 at 05:52
  • I haven't actually used it, but People often mention 'numpy' in similar questions. – XORcist Nov 25 '13 at 08:24

4 Answers

15

If possible, use the numpy module to handle this kind of thing. It is much more efficient than regular Python lists.

I am interpreting your problem like this

  1. You have two sets of points
  2. Both sets have the same number of points (N)
  3. Point k in set 1 is related to point k in set 2. If each point is the coordinate of some object, I am interpreting it as set 1 containing the initial point and set 2 the point at some other time t.
  4. You want to find the distance d(k) = dist(p1(k), p2(k)) where p1(k) is point number k in set 1 and p2(k) is point number k in set 2.

Assuming that your six lists are x1_coords, y1_coords, z1_coords and x2_coords, y2_coords, z2_coords respectively, you can calculate the distances like this:

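import numpy as np

# Stack the coordinate lists into arrays of shape (3, N): one row per axis.
p1 = np.array([x1_coords, y1_coords, z1_coords])
p2 = np.array([x2_coords, y2_coords, z2_coords])

# Sum the squared differences over x, y, z (axis 0), then take the square root.
squared_dist = np.sum((p1-p2)**2, axis=0)
dist = np.sqrt(squared_dist)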

The distance between p1(k) and p2(k) is now stored in the numpy array as dist[k].

As for speed: on my laptop with an "Intel(R) Core(TM) i7-3517U CPU @ 1.90GHz", calculating the distances between two sets of points with N=1E6 takes 45 ms.
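
If the pairing assumption above does not hold and every point in one set has to be compared with every point in the other, the result is a matrix rather than a 1-D array. Here is a minimal sketch of that case using NumPy broadcasting. The arrays `a` and `b`, their sample values, and the (n_points, 3) layout are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical sample data: 2 points in set a, 3 points in set b,
# each stored with shape (n_points, 3), i.e. one row per point.
a = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0],
              [0.0, 3.0, 4.0],
              [1.0, 0.0, 0.0]])

# Broadcasting (2, 1, 3) against (1, 3, 3) gives a (2, 3, 3) array
# holding the coordinate differences for every (a[i], b[j]) pair.
diff = a[:, None, :] - b[None, :, :]

# Reduce over the last axis (x, y, z): entry [i, j] is the distance
# from point i of set a to point j of set b.
dist_matrix = np.sqrt(np.sum(diff**2, axis=-1))

print(dist_matrix.shape)   # (2, 3)
print(dist_matrix[0, 1])   # distance from a[0] to b[1]
```

scipy's cdist returns a matrix of this kind, which is probably the "distance matrix" the question refers to. As noted in the comments, two sets of about 10^6 points each would give about 10^12 entries, so the full matrix is only practical for small sets.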

Hannes Ovrén
  • This doesn't answer the original question. What if you do not have access to numpy? – RattleyCooper May 03 '16 at 02:28
  • If you do not have NumPy, you would do something like this: https://stackoverflow.com/a/51665185/3585557. You certainly would want to use a NumPy array if you could, because lists are not a great data structure for this type of calculation. – Steven C. Howell Mar 07 '19 at 13:53
10

np.linalg.norm is another NumPy-based option.

Say you have one point p0 = np.array([1,2,3]) and a second point p1 = np.array([4,5,6]). Then a concise way to find the distance between the two is:

import numpy as np

dist = np.linalg.norm(p0 - p1)
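
The same call can handle many point pairs at once through its `axis` argument. The following is a small sketch under the assumption that the points are stored as arrays of shape (3, N), one row per coordinate; the names `set1` and `set2` and the sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data, shape (3, N): rows are x, y, z; columns are points.
set1 = np.array([[0.0, 1.0],
                 [0.0, 2.0],
                 [0.0, 3.0]])
set2 = np.array([[3.0, 1.0],
                 [4.0, 2.0],
                 [0.0, 5.0]])

# axis=0 takes the norm over the x, y, z rows, giving one distance per column.
dist = np.linalg.norm(set1 - set2, axis=0)

print(dist)  # one entry per point pair
```

Here the first pair is (0, 0, 0) and (3, 4, 0), and the second is (1, 2, 3) and (1, 2, 5), so the two distances should come out as 5 and 2.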
azizbro
2
# Use the distance function in Cartesian 3D space:
# Example
import math

def distance(x1, y1, z1, x2, y2, z2):
    # Euclidean distance between (x1, y1, z1) and (x2, y2, z2)
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2)
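
To apply a function like this to the six coordinate lists from the question without NumPy, one option is to `zip` the lists together. This is only a sketch: the list names follow the accepted answer's naming, and the sample values are invented for illustration. For a million points this pure-Python loop will be considerably slower than the NumPy versions.

```python
import math

def distance(x1, y1, z1, x2, y2, z2):
    # Euclidean distance between (x1, y1, z1) and (x2, y2, z2)
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2)

# Hypothetical sample data: two points per set.
x1_coords, y1_coords, z1_coords = [0.0, 1.0], [0.0, 2.0], [0.0, 3.0]
x2_coords, y2_coords, z2_coords = [3.0, 1.0], [4.0, 2.0], [0.0, 5.0]

# zip yields one 6-tuple (x1, y1, z1, x2, y2, z2) per point pair,
# which is unpacked straight into the function's arguments.
distances = [
    distance(*coords)
    for coords in zip(x1_coords, y1_coords, z1_coords,
                      x2_coords, y2_coords, z2_coords)
]

print(distances)
```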
  • You certainly would want to use a NumPy array if you could because lists are not a great data structure for this type of calculation. – karel Oct 31 '21 at 04:46
2

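You can use math.dist(A, B) (available since Python 3.8), where A and B are each a sequence of coordinates describing a single point, e.g. (x, y, z). It returns the distance between those two points, so for lists of points you call it once per pair.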
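
A short sketch of how this might look, first for a single pair and then for the six separate coordinate lists from the question. The list names and sample values are assumptions for illustration, and `math.dist` requires Python 3.8 or newer.

```python
import math

# Single pair of points, each given as a sequence of coordinates.
a = (1, 2, 3)
b = (4, 5, 6)
d = math.dist(a, b)  # equals sqrt(3**2 + 3**2 + 3**2) = sqrt(27)

# Hypothetical sample data in the question's layout: three lists per set.
x1, y1, z1 = [0.0, 1.0], [0.0, 2.0], [0.0, 3.0]
x2, y2, z2 = [3.0, 1.0], [4.0, 2.0], [0.0, 5.0]

# The inner zips rebuild (x, y, z) tuples; the outer zip pairs point k
# of set 1 with point k of set 2.
distances = [
    math.dist(p, q)
    for p, q in zip(zip(x1, y1, z1), zip(x2, y2, z2))
]

print(d)
print(distances)
```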

Benjamin Merchin