2

I want to write a function that calculates the Euclidean distance between each coordinate in list_a and each coordinate in list_b, and produces an array of distances with a rows and b columns (where a is the number of coordinates in list_a and b is the number of coordinates in list_b).

NB: I do not want to use any libraries other than numpy, for simplicity.

list_a = np.array([[0,1], [2,2], [5,4], [3,6], [4,2]])
list_b = np.array([[0,1],[5,4]])

Running the function would generate:

>>> np.array([[0., 5.830951894845301],
              [2.23606797749979, 3.605551275463989],
              [5.830951894845301, 0.],
              [5.830951894845301, 2.8284271247461903],
              [4.123105625617661, 2.23606797749979]])

I have been trying to run the code below:

def run_euc(list_a,list_b):
    euc_1 = [np.subtract(list_a, list_b)]
    euc_2 = sum(sum([i**2 for i in euc_1]))
    return np.sqrt(euc_2)

But I am getting the following error:

ValueError: operands could not be broadcast together with shapes (5,2) (2,2)

Thank you.

N86808
  • Please share the entire error message. What's `euc_1 = [np.subtract(list_a, list_b)]` for? – AMC Feb 26 '20 at 17:33
  • Does this answer your question? [Minimum Euclidean distance between points in two different Numpy arrays, not within](https://stackoverflow.com/questions/1871536/minimum-euclidean-distance-between-points-in-two-different-numpy-arrays-not-wit) – AMC Feb 26 '20 at 17:33
  • How are you expecting your code to provide a Euclidean distance? You have one vector of five points, and another vector of two points. You can't subtract vectors of different lengths. – Prune Feb 26 '20 at 17:36

6 Answers

3

Here, you can just use np.linalg.norm to compute the Euclidean distance. Your bug occurs because np.subtract expects two inputs whose shapes match or can be broadcast together, and shapes (5,2) and (2,2) cannot.

import numpy as np

list_a = np.array([[0,1], [2,2], [5,4], [3,6], [4,2]])
list_b = np.array([[0,1],[5,4]])

def run_euc(list_a,list_b):
    return np.array([[ np.linalg.norm(i-j) for j in list_b] for i in list_a])

print(run_euc(list_a, list_b))

The code produces:

[[0.         5.83095189]
 [2.23606798 3.60555128]
 [5.83095189 0.        ]
 [5.83095189 2.82842712]
 [4.12310563 2.23606798]]
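
The same result can also be obtained without Python-level loops by letting NumPy broadcast the subtraction. The following is a minimal sketch, not a definitive implementation: the function name run_euc_broadcast is hypothetical, and it assumes both inputs are 2-D arrays with the same number of columns.

```python
import numpy as np

list_a = np.array([[0, 1], [2, 2], [5, 4], [3, 6], [4, 2]])
list_b = np.array([[0, 1], [5, 4]])

def run_euc_broadcast(a, b):
    # a[:, np.newaxis, :] has shape (5, 1, 2); b[np.newaxis, :, :] has shape (1, 2, 2).
    # Broadcasting stretches both to (5, 2, 2): one difference vector per (a, b) pair.
    diff = a[:, np.newaxis, :] - b[np.newaxis, :, :]
    # Take the norm over the last axis (the coordinates) to get a (5, 2) array.
    return np.linalg.norm(diff, axis=-1)

dists = run_euc_broadcast(list_a, list_b)
print(dists)
```

Inserting the extra axes is what avoids the original ValueError: shapes (5,1,2) and (1,2,2) are broadcast-compatible, whereas (5,2) and (2,2) are not.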
Siong Thye Goh
  • I really don't seem to understand the syntax for applying calculations across different sized arrays. Do you have a good online resource I could use? Trying to replicate your list comprehension across another similar problem with different array shapes and getting errors. Thank you – N86808 Feb 27 '20 at 17:37
  • When I first started writing list comprehensions, I would write the for loops explicitly; then, as practice, I would rewrite them in one line, with the outermost for loop remaining the outer loop. – Siong Thye Goh Feb 27 '20 at 17:49
  • Thank you for this; but is there any way to do this without using loops? As far as I understand, looping over arrays is computationally taxing. – Constantly confused Oct 20 '21 at 09:31
  • What about the SciPy solution provided by the other answer? – Siong Thye Goh Oct 20 '21 at 11:00
3

I wonder what is stopping you from using SciPy. Since you are already using NumPy, perhaps you can try SciPy, which is not a very heavy dependency.

Why?
It has many mathematical functions with efficient implementations to make good use of your computing power.

With that in mind, SciPy provides a distance_matrix function for exactly the purpose you've mentioned.

Concretely, it takes your list_a (an m x k matrix) and list_b (an n x k matrix) and outputs an m x n matrix of p-norm distances (p=2 for Euclidean) between each pair of points across the two matrices.

from scipy.spatial import distance_matrix
distances = distance_matrix(list_a, list_b)
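
To make the p-norm description concrete while staying within NumPy, here is a rough sketch of the same calculation. The helper minkowski_matrix is a hypothetical name introduced for illustration; it is not SciPy's implementation, and it assumes both inputs have the same number of columns.

```python
import numpy as np

def minkowski_matrix(x, y, p=2):
    # Hypothetical NumPy-only helper; illustrates the p-norm, not SciPy's code.
    x = np.asarray(x, dtype=float)   # shape (m, k)
    y = np.asarray(y, dtype=float)   # shape (n, k)
    # Absolute coordinate differences for every pair of points: shape (m, n, k).
    diff = np.abs(x[:, np.newaxis, :] - y[np.newaxis, :, :])
    # p-norm over the coordinate axis gives an (m, n) matrix.
    return np.sum(diff ** p, axis=-1) ** (1.0 / p)

list_a = np.array([[0, 1], [2, 2], [5, 4], [3, 6], [4, 2]])
list_b = np.array([[0, 1], [5, 4]])

euclid = minkowski_matrix(list_a, list_b, p=2)     # Euclidean distances
manhattan = minkowski_matrix(list_a, list_b, p=1)  # city-block distances
print(euclid)
print(manhattan)
```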
Furqan Rahamath
1

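I think this works for a single pair of points (call it once per pair to fill the full matrix):

  import numpy as np
  def distance(x,y):
      # Convert the inputs to arrays so the subtraction is element-wise
      x=np.array(x)
      y=np.array(y)
      # Sum of the squared coordinate differences
      p=np.sum((x-y)**2)
      # The square root gives the Euclidean distance
      d=np.sqrt(p)
      return d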
1

Using scipy, you could compute distance between each pair as follows:

import numpy as np
from scipy.spatial import distance
list_a = np.array([[0,1], [2,2], [5,4], [3,6], [4,2]])
list_b = np.array([[0,1],[5,4]])
dist = distance.cdist(list_a, list_b, 'euclidean')
print(dist)

Result:

array([[0.        , 5.83095189],
       [2.23606798, 3.60555128],
       [5.83095189, 0.        ],
       [5.83095189, 2.82842712],
       [4.12310563, 2.23606798]])
Farid Alijani
0

I hope this answers the question, but it is a duplicate of: Minimum Euclidean distance between points in two different Numpy arrays, not within

# Import package
import numpy as np

# Define unequal matrices
xy1 = np.array([[0,1], [2,2], [5,4], [3,6], [4,2]])
xy2 = np.array([[0,1],[5,4]])

P = np.add.outer(np.sum(xy1**2, axis=1), np.sum(xy2**2, axis=1))
N = np.dot(xy1, xy2.T)
dists = np.sqrt(P - 2*N)
print(dists)
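
The approach above relies on the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y. As a hedged sketch, the variant below writes that identity out and clips the squared distances at zero before taking the square root. The clipping is an added assumption, not part of the code above: with floating-point inputs, rounding can leave a value that should be exactly zero slightly negative, which would otherwise produce NaN.

```python
import numpy as np

xy1 = np.array([[0, 1], [2, 2], [5, 4], [3, 6], [4, 2]], dtype=float)
xy2 = np.array([[0, 1], [5, 4]], dtype=float)

# Squared norms of every point in each array.
sq1 = np.sum(xy1 ** 2, axis=1)   # shape (5,)
sq2 = np.sum(xy2 ** 2, axis=1)   # shape (2,)

# ||x||^2 + ||y||^2 - 2 x.y for every pair, shape (5, 2).
sq_dists = sq1[:, np.newaxis] + sq2[np.newaxis, :] - 2.0 * (xy1 @ xy2.T)

# Assumption: guard against tiny negative values caused by rounding.
dists_safe = np.sqrt(np.maximum(sq_dists, 0.0))

# Cross-check against the direct pairwise computation.
direct = np.linalg.norm(xy1[:, np.newaxis, :] - xy2[np.newaxis, :, :], axis=-1)
print(np.allclose(dists_safe, direct))
```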
tastatham
0

Another way you can do this is:

np.array(
[np.sqrt((list_a[:,1]-list_b[i,1])**2+(list_a[:,0]-list_b[i,0])**2) for i in range(len(list_b))]
).T

Output:

array([[0.        , 5.83095189],
       [2.23606798, 3.60555128],
       [5.83095189, 0.        ],
       [5.83095189, 2.82842712],
       [4.12310563, 2.23606798]])

This code could probably be written in a simpler and more efficient way, so if you find anything that could be improved, please let me know in the comments.
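
One possible simplification, sketched here for the 2-D case only, is to broadcast the x and y differences and pass them to np.hypot, which removes both the list comprehension and the transpose. The variable names dx, dy, and dists_2d are illustrative.

```python
import numpy as np

list_a = np.array([[0, 1], [2, 2], [5, 4], [3, 6], [4, 2]])
list_b = np.array([[0, 1], [5, 4]])

# list_a[:, [0]] keeps a column shape of (5, 1); list_b[:, 0] has shape (2,).
# Broadcasting the subtraction yields a (5, 2) array of x differences.
dx = list_a[:, [0]] - list_b[:, 0]
# Same for the y coordinates.
dy = list_a[:, [1]] - list_b[:, 1]

# np.hypot computes sqrt(dx**2 + dy**2) element-wise.
dists_2d = np.hypot(dx, dy)
print(dists_2d)
```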

Shubham Shaswat