I have two NumPy arrays, X and Y, in which each row represents a point vector.
I would like to find the squared Euclidean distance (call it 'dist') between each point in X and each point in Y.
I would like the output to be a matrix D where D(i,j) is dist(X(i), Y(j)).

I have the following Python code, based on: http://nonconditional.com/2014/04/on-the-trick-for-computing-the-squared-euclidian-distances-between-two-sets-of-vectors/

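import numpy as np

def get_sq_distances(X, Y):
    # Squared norms of the rows of X, repeated across columns: shape (n, m)
    a = np.sum(np.square(X), axis=1, keepdims=True)
    b = np.ones((1, Y.shape[0]))
    c = a.dot(b)
    # Add the squared norms of the rows of Y, repeated down the rows
    a = np.ones((X.shape[0], 1))
    b = np.sum(np.square(Y), axis=1, keepdims=True).T
    c += a.dot(b)
    # Subtract twice the pairwise dot products
    c -= 2 * X.dot(Y.T)
    return c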
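
For reference, the function above relies on expanding the squared norm. Writing x_i for row i of X and y_j for row j of Y (notation introduced here, not taken from the linked post), the identity is roughly:

```latex
\[
D_{ij} = \lVert x_i - y_j \rVert^2
       = \lVert x_i \rVert^2 + \lVert y_j \rVert^2 - 2\, x_i \cdot y_j
\]
```

The two products with `np.ones` fill in the first two terms for every (i, j) pair, and `X.dot(Y.T)` supplies all the dot products at once.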

I'm trying to avoid loops (should I?) and use matrix multiplication to make the computation fast.

But I get a "Memory Error" on large arrays. Is there a better way to do this?

blackraven
member555
  • @cel that's only between 2 points – member555 Nov 17 '15 at 15:26
  • @cel that's nice, but it's between all points in one matrix. I have two matrices. Also, I don't need the distances between points from the same matrix – member555 Nov 17 '15 at 15:33
  • You can use cel's info if you concatenate X and Y, although that will not be efficient since you will be computing X-to-X and Y-to-Y distances along the way. – Mad Physicist Nov 17 '15 at 15:43
  • Almost an exact duplicate of http://stackoverflow.com/q/1871536/1461210 (albeit for squared euclidean distance) – ali_m Nov 17 '15 at 16:16
  • I cleaned up my attempts. They were not really helpful. – cel Nov 17 '15 at 16:22

2 Answers

SciPy has the cdist function, which does exactly what you want:

from scipy.spatial import distance
distance.cdist(X, Y, 'sqeuclidean')

The SciPy documentation for cdist has some good examples.
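
Note that cdist still returns the full n-by-m array, so if the "Memory Error" comes from the size of the result itself, a different function alone may not help. One possible workaround, sketched here with plain NumPy, is to process X in row blocks and reduce each block before moving on. The helper name `iter_sq_distance_blocks` and the `block_size` parameter are hypothetical, and the nearest-neighbour reduction is just an assumed use case.

```python
import numpy as np

def iter_sq_distance_blocks(X, Y, block_size=1024):
    """Yield (start, D_block), where D_block[i, j] is the squared distance
    between X[start + i] and Y[j]. Hypothetical helper, not from the question."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    y_sq = np.sum(Y * Y, axis=1)[np.newaxis, :]        # shape (1, m)
    for start in range(0, X.shape[0], block_size):
        Xb = X[start:start + block_size]
        x_sq = np.sum(Xb * Xb, axis=1)[:, np.newaxis]  # shape (k, 1)
        D = x_sq + y_sq - 2.0 * Xb.dot(Y.T)
        # Rounding can leave tiny negative values; clip them to zero
        np.maximum(D, 0.0, out=D)
        yield start, D

# Assumed use case: index of the nearest point in Y for each point in X,
# computed without ever holding the full (n, m) matrix
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8))
Y = rng.normal(size=(300, 8))
nearest = np.empty(X.shape[0], dtype=int)
for start, D in iter_sq_distance_blocks(X, Y, block_size=1000):
    nearest[start:start + D.shape[0]] = np.argmin(D, axis=1)
```

Peak memory is then governed by block_size times the number of rows in Y rather than by the full product of the two row counts. The loop runs once per block, not once per point, so most of the work still happens inside the matrix multiplication.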

Mad Physicist

Subtract the lists, square the result, then sum.

import numpy as np

def get_sq_distances(a, b):
    # Squared Euclidean distance between two single points a and b
    return np.sum(np.square(np.subtract(a, b)))

print(get_sq_distances([5, 7, 9], [4, 5, 6]))
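
As written, this handles a single pair of points. The same subtract, square, sum idea can be extended to every pair of rows with broadcasting, roughly as below. The name `pairwise_sq_distances` is made up for this sketch. It builds an intermediate array of shape (n, m, d), so it is likely to use more memory than the matrix-multiplication approach in the question.

```python
import numpy as np

def pairwise_sq_distances(X, Y):
    # Hypothetical helper: D[i, j] = squared distance between X[i] and Y[j]
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    diff = X[:, np.newaxis, :] - Y[np.newaxis, :, :]  # shape (n, m, d)
    return np.sum(np.square(diff), axis=2)            # shape (n, m)

D = pairwise_sq_distances([[5, 7, 9], [0, 0, 0]], [[4, 5, 6]])
print(D)  # squared distances: 14 for the first row, 77 for the second
```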