1

I have 3 huge NumPy arrays, and I want to build a function that computes the pairwise Euclidean distance from the points of one array to the points of the second and third arrays.

For the sake of simplicity, suppose I have these 3 arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

c = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

I have tried this:

def correlation(x, y, t):
    from math import sqrt

    for a,b, in zip(x,y,t):
        distance = sqrt((x[a]-x[b])**2 + (y[a]-y[b])**2 + (t[a]-t[b])**2 )
    return distance

But this code throws an error: ValueError: too many values to unpack (expected 2)

How can I correctly implement this function using NumPy or base Python?

Thanks in advance

Miguel 2488
  • Hi Miguel, are you aware that even if the syntax error is eliminated, your distance will always be 0, since x[a]-x[a]=0, y[b]-y[b]=0 and t[c]-t[c]=0? So I suggest rewriting the function def. – zabop Feb 25 '19 at 10:22
  • Hi @zabop, thank you for your suggestion. I edited the function; I think it makes more sense now – Miguel 2488 Feb 25 '19 at 10:25
  • Would you clarify what you want x[a] and x[b] to represent? The value inside the [] must be an index into the array x, but in the current form of the function it is not. – zabop Feb 25 '19 at 10:32
  • Well, it's just the application of the Euclidean distance formula. It should be `sqrt((x sub2 - xsub1)**2 + (ysub2-ysub1)**2)` – Miguel 2488 Feb 25 '19 at 11:17
  • And what are sub2 and sub1? – zabop Feb 25 '19 at 11:19
  • For a given point in one of these 3 matrices, I want to calculate the distance between that point and the rest of the points in the same matrix. Once I have all the distances between, for instance, point 1 and points 1 to n, I want to take the next point, say point 2, and calculate the distances between point 2 and the n points of the matrix. I want to do this with the three matrices. I don't know if it's clearer now; you can ask me if you still have doubts – Miguel 2488 Feb 25 '19 at 11:31

2 Answers

1

First we define a function that computes the Euclidean distance between corresponding rows of two arrays. If one argument is a single row, it is broadcast against every row of the other.

def pairwise_distance(f, s, keepdims=False):
    return np.sqrt(np.sum((f-s)**2, axis=1, keepdims=keepdims))
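
To see what the broadcasting does here, a minimal sketch (assuming the array a from the question) passes a single row as f and the whole matrix as s:

```python
import numpy as np

def pairwise_distance(f, s, keepdims=False):
    # Sum the squared differences across each row, then take the square root
    return np.sqrt(np.sum((f - s) ** 2, axis=1, keepdims=keepdims))

a = np.array([[1.64, 0.001, 1.56, 0.1],
              [1.656, 1.21, 0.32, 0.0001],
              [1.0002, 0.0003, 1.111, 0.0003],
              [0.223, 0.6665, 1.2221, 1.659]])

# a[0] has shape (4,) and a has shape (4, 4), so a[0] is broadcast
# across every row of a before the differences are taken.
d0 = pairwise_distance(a[0], a)
print(d0.shape)  # (4,)
print(d0[0])     # 0.0, the distance from the first row to itself
```

The result holds one distance per row of s, which is the shape needed to fill one row of a full distance matrix.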

Second, we define a function that calculates the distances between every pair of rows of the same matrix:

def all_distances(c):
    n = c.shape[0]
    # one row and one column per point, so the result is (n, n)
    res = np.empty(shape=(n, n), dtype=float)
    for row in range(n):
        # broadcasting: c[row] is compared against every row of c
        res[row, :] = pairwise_distance(c[row], c)
    return res

Now we are done:

row_distances = all_distances(a)       # distances between the rows of a
column_distances = all_distances(a.T)  # distances between the columns of a
row_distances[0, 2]  # distance between the first and third rows
row_distances[1, 3]  # distance between the second and fourth rows
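
If the explicit Python loop becomes a bottleneck, the same matrix can probably be built with broadcasting alone. This is only a sketch: all_distances_vectorized is a hypothetical name that does not appear elsewhere in this answer, b and c are copies of a as in the question, and the intermediate array has shape (n, n, d), so it may not fit in memory for very large inputs.

```python
import numpy as np

def all_distances_vectorized(points):
    # points[:, None, :] has shape (n, 1, d) and points[None, :, :] has
    # shape (1, n, d); subtracting them broadcasts to shape (n, n, d).
    diff = points[:, None, :] - points[None, :, :]
    # Collapse the coordinate axis to get an (n, n) distance matrix
    return np.sqrt((diff ** 2).sum(axis=-1))

a = np.array([[1.64, 0.001, 1.56, 0.1],
              [1.656, 1.21, 0.32, 0.0001],
              [1.0002, 0.0003, 1.111, 0.0003],
              [0.223, 0.6665, 1.2221, 1.659]])
b = a.copy()
c = a.copy()

# One distance matrix per input array, as asked for in the question
distance_matrices = {name: all_distances_vectorized(m)
                     for name, m in {"a": a, "b": b, "c": c}.items()}

d = distance_matrices["a"]
print(d.shape)  # (4, 4)
```

Entry d[i, j] is the distance between point i and point j, so the matrix is symmetric with zeros on the diagonal. For large inputs, scipy.spatial.distance.cdist(a, a) should give the same matrix without the (n, n, d) intermediate, assuming SciPy is available.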
Redone R
  • Hi @Redone R, thank you for your answer! What I want is to create a matrix that takes all the points of the initial matrix as rows, and also as columns, where the values that fill the matrix are the distances between one point and another. This means I need to calculate all the distances between all the points. For instance: the distance between the first point and the second, between the first and the third, the first and the fourth, then the second with the first, – Miguel 2488 Mar 01 '19 at 11:00
  • the second with the third, the second with the fourth, and so on until I have the distance from each point in the matrix to every other point in the matrix – Miguel 2488 Mar 01 '19 at 11:00
  • Please take a look again – Redone R Mar 01 '19 at 20:53
0

Start with two arrays:

a = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

b = np.array([[1.64,0.001,1.56,0.1],
              [1.656,1.21,0.32,0.0001],
              [1.0002,0.0003,1.111,0.0003],
              [0.223,0.6665,1.2221,1.659]])

To calculate the distance between elements of these arrays you can do:

pairwise_dist_between_a_and_b=[(each**2+b[index]**2)**0.5 for index, each in enumerate(a)]

By doing so you get pairwise_dist_between_a_and_b:

[array([2.31931024e+00, 1.41421356e-03, 2.20617316e+00, 1.41421356e-01]),
 array([2.34193766e+00, 1.71119841e+00, 4.52548340e-01, 1.41421356e-04]),
 array([1.41449641e+00, 4.24264069e-04, 1.57119127e+00, 4.24264069e-04]),
 array([0.31536962, 0.94257334, 1.72831039, 2.3461803 ])]

You can use the same list comprehension for the first and third array.
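
Note that the expression above evaluates sqrt(a**2 + b**2) element by element, which is why each result is still a length-4 array. If what is wanted is a single Euclidean distance per pair of corresponding rows, one possible sketch takes the norm of the row differences instead. The names dist_ab and dist_ac are only illustrative, and c is assumed to be the third array from the question:

```python
import numpy as np

a = np.array([[1.64, 0.001, 1.56, 0.1],
              [1.656, 1.21, 0.32, 0.0001],
              [1.0002, 0.0003, 1.111, 0.0003],
              [0.223, 0.6665, 1.2221, 1.659]])
b = a.copy()
c = a.copy()

# Euclidean distance between row i of a and row i of the other array:
# subtract row by row, then take the 2-norm along each row.
dist_ab = np.linalg.norm(a - b, axis=1)
dist_ac = np.linalg.norm(a - c, axis=1)

print(dist_ab)  # [0. 0. 0. 0.] because a and b hold identical rows here
```

With the example data every distance is zero, since the three arrays are identical. With different arrays each entry would be the distance between the matching rows.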

zabop
  • Hi, and how can I achieve this using the 3 matrices? Do I have to calculate the distance from the resulting distance matrix to the 3rd one? Thank you – Miguel 2488 Feb 25 '19 at 10:26