1

I'm writing a simple program to compute the Euclidean distances between multiple lists using Python. This is the code I have so far:

import math
euclidean = 0
euclidean_list = []
euclidean_list_complete = []

test1 = [[0.0, 0.0, 0.0, 152.0, 12.29], [0.0, 0.0, 0.357, 245.0, 10.4], [0.0, 0.0, 0.10, 200.0, 11.0]]

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

for i in range(len(test2)):
    for j in range(len(test1)):
        for k in range(len(test1[0])):
            euclidean += pow((test2[i][k]-test1[j][k]),2)

        euclidean_list.append(math.sqrt(euclidean))
        euclidean = 0

    euclidean_list_complete.append(euclidean_list)


print euclidean_list_complete

My problem with this code is that it doesn't print the output I want. The output should be [[80.0023, 173.018, 128.014], [72.006, 165.002, 120.000]]

but instead, it prints

[[80.00232559119766, 173.01843095173416, 128.01413984400315, 72.00680592832875, 165.0028407300917, 120.00041666594329], [80.00232559119766, 173.01843095173416, 128.01413984400315, 72.00680592832875, 165.0028407300917, 120.00041666594329]]

I'm guessing it has something to do with the loop. What should I do to fix it? By the way, since this is for study purposes, I don't want to use numpy or scipy.

If it's unclear: I want to calculate the distance from each list in test2 to each list in test1.

Iqbal Pratama
  • 1
    It's because `dist(a, b) = dist(b, a)`. The easiest way to remove the redundant computations is to loop over only half the items. – Mateen Ulhaq Jun 01 '18 at 06:44
  • What @MateenUlhaq says is correct. You can find these things by stepping through the code with a debugger, if you have one. Or by tracing all the steps by hand. It's labor-intensive but can really help you learn. Anyway, good luck with your studies! – S.L. Barth is on codidact.com Jun 01 '18 at 06:55
  • I'm trying to understand the question. Say `test1 has [a,b,c]` and `test2 has [c,d]`: which points are you using to calculate the distance? – letmecheck Jun 01 '18 at 06:56
  • @S.L.Barth I tried to visualize it using a visualizer tool from a certain website, and I got it right until the 1st iteration of i. But then I realized the remaining values would also go into the euclidean_list list on the 2nd iteration. In that case, shouldn't it print [[80.00232559119766, 173.01843095173416, 128.01413984400315], [80.00232559119766, 173.01843095173416, 128.01413984400315, 72.00680592832875, 165.0028407300917, 120.00041666594329]] ? – Iqbal Pratama Jun 01 '18 at 07:01
  • @MohanBabu my bad, I should've written the question more precisely. Let test1 be [a, b, c] and test2 be [d, e]. I want to calculate the distance between d to a,b,c and e to a,b,c – Iqbal Pratama Jun 01 '18 at 07:04
  • @MateenUlhaq what exactly do you mean by looping over only half the items? I'm sorry I didn't understand it completely – Iqbal Pratama Jun 01 '18 at 07:07
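The follow-up in the comments (why both rows show all six values rather than three and then six) comes down to list aliasing: `append` stores a reference to the list, not a copy, so every row of the outer list is the same object. The following is a minimal sketch with made-up numbers rather than the distance data; the names `shared`, `rows`, `fresh` and `rows_fixed` are illustrative only.

```python
# One list created outside the loops and appended twice (the pattern in the question).
shared = []
rows = []
for start in (0, 10):
    for offset in range(3):
        shared.append(start + offset)
    rows.append(shared)  # stores a reference to the same list, not a snapshot

print(rows)                # [[0, 1, 2, 10, 11, 12], [0, 1, 2, 10, 11, 12]]
print(rows[0] is rows[1])  # True: both rows are one and the same object

# Creating a new list on every outer iteration keeps the rows separate.
rows_fixed = []
for start in (0, 10):
    fresh = []
    for offset in range(3):
        fresh.append(start + offset)
    rows_fixed.append(fresh)

print(rows_fixed)          # [[0, 1, 2], [10, 11, 12]]
```

Because the outer list is only printed after both iterations have finished, the first row already contains the values appended during the second pass.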

4 Answers

2

Not sure what you are trying to achieve for 3 vectors, but for two the code can be much, much simpler:

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

def distance(list1, list2):
    """Distance between two vectors."""
    squares = [(p-q) ** 2 for p, q in zip(list1, list2)]
    return sum(squares) ** .5

d2 = distance(test2[0], test2[1])  

With numpy it is an even shorter statement.

PS. Python 3 recommended.
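If avoiding numpy is a requirement, the standard library may already be enough: assuming Python 3.8 or newer, `math.dist` computes the Euclidean distance between two points of equal dimension. The sketch below compares it with the hand-written `distance` function from this answer; `d_manual` and `d_builtin` are illustrative names.

```python
import math

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

def distance(list1, list2):
    """Distance between two vectors (same function as in the answer)."""
    squares = [(p - q) ** 2 for p, q in zip(list1, list2)]
    return sum(squares) ** .5

d_manual = distance(test2[0], test2[1])
d_builtin = math.dist(test2[0], test2[1])  # assumption: Python 3.8+

# The two results should agree up to floating-point rounding.
print(d_manual, d_builtin)
print(math.isclose(d_manual, d_builtin))
```

For learning purposes the hand-written version shows the formula more explicitly; `math.dist` is mainly a convenience once the idea is clear.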

Evgeny
2

The question has partly been answered by @Evgeny. The answer the OP posted to his own question is an example of how not to write Python code. Here is a shorter, faster and more readable solution, assuming test1 and test2 are lists as in the question:

def euclidean(v1, v2):
    return sum((p-q)**2 for p, q in zip(v1, v2)) ** .5

d2 = []
for i in test2:
    foo = [euclidean(i, j) for j in test1]
    d2.append(foo)


print(d2)
#[[80.00232559119766, 173.01843095173416, 128.01413984400315],
# [72.00680592832875, 165.0028407300917, 120.00041666594329]]
MaxPowers
  • I do realize that my own code is not good which is why I said I'm doing it for studying purposes. But this answer is very good and very helpful. Thanks! – Iqbal Pratama Jun 01 '18 at 10:57
  • @MaxPowers - from your code I finally understand the OP's intent: distances between two groups of vectors – Evgeny Jun 01 '18 at 11:24
  • Since we are on the path of improvements, a list comprehension can also replace the loop for computing pairwise distances: ```def group_distance(vector_list1, vector_list2): return [[euclidean(v1, v2) for v2 in vector_list2] for v1 in vector_list1]``` – Evgeny Jun 01 '18 at 11:33
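The list-comprehension variant from the last comment can be written out as a runnable sketch. It reuses `euclidean` from this answer and the data from the question; the rounding to four decimals and the name `distances` are only for display and illustration.

```python
def euclidean(v1, v2):
    """Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(v1, v2)) ** .5

def group_distance(vector_list1, vector_list2):
    """One row per vector in vector_list1, one column per vector in vector_list2."""
    return [[euclidean(v1, v2) for v2 in vector_list2] for v1 in vector_list1]

test1 = [[0.0, 0.0, 0.0, 152.0, 12.29], [0.0, 0.0, 0.357, 245.0, 10.4], [0.0, 0.0, 0.10, 200.0, 11.0]]
test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

# test2 goes first so that each row holds the distances from one test2 vector.
distances = group_distance(test2, test1)
for row in distances:
    print([round(d, 4) for d in row])
```

Note the argument order: passing `test2` first yields two rows of three distances, which matches the layout the question asks for.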
0
test1 = [[0.0, 0.0, 0.0, 152.0, 12.29], [0.0, 0.0, 0.357, 245.0, 10.4], [0.0, 0.0, 0.10, 200.0, 11.0]]

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

final_list = []

for a in test2:
    temp = [] #temporary list
    for b in test1:
        dis = sum([pow(a[i] - b[i], 2) for i in range(len(a))])
        temp.append(round(pow(dis, 0.5),4))

    final_list.append(temp)
print(final_list)
letmecheck
-2

I got it: the trick is to create the euclidean list inside the first for loop, and then delete it after appending it to the complete euclidean list.

import math
euclidean = 0

euclidean_list_complete = []

test1 = [[0.0, 0.0, 0.0, 152.0, 12.29], [0.0, 0.0, 0.357, 245.0, 10.4], [0.0, 0.0, 0.10, 200.0, 11.0]]

test2 = [[0.0, 0.0, 0.0, 72.0, 12.9], [0.0, 0.0, 0.0, 80.0, 11.3]]

for i in range(len(test2)):
    euclidean_list = []
    for j in range(len(test1)):
        for k in range(len(test1[0])):
            euclidean += pow((test2[i][k]-test1[j][k]),2)      
        euclidean_list.append(math.sqrt(euclidean))
        euclidean = 0
        euclidean_list.sort(reverse=True)
    euclidean_list_complete.append(euclidean_list)
    del euclidean_list

print euclidean_list_complete
Iqbal Pratama