
I am working with the Minkowski distance, which is defined by:

$$D(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$$

I use a for loop to calculate it as follows:

import numpy as np
A = np.random.randint(5, size=(10, 5))
B = [1, 3, 5, 2, 4]
for i in range(10):
    dist = (sum((abs(A[i]-B))**5))**(1/5) # I set p=5 in this case
    print("Distances: ", dist)

Is there any way to avoid this loop using NumPy?

Mad Physicist
Mass17
  • You should use `scipy.spatial.distance.cdist`. It supports the Minkowski metric out of the box. – Andras Deak -- Слава Україні Oct 30 '18 at 14:13
  • Possible duplicate of [Efficient distance calculation between N points and a reference in numpy/scipy](https://stackoverflow.com/questions/6430091/efficient-distance-calculation-between-n-points-and-a-reference-in-numpy-scipy) – Georgy Oct 30 '18 at 14:25
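
As a rough illustration of the `cdist` suggestion in the comment above, here is a minimal sketch. It assumes SciPy is installed; the seed and the variable names `dist_cdist` and `dist_manual` are chosen here for illustration and are not from the original post.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)       # seed chosen only for reproducibility
A = rng.integers(5, size=(10, 5))    # 10 points in 5 dimensions
B = np.array([1, 3, 5, 2, 4])        # single reference point

# cdist expects two 2-D arrays, so B is reshaped to (1, 5).
# The result has shape (10, 1); ravel() flattens it to (10,).
dist_cdist = cdist(A, B.reshape(1, -1), metric="minkowski", p=5).ravel()

# Cross-check against the explicit formula from the question (p = 5).
dist_manual = (np.abs(A - B) ** 5).sum(axis=1) ** (1 / 5)
print(np.allclose(dist_cdist, dist_manual))
```

`cdist` is mainly useful when there are several reference points, since it returns the full matrix of pairwise distances; for a single reference point it is probably more machinery than needed.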

1 Answer


You can use broadcasting:

import numpy as np
np.random.seed(42)

A = np.random.randint(5, size=(10, 5))
B = [1, 3, 5, 2, 4]


result = (np.abs(A - B)**5).sum(axis=1)**(1/5)
print(result)


for i in range(10):
    dist = (sum((abs(A[i]-B))**5))**(1/5) # I set p=5 in this case
    print("Distances: ", dist)

Output:

[3.14564815 3.00246508 2.04767251 2.02439746 4.04953891 4.00312013
 2.49663093 3.49301675 3.53370523 2.04767251]
Distances:  3.1456481457393184
Distances:  3.0024650813881837
Distances:  2.0476725110792193
Distances:  2.024397458499885
Distances:  4.049538907295691
Distances:  4.003120128600393
Distances:  2.496630931732087
Distances:  3.4930167541811468
Distances:  3.5337052340491883
Distances:  2.0476725110792193
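
If you prefer not to spell out the formula, one possible alternative is `np.linalg.norm`, which computes the same p-norm along an axis when given `ord=p`. The sketch below is only an illustration: the helper name `minkowski_to_reference` is hypothetical, and it assumes `p >= 1` and a single reference point.

```python
import numpy as np

def minkowski_to_reference(points, ref, p):
    """Minkowski distance of order p from each row of `points` to `ref`.

    Hypothetical helper for illustration; assumes p >= 1.
    """
    # Broadcasting subtracts the (n_dims,) reference from every row.
    diff = np.asarray(points, dtype=float) - np.asarray(ref, dtype=float)
    # With axis=1 each row is treated as a vector, and ord=p computes
    # sum(|x|**p) ** (1/p) for that row.
    return np.linalg.norm(diff, ord=p, axis=1)

np.random.seed(42)
A = np.random.randint(5, size=(10, 5))
B = [1, 3, 5, 2, 4]

result = minkowski_to_reference(A, B, p=5)

# Same loop as in the question, collected into an array for comparison.
loop = np.array([(sum(abs(A[i] - B) ** 5)) ** (1 / 5) for i in range(10)])
print(np.allclose(result, loop))
```

Both this and the broadcasting expression avoid the Python-level loop; which one reads better is largely a matter of taste.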
Dani Mesejo