8

I am currently using SciPy to calculate the Euclidean distance:

dis = scipy.spatial.distance.euclidean(A,B)

where A and B are 5-dimensional bit vectors. This works fine, but if I add a weight for each dimension, is it still possible to use SciPy?

What I have now: sqrt((a1-b1)^2 + (a2-b2)^2 +...+ (a5-b5)^2)

What I want: sqrt(w1(a1-b1)^2 + w2(a2-b2)^2 +...+ w5(a5-b5)^2) using scipy or numpy or any other efficient way to do this.

Thanks

Maggie

4 Answers

11

The suggestion of writing your own weighted L2 norm is a good one, but the calculation provided in this answer is incorrect. If the intention is to calculate

sqrt(sum_i w_i (a_i - b_i)^2)

then this should do the job:

import numpy as np

def weightedL2(a, b, w):
    q = a - b  # per-dimension difference
    return np.sqrt((w * q * q).sum())  # weights are applied once, not squared
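As a quick sanity check, the function above can be called on a small example. The vectors and weights below are made up purely for illustration; they are not from the question.

```python
import numpy as np

def weightedL2(a, b, w):
    q = a - b
    return np.sqrt((w * q * q).sum())

# Hypothetical 5-dimensional bit vectors and weights (illustration only).
a = np.array([1, 0, 1, 1, 0])
b = np.array([0, 0, 1, 0, 1])
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# The vectors differ in dimensions 1, 4 and 5, so the sum is 1 + 4 + 5 = 10.
d = weightedL2(a, b, w)
print(d)  # sqrt(10)
```

Note that the bit vectors here are integer arrays. If they were stored with a boolean dtype, `a - b` would raise an error in NumPy, so converting with `astype(int)` first would be needed.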
talonmies
1

If you want to keep using the SciPy function, you can pre-process the vectors like this:

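import numpy as np
import scipy.spatial.distance

def weighted_euclidean(a, b, w):
    # Scale each coordinate by sqrt(w) so the plain distance becomes the weighted one.
    A = a*np.sqrt(w)
    B = b*np.sqrt(w)
    return scipy.spatial.distance.euclidean(A, B)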

However, it looks slower than

def weightedL2(a, b, w):
    q = a-b
    return np.sqrt((w*q*q).sum())
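The pre-processing trick works because (sqrt(w_i)*a_i - sqrt(w_i)*b_i)^2 = w_i*(a_i - b_i)^2, which assumes the weights are non-negative. The sketch below checks that both functions agree on random inputs and times them with `timeit`. The test data and the repetition count are arbitrary choices, and the timings will vary by machine and library version.

```python
import timeit

import numpy as np
from scipy.spatial import distance

def weighted_euclidean(a, b, w):
    return distance.euclidean(a * np.sqrt(w), b * np.sqrt(w))

def weightedL2(a, b, w):
    q = a - b
    return np.sqrt((w * q * q).sum())

# Arbitrary test data: random bit vectors and random non-negative weights.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=5)
b = rng.integers(0, 2, size=5)
w = rng.random(5)

d_scipy = weighted_euclidean(a, b, w)
d_numpy = weightedL2(a, b, w)
same = bool(np.isclose(d_scipy, d_numpy))
print("results agree:", same)

# Rough timing comparison; no particular outcome is assumed here.
t_scipy = timeit.timeit(lambda: weighted_euclidean(a, b, w), number=1000)
t_numpy = timeit.timeit(lambda: weightedL2(a, b, w), number=1000)
print(f"scipy wrapper: {t_scipy:.4f}s, plain numpy: {t_numpy:.4f}s")
```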
ucsky
1

Simply define it yourself. Something like this should do the trick:

def mynorm(A, B, w):
    import numpy as np
    q = np.matrix(w * (A - B))
    return np.sqrt((q * q.T).sum())
wim
  • That isn't the norm contained in the question: you have squared the weights. Also, the `.sum()` is completely redundant; `q*q.T` is the inner product of the vector with itself, i.e. it *is* the sum. – talonmies Jan 14 '12 at 12:05
  • You are correct about the weights; I should have been more careful. However, your criticism about the `.sum()` being completely redundant is misguided. The result of `q * q.T` would be a 1x1 matrix, which would be an unexpected return type for a norm function; the sum turns it into a scalar. – wim Jan 14 '12 at 14:02
  • But why use `sum()` to cast to a scalar? `np.asscalar` will be several times faster. – talonmies Jan 14 '12 at 14:14
  • I don't know the reason, but that is how it is implemented in `scipy.spatial.distance.euclidean` .. I just assume the authors of scipy know what's best – wim Jan 14 '12 at 14:52
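The two points raised in these comments can be illustrated with a small sketch. It uses made-up vectors and a plain 2-D array in place of `np.matrix`, so it is an approximation of the answer's code, not a copy of it.

```python
import numpy as np

# Hypothetical inputs (illustration only).
A = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
B = np.array([0.0, 0.0, 1.0, 0.0, 1.0])
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Multiplying the difference by w before taking the inner product
# squares the weights: sqrt(sum(w_i^2 * (a_i - b_i)^2)).
q = w * (A - B)
squared_weights = np.sqrt(np.dot(q, q))

# The norm asked for in the question applies each weight once:
# sqrt(sum(w_i * (a_i - b_i)^2)).
intended = np.sqrt(np.dot(w * (A - B), A - B))

# A row vector times its transpose is a 1x1 array rather than a scalar,
# which is why some extra step (.sum(), .item(), ...) is needed.
row = q.reshape(1, -1)
prod = row @ row.T
print(prod.shape)  # (1, 1)
as_scalar = prod.item()
```

With these inputs the first result is sqrt(42) and the second is sqrt(10), so the two definitions clearly differ whenever a weight is not 0 or 1.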
0

The present version of SciPy (v1.9.3 as of writing) supports weighted L2 distance. From the documentation of scipy.spatial.distance.euclidean:

||u - v||_2 = (sum_i w_i |u_i - v_i|^2)^(1/2)

where: w : (N,) array_like, optional

The weights for each value in u and v. Default is None, which gives each value a weight of 1.0
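A minimal sketch of using the `w` argument, assuming a SciPy release recent enough to accept it. The vectors and weights are made up for illustration, and the manual NumPy expression is included only as a cross-check.

```python
import numpy as np
from scipy.spatial import distance

# Hypothetical 5-dimensional bit vectors and weights (illustration only).
u = np.array([1, 0, 1, 1, 0])
v = np.array([0, 0, 1, 0, 1])
w = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Weighted Euclidean distance computed directly by SciPy.
d = distance.euclidean(u, v, w)

# Cross-check against the formula written out by hand.
manual = np.sqrt((w * (u - v) ** 2).sum())
print(d, manual)
```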

user115625