2

I am calculating Euclidean distance with the Python code below:

import math
import numpy as np

def getNeighbors(trainingSet, testInstance, k, labels):
    distances = []
    for x in range(len(trainingSet)):
        # 2-D Euclidean distance from the test instance to training point x
        dist = math.sqrt((testInstance[0] - trainingSet[x][0]) ** 2 +
                         (testInstance[1] - trainingSet[x][1]) ** 2)
        distances.append([dist, labels[x]])
    distances = np.array(distances)
    return distances

For calculating the distance of a given point to 10 other points, this works fine. But when I calculate the distance of a point to 18563 other points, the computer hangs and doesn't respond for around 3 hours.

How can I make the calculation for 18563 points faster?

user2831683
  • you may use `math.hypot(x, y)` instead of `dist = math.sqrt(((testInstance[0] - trainingSet[x][0]) ** 2) + ((testInstance[1] - trainingSet[x][1]) ** 2))`, and `xrange` instead of `range` (see the sketch after this comment thread) – Jose Ricardo Bustos M. May 05 '15 at 19:36
  • Are you doing anything that might require pairwise distances, or are you only calculating distances to one distinguished point? Pairwise distances for 18563 points are going to take a lot longer, simply because there are about 172 million of them to calculate. – user2357112 May 05 '15 at 19:38
  • I am taking one example and calculating its distances to the other points. – user2831683 May 05 '15 at 19:39
  • 1
    Don't use `for` loops to do heavy lifting (in Python). Either vectorize so you can use NumPy, or use something different like Cython or Numba. – syntonym May 05 '15 at 19:46
  • 2
    Can you give a [complete minimal example](http://stackoverflow.com/help/mcve) instead of just this function on its own? Because if you're just calling this once, even with a list of 18563 points, it really shouldn't take long at all. It's not clear what you _are_ doing that's taking 3 hours, but the little 5% optimizations that people can make to just this code in isolation aren't going to help (except to reduce it to 2 hours and 51 minutes). – abarnert May 05 '15 at 19:50
  • Even if converting the loop to NumPy makes it, say, 12x as fast, it's still taking 15 minutes instead of the fraction of a second it should take… – abarnert May 05 '15 at 19:51
  • One more thing: Any chance that your vmsize is larger than your system's physical RAM? If so, every time you iterate a large object, or just access something you haven't accessed in a while, you're probably swapping memory pages to and from disk, which will slow the whole program (and your whole computer) down by a few orders of magnitude. If that's the problem, the best way to speed it up is (somewhat ironically) to focus on reducing space instead of time. – abarnert May 05 '15 at 20:17
  • And one more one-more-thing: Have you tried just running your existing code in PyPy instead of CPython? That tends to get you the same kind of order-of-magnitude constant speedup as NumPy (and occasionally it'll get you a 1000000x speedup, which tells you that all the heavy lifting you're doing in some loop can be lifted out of that loop…), and changing the `#!` line is even less work than rewriting your loops as elementwise operators. – abarnert May 05 '15 at 20:20
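
A minimal sketch of the loop-level `math.hypot` rewrite suggested in the first comment above, assuming the same `trainingSet`/`testInstance`/`labels` shapes as in the question (the function name `getNeighborsHypot` is just for illustration; on Python 2, `xrange` would replace `range`):

import math
import numpy as np

def getNeighborsHypot(trainingSet, testInstance, k, labels):
    distances = []
    for x in range(len(trainingSet)):
        # math.hypot(dx, dy) == sqrt(dx**2 + dy**2), but it guards
        # against intermediate overflow and underflow
        dist = math.hypot(testInstance[0] - trainingSet[x][0],
                          testInstance[1] - trainingSet[x][1])
        distances.append([dist, labels[x]])
    return np.array(distances)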

2 Answers

6

You can speed this up by converting to NumPy first and then using vectorized operations, instead of doing the work in a loop and only converting to NumPy at the end. Something like this:

trainingArray = np.array(trainingSet)
distances = np.sqrt((testInstance[0] - trainingArray[:, 0]) ** 2 +
                    (testInstance[1] - trainingArray[:, 1]) ** 2)

(That's obviously untested, because without enough context to know what's actually in those variables I had to guess, but it will be something close to that.)

There are other things you can do to squeeze out a few extra percent: try replacing the `** 2` with self-multiplication or the `sqrt` with `** .5`, or (probably best) replacing the whole thing with `np.hypot`. (If you don't know how to use `timeit`, or, even better, IPython and its `%timeit` magic, now is a great time to learn.)
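
For instance, a rough sketch of how those variants could be timed in IPython; the array `a` and point `p` below are made-up stand-ins for the real data:

import numpy as np

a = np.random.rand(18563, 2)   # stand-in for the training set
p = np.array([0.5, 0.5])       # stand-in for the test instance
dx = p[0] - a[:, 0]
dy = p[1] - a[:, 1]

# in an IPython session:
# %timeit np.sqrt(dx ** 2 + dy ** 2)
# %timeit (dx * dx + dy * dy) ** 0.5
# %timeit np.hypot(dx, dy)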

But ultimately, this is just going to give you a constant-multiplier speedup of about an order of magnitude. Maybe it takes 15 minutes instead of 3 hours. That's nice, but… why is it taking 3 hours in the first place? What you're doing here should be taking on the order of seconds, or even less. There's clearly something much bigger wrong here, like maybe you're calling this function `N**2` times when you think you're only calling it once. And you really need to fix that part.

Of course it's still worth doing this. First, element-wise operations are simpler and more readable than loops, and harder to get wrong. Second, even if you reduce the whole program to 3.8 seconds, you'll be happy for the order-of-magnitude speedup to 0.38 seconds, right?
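
As a side note, the question's `getNeighbors` takes a `k` that the posted loop never uses. If the goal is the k nearest points, the vectorized distances combine naturally with `np.argpartition`; a sketch under that assumption (not the OP's confirmed intent):

import numpy as np

def getNeighbors(trainingSet, testInstance, k, labels):
    a = np.asarray(trainingSet, dtype=float)
    d = np.hypot(testInstance[0] - a[:, 0], testInstance[1] - a[:, 1])
    # indices of the k smallest distances, without sorting everything
    nearest = np.argpartition(d, k)[:k]
    return [(d[i], labels[i]) for i in nearest]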

abarnert
  • what is `trainingArray` for? – user2831683 May 05 '15 at 20:09
  • @user2831683: It's a NumPy `array` with the same elements as are in the `list` or `set` or whatever in `trainingSet`. By making it a NumPy `array` instead of a native Python collection, we can do NumPy slicing—and, more importantly, NumPy vectorized operations. – abarnert May 05 '15 at 20:11
  • 1
    I meant, you aren't using it. – user2831683 May 05 '15 at 20:13
  • is this faster than using np.hypot()? – paddyg May 05 '15 at 20:23
  • @paddyg: Probably not, but it's probably a tiny difference, and it should be a lot easier for the OP to understand how this is equivalent to his code. I've edited it to mention `np.hypot`, but I don't think it's that important here. – abarnert May 05 '15 at 20:30
  • 2
    @abarnert in answer to my own question, for 18563 points: `** 0.5` and `np.sqrt()` take 0.77ms `np.hypot()` and self multiplication (instead of `** 2`) take 1.14ms. Obviously your other points are the critical step here (both these values are effectively instantaneous c.f. 3h) but interesting to know when squeezing the last bit of speed out. – paddyg May 05 '15 at 22:21
  • @paddyg: Wait, your optimizations actually make it 48% _slower_? I wouldn't have expected that. Well, that's why you test. :) – abarnert May 05 '15 at 22:25
  • @paddyg: Anyway, I still think there's something _else_ wrong here outside the code he's shown us. From a quick&dirty test with a made-up guess at the data, his original code took only 38ms, which may be 1 or 2 orders of magnitude slower than with NumPy, but it's _still_ effectively instantaneous, and certainly not hours… – abarnert May 05 '15 at 22:28
  • finally, the calculation completes in around 11 secs. I used this, replaced the loops with NumPy equivalents, and also used `** 2` and `** 0.5` instead of `math.pow` and `math.sqrt`. :) – user2831683 May 06 '15 at 14:26
0

Writing your own square-root-based distance calculation has its risks; `np.hypot` is much safer with respect to overflow and underflow.

x_y = np.array(trainingSet)
x = x_y[:, 0]  # all x coordinates
y = x_y[:, 1]  # all y coordinates
# full pairwise distance matrix between every pair of points
distances = np.hypot(
    np.subtract.outer(x, x),
    np.subtract.outer(y, y)
)
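
Note that `np.subtract.outer` builds the full N×N matrix of pairwise differences, so for 18563 points this allocates two 18563×18563 arrays. For the one-point-to-all-points case in the question, plain broadcasting, e.g. `np.hypot(testInstance[0] - x, testInstance[1] - y)`, is enough.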

Speed-wise they are about the same (timed on two scalar floats `i` and `j`):

%%timeit
np.hypot(i, j)
# 1.29 µs ± 13.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%%timeit
np.sqrt(i**2+j**2)
# 1.3 µs ± 9.87 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Underflow

i, j = 1e-200, 1e-200
np.sqrt(i**2+j**2)
# 0.0

Overflow

i, j = 1e+200, 1e+200
np.sqrt(i**2+j**2)
# inf

No Underflow

i, j = 1e-200, 1e-200
np.hypot(i, j)
# 1.414213562373095e-200

No Overflow

i, j = 1e+200, 1e+200
np.hypot(i, j)
# 1.414213562373095e+200


eroot163pi