How can I improve this code with nested loops?

Question

I have a function that count number of collisions between two point in each frame. I have no idea how to improve this very slow code.

     #data example
     #[[89, 814, -77.1699249744415, 373.870468139648, 0.0], [71, 814, -119.887828826904, 340.433287620544, 0.0]...]

    def is_collide(data, req_dist):
        #req_dist - minimum distance when collision will be count

        temp = data 

        temp.sort(key=Measurements.sort_by_frame)
        max_frame = data[-1][1]
        min_frame = data[0][1]
        collissions = 0

        # max_frame-min_frame approximately 60000
        # the slowest part
        for i in range(min_frame, max_frame):
            frames = [line for line in temp if line[1] == i]
            temp = [line for line in temp if line[1] != i]
            l = len(frames)

            for j in range(0, l, 1):
                for k in range(j+1, l, 1):
                    dist = ((frames[j][2] - frames[k][2])**2 + (frames[j][3]-frames[k][3])**2)**0.5
                    if dist < req_dist:
                        collissions += 1

        return collissions

Its error free and you only want to increase the performance right? — , Oct 09 '15 at 21:08
Can you provide a minimum verifiable example with short examples of data, expected input and output? — Jean Jung, Oct 09 '15 at 21:15
[code review](http://codereview.stackexchange.com/) may be a more appropriate place where to ask — Pynchia, Oct 09 '15 at 21:25
avoid calculating the square root within the loop by checking the square of `dist`: `def is_collide(data, req_dist):` `req_dist **= 2` ... `dist = (frames[j][2] - frames[k][2])**2 + (frames[j][3]-frames[k][3])**2` — Pynchia, Oct 09 '15 at 21:36
i add data example and comment beginning of the slowest part — Andrey Khristoforov, Oct 09 '15 at 21:36
if you are replying to my suggestion, read it again. It's useless to keep calculating the square within the loop. Use the square of `req_dist` and compare to that. — Pynchia, Oct 09 '15 at 21:42
why use two lists comps instead of a single loop using if/else and appending to appropriate lists? — Padraic Cunningham, Oct 09 '15 at 21:46
Also bisecting would probably be an idea as your data is sorted — Padraic Cunningham, Oct 09 '15 at 21:48
Pynchia, I`ve done as you said but they are approximately equal or improvment is not significant. (running time: ~180 sec) — Andrey Khristoforov, Oct 09 '15 at 22:10

score 0 · Answer 1 · answered Oct 09 '15 at 22:16

Computing distance between every pair of points is expensive: an O(n**2) operation. In general, that can be very expensive even for small n.

I would suggest stepping back and seeing if there is a better data structure to do this::

Quad-trees: Check the wikipedia article on Quad-Trees. These can be used for collision detection possibly. https://en.wikipedia.org/wiki/Quadtree
In Jon Bentley's book "Programming Pearls", Section 2, column 5 is very relevant to this. He describes all the optimizations needed for computing something similar in a N-body problem. I strongly suggest reading that for some ideas.

Having said that, I think there are some places where you could make some fairly simply improvements and get some modest speed-up.

1) The distance computation with an exponentiation (actually the square root) is an expensive operation.
2) You use n**2 to compute a square, when it's probably faster to just multiply n by itself.

You could replace it with a temp (and multiply by itself), but even better: you don't need it! As long as all distances are computed the same way (without the **.5), you can compare them. In other words, distances can be compared without the sqrt operation, as long as you only need the relative value. I answered a similar question here:

Fastest way to calculate Euclidean distance in c

Hope this helps!

How can I improve this code with nested loops?

1 Answers1