How to reduce execution time while evaluating each item in a list in python

Question

I have to build a distance matrix for locations

addresses = [(10.0, 20.0), (21.2318, 72.903), (26.4499, 80.3319), (20.0, 20.0), (19.114, 72.8927), (20.4189, 77.0153), (28.5377, 77.1217), (28.6201, 77.118), (28.5257, 77.2781)]

where distance in between two locations, is computed using the following function

def calculate_travel_distance(pointA,pointB):  
  return geodesic(pointA,pointB).meters

Distance Matrix is being build up using for loop as below

dist_matrix=[]
for pt1 in addresses:
  row_list =  [round(calculate_travel_distance(pt1,pt2),0) for pt2 in addresses]  
  dist_matrix.append(row_list)

When size of addresses grow up, it takes long time to execute i.e. 300 locations ( lat/lng pair) it takes 150 seconds to execute. Is it possible to bring down execution time to few secs(may be 10) for close to 400 locations. Please suggest.

Tomalak · Accepted Answer · 2020-09-30T12:24:38.497

Assuming you have a list of points A through J, then the matrix pairing all of those up looks like this:

AA AB AC AD AE AF AG AH AI AJ
BA BB BC BD BE BF BG BH BI BJ
CA CB CC CD CE CF CG CH CI CJ
DA DB DC DD DE DF DG DH DI DJ
EA EB EC ED EE EF EG EH EI EJ
FA FB FC FD FE FF FG FH FI FJ
GA GB GC GD GE GF GG GH GI GJ
HA HB HC HD HE HF HG HH HI HJ
IA IB IC ID IE IF IG IH II IJ
JA JB JC JD JE JF JG JH JI JJ

That's what your loop currently calculates. However, the distance AB and BA is equal, and the distances on the center line (AA, BB, ...) are always zero.

We can cut out workload in half (even less than half, from n^2 to n^2 / 2 - n) by calculating only the distances between points where x < y in the matrix.

   AB AC AD AE AF AG AH AI AJ
      BC BD BE BF BG BH BI BJ
         CD CE CF CG CH CI CJ
            DE DF DG DH DI DJ
               EF EG EH EI EJ
                  FG FH FI FJ
                     GH GI GJ
                        HI HJ
                           IJ

The blanks can be filled easily by mirroring the upper triangle. Staying within the example, this:

addresses = ['A','B','C','D','E','F','G','H','I','J']

distances = []

for x, a in enumerate(addresses):
    row = []
    distances.append(row)
    for y, b in enumerate(addresses):
        if x < y:
            row.append(a + b)               # actually calculate something
        elif x == y:
            row.append('--')                # that's always 0
        else:
            row.append(distances[y][x])     # we already calculated that

for row in distances:
    print(' '.join(row))

gives us this:

-- AB AC AD AE AF AG AH AI AJ
AB -- BC BD BE BF BG BH BI BJ
AC BC -- CD CE CF CG CH CI CJ
AD BD CD -- DE DF DG DH DI DJ
AE BE CE DE -- EF EG EH EI EJ
AF BF CF DF EF -- FG FH FI FJ
AG BG CG DG EG FG -- GH GI GJ
AH BH CH DH EH FH GH -- HI HJ
AI BI CI DI EI FI GI HI -- IJ
AJ BJ CJ DJ EJ FJ GJ HJ IJ --

The next step up in speed could be multi-threading, but maybe this optimization already is good enough for your use case.

A multi-threaded implementation of the above could look like this (it's probably not pythonic in more ways than one, but it does the job):

from multiprocessing import cpu_count
from multiprocessing.dummy import Pool as ThreadPool

# credit https://stackoverflow.com/a/54802737
def chunks(l, n):
    """Yield n number of striped chunks from l."""
    for i in range(0, n):
        yield l[i::n]

def calculate_travel_distance(a, b):
    return a + b

def calculate_distance_matrix(addresses):
    # prepare distance matrix, list of lists with n^2 slots
    distance_matrix = [['--' for a in addresses] for b in addresses]

    # the workload is the upper matrix triangle (where x < y)
    # since we're multi-threading, also remember the x,y position
    workload = [((a, b),(x, y)) for y, b in enumerate(addresses) for x, a in enumerate(addresses) if x < y]

    # worker function 
    def worker(chunk):
        return [(calculate_travel_distance(*points), matrix_pos) for points, matrix_pos in chunk]

    # distribute workload over available CPU cores
    pool = ThreadPool(cpu_count())
    result_chunks = pool.map(worker, chunks(workload, cpu_count()))

    # distribute result chunks into their slots
    for result_chunk in result_chunks:
        for result, matrix_pos in result_chunk:
            x, y = matrix_pos
            distance_matrix[x][y] = result
            distance_matrix[y][x] = result

    return distance_matrix

addresses = ['A','B','C','D','E','F','G','H','I','J']
distaince_matrix = calculate_distance_matrix(addresses)

for row in distaince_matrix:
    print(' '.join(row))

It prints the same thing:

-- AB AC AD AE AF AG AH AI AJ
AB -- BC BD BE BF BG BH BI BJ
AC BC -- CD CE CF CG CH CI CJ
AD BD CD -- DE DF DG DH DI DJ
AE BE CE DE -- EF EG EH EI EJ
AF BF CF DF EF -- FG FH FI FJ
AG BG CG DG EG FG -- GH GI GJ
AH BH CH DH EH FH GH -- HI HJ
AI BI CI DI EI FI GI HI -- IJ
AJ BJ CJ DJ EJ FJ GJ HJ IJ --

Thanks for ur suggestion, I agree we can reduce it to less than half here, it will not help if i compute distance using google distance api , there AB is not equal to BA since it is road distance, yes for crow(aerial) distance it is useful. — Shashank Trivedi, Sep 27 '20 at 09:12
@ShashankTrivedi Oh, you're off-loading the distance calculation to Google? That's significant info, it should have been in the question. I thought you were using something like `geopy.distance.geodesic()`. — Tomalak, Sep 27 '20 at 09:14
I am using geodesic, that is correct, I was saying AB and BA equal as long as it is geodesic. Here i am struggling only with the scale of the problem. Anyhow thanks for ur input. — Shashank Trivedi, Sep 27 '20 at 09:25
@ShashankTrivedi I'm implementing a multithreaded extension to this approach, hang on. — Tomalak, Sep 27 '20 at 09:34
@ShashankTrivedi Check the updated answer. This might provide another speed boost. Please measure and post how it compares with the other methods on your input data. — Tomalak, Sep 27 '20 at 10:13
thank u very much, indeed it has significantly reduced execution time close to 80% for 300 locations. — Shashank Trivedi, Sep 27 '20 at 12:15
@ShashankTrivedi So that means now it runs in 20% of the original time? That's not bad! — Tomalak, Sep 27 '20 at 12:35

How to reduce execution time while evaluating each item in a list in python

1 Answers1