I need help finding a faster distance matrix tool for approximately 107,000 source points and 220 destination points.
I currently run OpenRouteService (ORS) in a local Docker container. Requesting the whole matrix at once produces a memory error, so I run it in a loop, splitting the 107,000 source points into chunks. Computing the distances this way takes more than 2 hours.
The same process takes 6 minutes when I run the equivalent script in R.
Why is that? How can I make the Python version faster? Are there any alternatives?
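For context, the client is created in an earlier cell roughly like this (a minimal sketch; the base_url below is the ORS Docker image's default and may differ in your setup):

import openrouteservice

# client for the local Docker container; no API key is needed for a
# self-hosted instance (the base_url here is an assumption)
client = openrouteservice.Client(base_url='http://localhost:8080/ors')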
# find the index of the first row where the paddocks start
paddockidx = site_and_point_coords.loc[site_and_point_coords["site_or_paddock"] == "paddock"].index[0]
# generate (lon, lat) pairs for both sites and paddocks (ORS expects lon, lat order)
coordinates = list(zip(site_and_point_coords.lon.values, site_and_point_coords.lat.values))
# destinations, i.e. the site locations, for the matrix request
destinations = list(range(paddockidx))
# sources, i.e. the paddocks
sources = list(range(paddockidx, len(coordinates)))
# run the distance_matrix request; client is defined in the import cell/chunk
# and points at the local Docker container, i.e. requests go to the local
# instance, not the public API
matrix = client.distance_matrix(
    locations=coordinates,
    sources=sources[:len(sources) // 20],  # one chunk: the first 1/20 of the sources
    destinations=destinations,
    profile='driving-car',
    metrics=['distance'],
    units='km',
    validate=True)
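For completeness, the full run loops over the sources in chunks like this (a sketch of the looping approach described above; the chunk size and the numpy stacking are illustrative, while the 'distances' key is what the ORS matrix endpoint returns when metrics=['distance']):

import numpy as np

# send the sources in ~20 chunks and stack the per-chunk rows into one
# (len(sources) x len(destinations)) distance array
chunk_size = max(1, len(sources) // 20)
all_distances = []
for start in range(0, len(sources), chunk_size):
    chunk = sources[start:start + chunk_size]
    result = client.distance_matrix(
        locations=coordinates,
        sources=chunk,
        destinations=destinations,
        profile='driving-car',
        metrics=['distance'],
        units='km',
        validate=True)
    all_distances.extend(result['distances'])  # one row per source in the chunk

distance_matrix = np.array(all_distances)  # shape: (len(sources), len(destinations))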