I'm working on a vehicle routing problem, and by far the most time-consuming part is generating a distance/time matrix over a real road network. For this project I run a localhost GraphHopper server, and I used to query it for pair-wise distances and times between locations; everything was straightforward to parse. Each response returns the individual driving instructions plus a total journey distance and time. But for 2000 locations that would be nearly 4 million calls.
On a hunch, I wondered whether it would be faster to instead ask it to plan a single route that goes via 100 locations at a time. The aim is then to break the JSON response down into blocks as though I had made 100 pair-wise calls. It turns out this is nearly 3 times faster for 100 locations. Apart from the small headache of keeping track of which parts of my matrix have been built, the response gives no obvious structural separator between the sub-routes; it just includes an instruction telling you that you are at a "Stopover", e.g.
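For context, building one of these via-route requests is straightforward — GraphHopper's `/route` endpoint takes a repeated `point=lat,lon` parameter, one per location visited in order. A minimal sketch of how I construct the URL for a batch (the host/port and coordinates here are just my local setup, for illustration):

```python
# Build one via-route request URL for a batch of points.
# GraphHopper's /route endpoint accepts repeated point=lat,lon parameters,
# visited in the order given.
from urllib.parse import urlencode

def build_route_url(points, host="http://localhost:8989"):
    """points: list of (lat, lon) tuples."""
    params = [("point", f"{lat},{lon}") for lat, lon in points]
    return f"{host}/route?{urlencode(params)}"

url = build_route_url([(24.45, 54.37), (24.49, 54.36), (24.47, 54.40)])
```

A batch of 100 points goes out as one request instead of 99 pair-wise ones, which is where the speed-up comes from.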
{
    "hints": {
        "visited_nodes.average": "243.0",
        "visited_nodes.sum": "3402"
    },
    "paths": [{
        "instructions": [{
            "distance": 315.637,
            "sign": 0,
            "interval": [0, 7],
            "text": "Continue",
            "time": 29848
        }, {
            "exit_number": 3,
            "distance": 1460.234,
            "sign": 6,
            "turn_angle": -0.01,
            "interval": [7, 27],
            "text": "At roundabout, take exit 3",
            "time": 96167
        }, {
            "distance": 1258.763,
            "sign": 0,
            "interval": [27, 30],
            "text": "Continue onto Sheikh Zayed bin Sultan Street",
            "time": 50350
        }, {
            "distance": 116.7,
            "sign": 0,
            "interval": [30, 31],
            "text": "Continue",
            "time": 4668
        }, {
            "distance": 3543.556,
            "sign": 0,
            "interval": [31, 57],
            "text": "Continue onto Sheikh Zayed bin Sultan Street",
            "time": 144812
        }, {
            "distance": 0,
            "sign": 5,
            "interval": [87, 88],
            "text": "Stopover 1",
            "time": 0
        },
        ...
My current working prototype keeps summing the individual distances (metres) for each instruction, using a regex to look for "Stopover" before starting a fresh tally:
import re

distance_matrix = []
sub_distance = []
for unit in response['paths'][0]['instructions']:
    if not re.search('Stopover', unit['text']):
        sub_distance.append(float(unit['distance']) / 1000)  # metres -> km
    else:
        distance_matrix.append(sum(sub_distance))
        sub_distance = []
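One variation I have been toying with is to drop the regex and the intermediate list: the sample response marks each stopover with "sign": 5, so a plain integer comparison and a running total should do the same job (a sketch, assuming every stopover instruction carries that sign):

```python
def leg_distances(instructions):
    """Sum instruction distances (km) per leg, splitting on stopover markers."""
    legs = []
    total = 0.0
    for unit in instructions:
        if unit['sign'] == 5:  # the 'Stopover N' instructions in the sample use sign 5
            legs.append(total)
            total = 0.0
        else:
            total += unit['distance'] / 1000  # metres -> km
    return legs

# Toy data in the same shape as the response above:
sample = [
    {'text': 'Continue', 'sign': 0, 'distance': 500.0},
    {'text': 'Turn left', 'sign': -2, 'distance': 1500.0},
    {'text': 'Stopover 1', 'sign': 5, 'distance': 0},
    {'text': 'Continue', 'sign': 0, 'distance': 3000.0},
    {'text': 'Stopover 2', 'sign': 5, 'distance': 0},
]
leg_distances(sample)  # -> [2.0, 3.0]
```

I don't know whether the string match or the sign check is genuinely faster at scale, which is partly why I'm asking.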
My background is not in programming at all, and I've been self-taught in Python for less than a year. While this works, I wonder if I'm missing a more natural (and faster) way to do it, but I can't find anything related via Google because I'm not quite sure what to search for. At our current scale the distance/time matrix can already take ~8 minutes to compute, so I'm open to any suggestions that shave off time. There appears to be a limit on the number of points per route, so this function will still be called many thousands of times, and it all adds up. Thanks in advance.