I ran into the same issue with the Swedish transportation system in Stockholm, and this is the workaround I have been using. It is ugly, but it works quite well and might be useful. I start by making a copy of my original data:
import pandas as pd
import numpy as np
import sklearn.neighbors
locations_A = pd.DataFrame({
    'Stopp_A': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
    'latitude_A': [56.75, 56.19, 56.08, 51.28, 52.36, 51.29, 51.87, 52.61],
    'longitude_A': [18.39, 18.82, 18.65, 18.74, 18.06, 18.61, 18.27, 18.20]
})
locations_B = pd.DataFrame({
    'Stopp_B': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
    'latitude_B': [56.75, 56.19, 56.08, 51.28, 52.36, 51.29, 51.87, 52.61],
    'longitude_B': [18.39, 18.82, 18.65, 18.74, 18.06, 18.61, 18.27, 18.20]
})
As you can see, I change the location column name from Stopp_A to Stopp_B in the copy.
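If retyping the data is inconvenient, the copy can probably be derived from the original frame instead. This is only a sketch of that idea, assuming every column name ends in "_A"; the three-row frame is a shortened stand-in for the data above.

```python
import pandas as pd

# Shortened stand-in for locations_A (same column layout as above).
locations_A = pd.DataFrame({
    'Stopp_A': ['A', 'B', 'C'],
    'latitude_A': [56.75, 56.19, 56.08],
    'longitude_A': [18.39, 18.82, 18.65],
})

# Copy the frame and swap the "_A" suffix for "_B" on every column name.
locations_B = locations_A.copy().rename(
    columns=lambda name: name[:-2] + '_B' if name.endswith('_A') else name
)
```

The copy leaves locations_A untouched, so both frames can be used side by side in the steps that follow.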
After this, I compute radians and create a distance matrix:
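# The haversine metric expects [latitude, longitude] pairs in radians.
locations_A[['lat_radians_A', 'long_radians_A']] = (
    np.radians(locations_A.loc[:, ['latitude_A', 'longitude_A']])
)
locations_B[['lat_radians_B', 'long_radians_B']] = (
    np.radians(locations_B.loc[:, ['latitude_B', 'longitude_B']])
)
# In scikit-learn 1.0 and later this class is also available as
# sklearn.metrics.DistanceMetric; newer releases may no longer expose it
# under sklearn.neighbors.
dist = sklearn.neighbors.DistanceMetric.get_metric('haversine')
# Haversine distances are on the unit sphere; multiply by the Earth's
# radius (about 6371 km) to get kilometers.
dist_matrix = dist.pairwise(
    locations_A[['lat_radians_A', 'long_radians_A']],
    locations_B[['lat_radians_B', 'long_radians_B']]
) * 6371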
df_dist_matrix = pd.DataFrame(
    dist_matrix,
    index=locations_A['Stopp_A'],
    columns=locations_B['Stopp_B']
)
df_dist = pd.melt(df_dist_matrix.reset_index(), id_vars='Stopp_A')
df_dist = df_dist.rename(columns={'value': 'Kilometers'})
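which returns a long-format table with one row per pair of stops (the exact distances depend on the input coordinates):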
   Stopp_A Stopp_B   Kilometers
0        A       A     0.000000
1        B       A  2088.626114
2        C       A  2043.060585
3        D       A   950.191543
4        E       A  1506.375876
..     ...     ...          ...
59       D       H  3051.681403
60       E       H  3990.191284
61       F       H  3737.181244
62       G       H  1083.053543
63       H       H     0.000000
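A table in this shape is easy to filter. As a rough sketch of one possible follow-up (not part of my original workaround), the snippet below drops the rows where a stop is paired with itself and then picks the nearest other stop for each one. The small df_dist here is a hand-made stand-in with made-up distances, and the names pairs and nearest are only illustrative.

```python
import pandas as pd

# Hand-made stand-in for df_dist; the distances are invented for illustration.
df_dist = pd.DataFrame({
    'Stopp_A': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
    'Stopp_B': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'Kilometers': [0.0, 5.0, 9.0, 5.0, 0.0, 3.0, 9.0, 3.0, 0.0],
})

# Drop the zero-distance rows where a stop is paired with itself.
pairs = df_dist[df_dist['Stopp_A'] != df_dist['Stopp_B']]

# For each stop in Stopp_A, keep the row with the smallest distance.
nearest = pairs.loc[pairs.groupby('Stopp_A')['Kilometers'].idxmin()]
```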
This method reduced my computation time significantly.
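For anyone who wants to see what the haversine metric computes, or to cross-check the numbers without scikit-learn, here is a minimal NumPy sketch of the standard haversine formula. The function name haversine_matrix and its radius_km parameter are my own illustrative choices, and the three coordinates are taken from the sample data above.

```python
import numpy as np

def haversine_matrix(lat_a, lon_a, lat_b, lon_b, radius_km=6371.0):
    """Pairwise great-circle distances in km between two sets of points in degrees."""
    # Column vectors for set A and row vectors for set B, so broadcasting
    # produces one entry per (A, B) pair.
    lat_a = np.radians(lat_a)[:, None]
    lon_a = np.radians(lon_a)[:, None]
    lat_b = np.radians(lat_b)[None, :]
    lon_b = np.radians(lon_b)[None, :]
    h = (np.sin((lat_b - lat_a) / 2) ** 2
         + np.cos(lat_a) * np.cos(lat_b) * np.sin((lon_b - lon_a) / 2) ** 2)
    return 2 * radius_km * np.arcsin(np.sqrt(h))

lats = np.array([56.75, 56.19, 56.08])
lons = np.array([18.39, 18.82, 18.65])
dist_km = haversine_matrix(lats, lons, lats, lons)
```

Like the pairwise call, this builds the whole matrix in one vectorized step rather than looping over pairs of stops.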