I have a Python performance issue. I need to merge two dataframes on an openstreetmaps function, and it is too slow. My first dataframe df1 is 35K rows long, and my second one 2K rows long. Full merge is then 70M. I suppose I won't be able to improve openstreetmaps response time but maybe my code could be faster. Can you see a way of improvments?
Thanks for answers.
import requests, json
from sqlalchemy import create_engine
url = 'http://router.project-osrm.org/route/v1/driving/'
conn = 'postgresql+psycopg2://postgres:{0}@localhost/postgres'.format('')
my_conn = create_engine(conn)
def closest(s,df2):
distance_min = 1000000
duration_min = 1000000
place_min = ''
for i in range(len(df2)):
x = str(s['LONG']) +','+ str(s['LAT']) +";"+str(df2.loc[i, "LONG"])+","+str(df2.loc[i, "LAT"])
response = requests.get(url+x)
data = json.loads(response.content)
duration = int((data['routes'][0]['duration']*0.95)/60)
if duration_min > duration:
distance_min = int(data['routes'][0]['distance']*0.000621371*1.609344)
duration_min = duration
place_min = df2.loc[i, "PLACE_NAME"]
s['CLOSEST_PLACE'] = place_min
s['DISTANCE'] = distance_min
s['DURATION'] = duration_min
return s
df1 = df1.apply(lambda x: closest(x, df2),axis=1)
Edit:
Head of df1 & df2 - it is French open data: basically I want to find the closest station (df2) to each city (df1)
ENT_CODE COM_NAME COM_CODE DEP_NAME DEP_CODE REG_NAME REG_CODE LAT LONG
0 01001 L'Abergement-Clémenciat 1 Ain 01 Auvergne-Rhône-Alpes 84 46.153426 4.926114
1 01002 L'Abergement-de-Varey 2 Ain 01 Auvergne-Rhône-Alpes 84 46.009188 5.428017
2 01004 Ambérieu-en-Bugey 4 Ain 01 Auvergne-Rhône-Alpes 84 45.960848 5.372926
3 01005 Ambérieux-en-Dombes 5 Ain 01 Auvergne-Rhône-Alpes 84 45.996180 4.912273
4 01006 Ambléon 6 Ain 01 Auvergne-Rhône-Alpes 84 45.749499 5.594320
5 01007 Ambronay 7 Ain 01 Auvergne-Rhône-Alpes 84 46.005591 5.357607
6 01008 Ambutrix 8 Ain 01 Auvergne-Rhône-Alpes 84 45.936713 5.332809
7 01009 Andert-et-Condon 9 Ain 01 Auvergne-Rhône-Alpes 84 45.787357 5.657883
8 01010 Anglefort 10 Ain 01 Auvergne-Rhône-Alpes 84 45.909372 5.795160
9 01011 Apremont 11 Ain 01 Auvergne-Rhône-Alpes 84 46.205498 5.657815
PLACE_NAME LAT LONG
0 Aéroport Charles de Gaulle 2 TGV 49.003652 2.570892
1 Agde 43.317280 3.466203
2 Agen 44.208311 0.620932
3 Aime - La Plagne 45.554400 6.648646
4 Aix-en-Provence TGV 43.455237 5.317534
5 Aix-les-Bains le Revard 45.688161 5.909371
6 Albertville 45.672977 6.383167
7 Amiens 49.890746 2.312592
8 Ancenis 47.369334 -1.177763
9 Angers Saint-Laud 47.464647 -0.556820