I have two pandas DataFrames, each holding latitude/longitude coordinate pairs (one for schools, the other for houses).
I would like to add a column to the houses DataFrame containing the shortest distance from each house to any of the schools. Below is the double-loop approach I am trying:

import pandas as pd
import geopy.distance

schools = pd.DataFrame([(29.775803, -95.56353), (40.060276, -83.004196), (40.70592, -74.010765)])
houses = pd.DataFrame([(41.291989997, -73.087632372), (41.16741635, -73.188437585), (41.038689564, -73.635282641), (40.825542, -96.60775)])

minimum_distance = []
# Columns 0 and 1 hold latitude and longitude; iterate over plain (lat, lon) tuples
for house in houses[[0, 1]].itertuples(index=False, name=None):
    shortest = float('inf')
    for school in schools[[0, 1]].itertuples(index=False, name=None):
        # Geodesic distance in km between this house and this school
        d = geopy.distance.geodesic(house, school).km
        if d < shortest:
            shortest = d
    # Keep only the distance to the nearest school for this house
    minimum_distance.append(shortest)

houses['shortest_distance'] = minimum_distance
The houses DataFrame should look like this afterwards:
           0          1  shortest_distance
0  41.291990 -73.087632         101.332983
1  41.167416 -73.188438          86.153595
2  41.038690 -73.635283          48.656830
3  40.825542 -96.607750        1156.075739
Does anyone have an idea how to do this? I used a double loop because I don't see another way, since every house has to be compared against every school, but I am also wondering how efficient that would be with two DataFrames of 20,000 rows each (that is 400 million distance computations).
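One direction I am considering for the larger case, as a rough sketch only: swap the geopy geodesic call for a haversine (great-circle) approximation and vectorise it with NumPy broadcasting, processing the houses in chunks so the full distance matrix never has to sit in memory at once. The names haversine_min_km, chunk_size and EARTH_RADIUS_KM are my own and purely illustrative. Note that haversine assumes a spherical Earth, so the values will not match geopy's ellipsoidal geodesic exactly (as far as I know the difference is typically within about 0.5%).

```python
import numpy as np
import pandas as pd

# Assumption: mean Earth radius, i.e. a spherical model rather than geopy's ellipsoid
EARTH_RADIUS_KM = 6371.0088


def haversine_min_km(points, targets, chunk_size=500):
    """For each (lat, lon) row in `points`, return the haversine distance in km
    to the nearest (lat, lon) row in `targets`. Inputs are in degrees."""
    points = np.radians(np.asarray(points, dtype=float))
    targets = np.radians(np.asarray(targets, dtype=float))
    t_lat = targets[:, 0]
    t_lon = targets[:, 1]

    result = np.empty(len(points))
    # Work through the points in chunks to bound the size of the temporary matrices
    for start in range(0, len(points), chunk_size):
        chunk = points[start:start + chunk_size]
        lat = chunk[:, 0][:, None]  # shape (chunk, 1) so it broadcasts against targets
        lon = chunk[:, 1][:, None]
        a = (np.sin((t_lat - lat) / 2) ** 2
             + np.cos(lat) * np.cos(t_lat) * np.sin((t_lon - lon) / 2) ** 2)
        # Clip guards against tiny floating-point excursions outside [0, 1]
        dist = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
        result[start:start + chunk_size] = dist.min(axis=1)
    return result


schools = pd.DataFrame([(29.775803, -95.56353), (40.060276, -83.004196), (40.70592, -74.010765)])
houses = pd.DataFrame([(41.291989997, -73.087632372), (41.16741635, -73.188437585),
                       (41.038689564, -73.635282641), (40.825542, -96.60775)])

# Columns 0 and 1 are latitude and longitude
houses['shortest_distance'] = haversine_min_km(houses[[0, 1]], schools[[0, 1]])
```

This still compares every house with every school, but the inner work happens in NumPy rather than in a Python loop. If exact geodesic values are needed, I suppose the nearest school found this way could be re-measured with geopy afterwards, although I have not checked whether the approximation can ever pick a different nearest school.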
Thank you in advance for your help!
Louis