I have a dataframe df that looks like this:
id id_latlong
1 (46.1988400;5.209562)
2 (46.1988400;5.209562)
3 (46.1988400;5.209562)
4 (46.1988400;5.209562)
5 (46.438805;5.11890299)
6 (46.222993;5.21707600)
7 (46.195183;5.212575)
8 (46.195183;5.212575)
9 (46.195183;5.212575)
10 (48.917459;2.570821)
11 (48.917459;2.570821)
Every row is a location and the data in the column "id_latlong" are coordinates.
I want to select the id of every location that is less than 800 meters from a defined location:
defined_location_latlong = "(46.1988400;5.209562)"
I have a function that computes the distance, in meters, between two coordinates:
import math

def distance_btw_coordinates(id_latlong1, id_latlong2):
    try:
        R = 6372800  # Earth radius in meters
        # Parse the "(lat;lon)" strings into floats
        lat1 = float(id_latlong1.partition('(')[2].partition(';')[0])
        lon1 = float(id_latlong1.partition(';')[2].partition(')')[0])
        lat2 = float(id_latlong2.partition('(')[2].partition(';')[0])
        lon2 = float(id_latlong2.partition(';')[2].partition(')')[0])
        # Haversine formula
        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dphi = math.radians(lat2 - lat1)
        dlambda = math.radians(lon2 - lon1)
        a = (math.sin(dphi / 2) ** 2
             + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
        distance = 2 * R * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    except Exception:
        # Input could not be parsed: fall back to a very large distance
        distance = 1000000000
    return distance
To select every row that is less than 800 meters from the defined location, I tried:
df.loc[distance_btw_coordinates(df['id_latlong'], defined_location_latlong) < 800]
But it doesn't work:
KeyError: False
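I think it fails because the function receives the whole "id_latlong" column at once: a Series has no partition method, so the except branch returns 1000000000, the comparison evaluates to a single False, and df.loc[False] raises the KeyError.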
Do you know how I could do this without iterating over the rows?
Thank you!
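For reference, here is a minimal sketch of what a vectorized version might look like for a single defined location. It assumes NumPy is available; the helpers parse_latlong and haversine_m are hypothetical names, not part of pandas, and the sample frame only reuses a few rows from the data above.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_M = 6372800  # same Earth radius as in the function above

def parse_latlong(series):
    """Split "(lat;lon)" strings into two float Series (hypothetical helper)."""
    parts = series.str.strip("()").str.split(";", expand=True)
    return parts[0].astype(float), parts[1].astype(float)

def haversine_m(lat1, lon1, lat2, lon2):
    """Haversine distance in meters; works on scalars, arrays or Series."""
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = phi2 - phi1
    dlambda = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

# A few rows taken from the dataframe shown above
df = pd.DataFrame({
    "id": [1, 5, 6, 7, 10],
    "id_latlong": ["(46.1988400;5.209562)", "(46.438805;5.11890299)",
                   "(46.222993;5.21707600)", "(46.195183;5.212575)",
                   "(48.917459;2.570821)"],
})
defined_location_latlong = "(46.1988400;5.209562)"

lat, lon = parse_latlong(df["id_latlong"])
ref_lat, ref_lon = (float(x) for x in defined_location_latlong.strip("()").split(";"))

# One boolean per row, so df.loc receives a proper mask instead of a single False
mask = haversine_m(lat, lon, ref_lat, ref_lon) < 800
nearby_ids = df.loc[mask, "id"].tolist()
```

The point of the sketch is that the comparison produces a boolean Series aligned with df, which is what df.loc expects.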
EDIT: I have 500k different defined locations, and I would prefer not to store the distance between every row in df and every defined location. Is it possible to select every location that is less than 800 meters away without storing the distances?
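To make the constraint concrete, here is a rough sketch of one way this could be done, assuming NumPy: the coordinates are parsed once, and a generator handles one defined location at a time, so only a temporary array of len(df) distances exists at any moment. The names parse_point and ids_within are hypothetical, and the sample data is a subset of the rows above.

```python
import numpy as np

EARTH_RADIUS_M = 6372800  # Earth radius in meters, as in the function above

def parse_point(text):
    """Turn "(lat;lon)" into a (lat, lon) tuple of floats (hypothetical helper)."""
    lat, lon = text.strip("()").split(";")
    return float(lat), float(lon)

def ids_within(ids, latlongs, defined_locations, radius_m=800):
    """Yield (defined_location, nearby ids); distances are discarded each step."""
    ids = np.asarray(ids)
    points = np.radians([parse_point(p) for p in latlongs])  # shape (n, 2)
    phi, lam = points[:, 0], points[:, 1]
    cos_phi = np.cos(phi)  # computed once, reused for every defined location
    for location in defined_locations:
        phi0, lam0 = np.radians(parse_point(location))
        a = (np.sin((phi - phi0) / 2) ** 2
             + cos_phi * np.cos(phi0) * np.sin((lam - lam0) / 2) ** 2)
        distance = 2 * EARTH_RADIUS_M * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
        yield location, ids[distance < radius_m].tolist()

# A few rows taken from the dataframe shown above
ids = [1, 5, 6, 7, 10]
latlongs = ["(46.1988400;5.209562)", "(46.438805;5.11890299)",
            "(46.222993;5.21707600)", "(46.195183;5.212575)",
            "(48.917459;2.570821)"]
defined_locations = ["(46.1988400;5.209562)", "(48.917459;2.570821)"]

result = dict(ids_within(ids, latlongs, defined_locations))
```

With the real data, df["id"] and df["id_latlong"] could be passed in place of the two lists. This still computes every pairwise distance, it just never keeps them; a spatial index such as scikit-learn's BallTree with the haversine metric might reduce the work, but I have not tried it here.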