I'm looking for a way to add a new column reporting the minimum distance (km) under a condition.
It will be clearer with an example:
Ser_Numb LAT LONG VALUE MIN
1 74.166061 30.512811 1
2 72.249672 33.427724 1
3 67.499828 37.937264 0
4 84.253715 69.328767 1
5 72.104828 33.823462 0
6 63.989462 51.918173 0
7 80.209112 33.530778 0
8 68.954132 35.981256 1
9 83.378214 40.619652 1
10 68.778571 6.607066 0
So when VALUE=0, I have to find the closest other city (by latitude/longitude) that has VALUE=1 and compute the distance to that city.
This Stack Overflow post gives the formula, but how can I adapt it to take the minimal distance?
from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    # Radius of earth in kilometers is 6371
    km = 6371 * c
    return km
EDIT: Here is what I tried:
df['dist_VALUE'] = 0.0
ones = df[df['VALUE'] > 0]
# for each row with VALUE == 0, keep the smallest distance to any row with VALUE == 1
for i in df.index[df['VALUE'] < 1]:
    df.loc[i, 'dist_VALUE'] = min(
        haversine(df.loc[i, 'LONG'], df.loc[i, 'LAT'], lon, lat)
        for lon, lat in zip(ones['LONG'], ones['LAT']))
VALUE is an integer, and LAT and LONG are floats.
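For reference, here is a minimal sketch of one way the minimum could be computed without explicit Python loops, assuming NumPy and pandas are available. The names `haversine_np`, `add_min_distance` and `EARTH_RADIUS_KM` are made up for this illustration. The result is written to the MIN column from the example table, and rows with VALUE=1 are assumed to stay empty (NaN). The idea is to broadcast the VALUE=0 rows against the VALUE=1 rows to get a full distance matrix, then take the row-wise minimum.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # same radius as in the haversine function above


def haversine_np(lon1, lat1, lon2, lat2):
    """Great-circle distance in km; inputs in decimal degrees, scalars or arrays."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    # clip guards against tiny floating-point overshoot outside [0, 1]
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def add_min_distance(df):
    """Return a copy where MIN holds, for each VALUE == 0 row,
    the distance (km) to the nearest VALUE == 1 row (NaN elsewhere)."""
    out = df.copy()
    out["MIN"] = np.nan
    zeros = out["VALUE"] == 0
    ones = out["VALUE"] == 1
    if zeros.any() and ones.any():
        # Broadcasting a column vector against a row vector gives a
        # (number of zeros) x (number of ones) distance matrix.
        dist = haversine_np(
            out.loc[zeros, "LONG"].to_numpy()[:, None],
            out.loc[zeros, "LAT"].to_numpy()[:, None],
            out.loc[ones, "LONG"].to_numpy()[None, :],
            out.loc[ones, "LAT"].to_numpy()[None, :],
        )
        out.loc[zeros, "MIN"] = dist.min(axis=1)
    return out


# The data from the example table
df = pd.DataFrame({
    "Ser_Numb": range(1, 11),
    "LAT": [74.166061, 72.249672, 67.499828, 84.253715, 72.104828,
            63.989462, 80.209112, 68.954132, 83.378214, 68.778571],
    "LONG": [30.512811, 33.427724, 37.937264, 69.328767, 33.823462,
             51.918173, 33.530778, 35.981256, 40.619652, 6.607066],
    "VALUE": [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
})

result = add_min_distance(df)
print(result)
```

The full distance matrix needs memory proportional to (rows with VALUE=0) × (rows with VALUE=1), so this is only reasonable for small to medium tables. For very large ones a spatial index would probably be a better fit.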