I have a dataframe that contains origin-destination trips of users between different points (latitude/longitude). Each trip has the columns Origin_X, Origin_Y and Destination_X, Destination_Y.
df:
Trip  Origin_X   Origin_Y   Destination_X  Destination_Y
1     -33.55682  -70.78614  -33.44007      -70.6552
2     -33.49097  -70.77741  -33.48908      -70.76263
3     -33.37108  -70.6711   -33.73425      -70.76278
I want to group together all the trips that are within a radius of 1 km of each other at both the origin and the destination. Two trips can be grouped if the distance between their origins and the distance between their destinations are both d <= 1 km. To compute the distance between two coordinates I am using the haversine function:
from math import radians, sin, cos, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371  # Radius of earth in kilometers. Use 3956 for miles
    return c * r
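
To make the grouping rule concrete, here is a minimal sketch of one possible approach, not a definitive solution. It compares every pair of trips and merges the pairs that satisfy the 1 km condition at both ends, using a small union-find. The helper name group_trips and its max_km parameter are hypothetical. The sketch also rests on two assumptions: judging by the sample values, the X columns hold latitude and the Y columns hold longitude, and grouping is transitive (if trip A links to B and B links to C, all three share a group even if A and C are more than 1 km apart).

```python
from math import radians, sin, cos, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * asin(sqrt(a)) * 6371

def group_trips(trips, max_km=1.0):
    """Hypothetical helper: return one group id per trip.

    trips is a list of (origin_x, origin_y, dest_x, dest_y) tuples,
    where x is ASSUMED to be latitude and y longitude.
    """
    parent = list(range(len(trips)))  # union-find forest, one node per trip

    def find(i):
        # Follow parent links to the root, shortening the path on the way
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(trips)):
        ox1, oy1, dx1, dy1 = trips[i]
        for j in range(i + 1, len(trips)):
            ox2, oy2, dx2, dy2 = trips[j]
            # haversine expects (lon, lat), so y is passed before x
            if (haversine(oy1, ox1, oy2, ox2) <= max_km
                    and haversine(dy1, dx1, dy2, dx2) <= max_km):
                parent[find(j)] = find(i)  # merge the two groups

    # Relabel roots as consecutive ids in order of first appearance
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(trips))]

trips = [
    (-33.55682, -70.78614, -33.44007, -70.6552),
    (-33.49097, -70.77741, -33.48908, -70.76263),
    (-33.37108, -70.6711, -33.73425, -70.76278),
]
groups = group_trips(trips)
```

With the three sample rows, every pair of origins is several kilometres apart, so each trip ends up in its own group. The rows could be taken from the dataframe with df[["Origin_X", "Origin_Y", "Destination_X", "Destination_Y"]].itertuples(index=False), and the result assigned back as a new column. The pairwise loop is quadratic in the number of trips, so for a large dataframe a spatial index would likely be needed.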