Adapted from this answer. The linked answer shows how to compute distance between each row and some fixed value for longitude/latitude - my adaptation allows it to work for your case.
First, get all the values you need on the same row using shift
:
df['lon2'] = df['lon'].shift(-1)
df['lat2'] = df['lat'].shift(-1)
Giving:
id lat lon lat2 lon2
0 1 NaN NaN 40.121 23.749
1 1 40.121 23.749 -56.154 -39.572
2 1 -56.154 -39.572 21.908 17.537
3 1 21.908 17.537 31.221 -36.186
4 1 31.221 -36.186 -56.655 0.016
5 1 -56.655 0.016 NaN NaN
6 2 NaN NaN -36.438 14.874
7 2 -36.438 14.874 -21.422 81.271
8 2 -21.422 81.271 43.961 -95.551
9 2 43.961 -95.551 NaN NaN
10 3 NaN NaN 79.821 -56.781
11 3 79.821 -56.781 NaN NaN
Then define function for distance calculation:
from numpy import cos, sin, arcsin, sqrt
from math import radians
def haversine(row):
lon1 = row['lon']
lat1 = row['lat']
lon2 = row['lon2']
lat2 = row['lat2']
lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
dlon = lon2 - lon1
dlat = lat2 - lat1
a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
c = 2 * arcsin(sqrt(a))
km = 6367 * c
return km
And apply it to your data using apply
:
df['distance'] = df.apply(haversine, axis=1)
Giving:
id lat lon lat2 lon2 distance
0 1 NaN NaN 40.121 23.749 NaN
1 1 40.121 23.749 -56.154 -39.572 12237.017692
2 1 -56.154 -39.572 21.908 17.537 10187.684397
3 1 21.908 17.537 31.221 -36.186 5387.540299
4 1 31.221 -36.186 -56.655 0.016 10343.267833
5 1 -56.655 0.016 NaN NaN NaN
6 2 NaN NaN -36.438 14.874 NaN
7 2 -36.438 14.874 -21.422 81.271 6543.302199
8 2 -21.422 81.271 43.961 -95.551 17480.809345
9 2 43.961 -95.551 NaN NaN NaN
10 3 NaN NaN 79.821 -56.781 NaN
11 3 79.821 -56.781 NaN NaN NaN
Which I believe shows the results you are looking for (I tested the first one and it seems to be correct).
If you like, you can get rid of the two secondary lat/lon columns once the calculations are complete:
df.drop(['lat2', 'lon2'], axis=1, inplace=True)
I should note that this solution will not give you the fastest possible calculations. See the second half of the answer I linked for an exploration of how this can be improved if performance is top priority here, although it will need to be adapted.