Apply haversine function to four columns to create a new column

Question

    ID        st_lat    st_lng        end_lat   end_lng
0   4       127.035740  37.493954   127.035740  37.493954
1   4       127.035740  37.493954   127.035740  37.493954
2   5       127.034870  37.485865   127.034318  37.485645
3   5       127.034201  37.485598   127.035064  37.485949
4   5       127.035064  37.485949   127.034618  37.485938

My dataframe looks like the one above. I am trying to create a new column by applying a haversine function, which requires two tuples, e.g. haversine( (lat, lng), (lat, lng) ) returns the distance between two points.

The columns are all floats. Following https://www.geeksforgeeks.org/create-a-new-column-in-pandas-dataframe-based-on-the-existing-columns/ I tried

df["distance(km)"] = df.apply(lambda row:haversine((row.st_lat, row.st_lng), (row.end_lat, row.end_lng)))

which returns

AttributeError: ("'Series' object has no attribute 'st_lat'", 'occurred at index user_id')

and

df["distance(km)"] = haversine((df.st_lat, df.st_lng), (df.end_lat, df.end_lng))

returning TypeError: cannot convert the series to float.

I know this is because df.st_lat is a Series, and I cannot put two Series into a tuple where haversine expects two floats.

For each (st_lat, st_lng) pair I want to compute the distance to the corresponding (end_lat, end_lng) pair and store the distances in a new column.

Any help? I've looked at "how to split column of tuples in pandas dataframe?" and "Split Column containing 2 values into different column in pandas df", which are the opposite of what I am trying to do.

EDIT: solved by using

    def dist(df):
        return haversine(df["start"], df["end"])

    df["distance(km)"] = df.apply(dist, axis=1)
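The AttributeError in the first attempt is consistent with DataFrame.apply defaulting to axis=0, which passes each column (not each row) to the function; passing axis=1 is what makes row.st_lat available. Below is a minimal sketch of the row-wise approach applied directly to the four columns. Note that haversine_km is a hypothetical pure-Python stand-in written for this example (not the haversine package's function), and the 6371 km Earth radius is an assumption.

```python
import math
import pandas as pd

def haversine_km(p1, p2, radius_km=6371.0):
    """Hypothetical stand-in: great-circle distance in km between two
    (lat, lng) tuples given in degrees. Radius is an assumed mean value."""
    lat1, lng1, lat2, lng2 = map(math.radians, (*p1, *p2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Two rows copied from the question's sample data.
df = pd.DataFrame({
    "ID": [4, 5],
    "st_lat": [127.035740, 127.034870],
    "st_lng": [37.493954, 37.485865],
    "end_lat": [127.035740, 127.034318],
    "end_lng": [37.493954, 37.485645],
})

# axis=1 passes one row at a time, so each attribute is a scalar float.
df["distance(km)"] = df.apply(
    lambda row: haversine_km((row.st_lat, row.st_lng), (row.end_lat, row.end_lng)),
    axis=1,
)
print(df)
```

This avoids building intermediate tuple columns, at the cost of one Python-level function call per row.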

score 2 · Answer 1 · answered Sep 30 '19 at 04:01

You can use a vectorized NumPy version of the haversine function (link):

df["distance(km)"] = haversine_np(df.st_lat, df.st_lng, df.end_lat, df.end_lng)


df

   ID      st_lat     st_lng     end_lat    end_lng  distance(km)
0   4  127.035740  37.493954  127.035740  37.493954  0.000000
1   4  127.035740  37.493954  127.035740  37.493954  0.000000
2   5  127.034870  37.485865  127.034318  37.485645  0.063084
3   5  127.034201  37.485598  127.035064  37.485949  0.098737
4   5  127.035064  37.485949  127.034618  37.485938  0.049567
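Since haversine_np is not defined in the answer itself, here is a minimal sketch of what such a vectorized function might look like. The argument order (lat1, lng1, lat2, lng2) and the 6367 km Earth radius are assumptions chosen to match the call and the output shown above; the linked implementation may differ. One caveat: the sample values near 127 in st_lat and end_lat are outside the valid latitude range of ±90 degrees, so the columns may be swapped in the data; the sketch uses them as given.

```python
import numpy as np
import pandas as pd

def haversine_np(lat1, lng1, lat2, lng2, radius_km=6367.0):
    """Sketch of a vectorized haversine: accepts scalars, arrays, or Series
    in degrees and returns distances in km. Argument order and radius are
    assumptions, not taken from the linked source."""
    lat1, lng1, lat2, lng2 = map(np.radians, (lat1, lng1, lat2, lng2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2.0) ** 2)
    return 2.0 * radius_km * np.arcsin(np.sqrt(a))

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [4, 4, 5, 5, 5],
    "st_lat": [127.035740, 127.035740, 127.034870, 127.034201, 127.035064],
    "st_lng": [37.493954, 37.493954, 37.485865, 37.485598, 37.485949],
    "end_lat": [127.035740, 127.035740, 127.034318, 127.035064, 127.034618],
    "end_lng": [37.493954, 37.493954, 37.485645, 37.485949, 37.485938],
})

# Whole columns go in at once; NumPy does the arithmetic element-wise.
df["distance(km)"] = haversine_np(df.st_lat, df.st_lng, df.end_lat, df.end_lng)
print(df)
```

Because the computation runs on whole columns, there is no per-row Python call, which generally makes this faster than df.apply(..., axis=1) on large frames.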
