
I have a data frame with 4 columns: the first two hold the coordinates of the first point (x1, y1) and the last two hold the coordinates of the second point (x2, y2). I need to calculate the distance between these two points and store it in another column. To calculate the distance I use geopy.distance.geodesic, and I need it to be fast, so I don't want to do it in a loop. Can I do something like this in pandas?

df['distance'] = df['x1', 'y1', 'x2', 'y2'].map(lambda x,y,z,w = geodesic((x, y), (z, w)))

  • Pass the relevant series directly to a Haversine function: https://stackoverflow.com/a/51722117/2741091. Otherwise apply your `geopy` function. – ifly6 May 31 '23 at 13:05
  • your line of code should work. Did you try it??? – gtomer May 31 '23 at 13:28

2 Answers


If you want to use multiple fields of a dataframe row, you need to use something like:

df['dx'] = df.apply(lambda row: dx_function(row), axis=1)

in your main program. With axis=1, each row is passed in turn to the function dx_function.

Inside dx_function, read the fields you need from the row:

x1 = row['x1']
y1 = row['y1']

The function then returns the result, which apply() collects into the new column.

If you wanted to modify just one field, use:

df['name'] = df['name'].map(lambda x: x.lower())

This would change the text in the column 'name' to lower case.
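A small sketch of the single-column case, with made-up names as sample data. The .str.lower() accessor is shown only as a built-in alternative for this particular transformation:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'name': ['Alice', 'BOB']})

# Series.map calls the lambda once per value in the column.
df['name'] = df['name'].map(lambda x: x.lower())

# Same result for string columns, using the pandas string accessor.
lowered = df['name'].str.lower()
```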

Hope this helps.

c t

Yes. To calculate the distance from the values in each row without writing an explicit loop, you can use apply() (see the pandas documentation) with a lambda function:

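df['distance'] = df.apply(lambda row: geodesic((row['x1'], row['y1']), (row['x2'], row['y2'])).km, axis=1)

This applies the lambda function to the pair of points in each row and stores the result in the new 'distance' column of the same row. geodesic returns a Distance object, so .km converts it to a plain number of kilometres. Note that geodesic expects each point as (latitude, longitude), and that apply() with axis=1 still calls the function once per row in Python, so it avoids an explicit loop but is not truly vectorized.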
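If the row-by-row call is too slow, the Haversine approach mentioned in the comments can work on whole columns at once. The following is only a sketch under stated assumptions: it treats the Earth as a sphere with a mean radius of 6371.0088 km, so its results differ slightly from geodesic's ellipsoidal distances, and it again assumes x is latitude and y is longitude in degrees. The function name haversine_km and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    # Convert degrees to radians; works on scalars, arrays and Series.
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine formula for the great-circle distance.
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# Hypothetical sample data: one degree east and one degree north of (0, 0).
df = pd.DataFrame({
    'x1': [0.0, 0.0], 'y1': [0.0, 0.0],
    'x2': [0.0, 1.0], 'y2': [1.0, 0.0],
})

# Whole columns are passed in, so NumPy does the work without a Python-level loop.
df['distance'] = haversine_km(df['x1'], df['y1'], df['x2'], df['y2'])
```

Whether the spherical approximation is acceptable depends on how much accuracy the application needs; when it is not, the apply() version with geodesic remains the safer choice.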

Amir