I'm looking for the fastest way to compute the distance between two latitude/longitude pairs. One pair comes from the user and the other from a marker. Below is my code:

import geopy.distance  # import the submodule explicitly so geopy.distance.geodesic resolves
import pandas as pd


marker = pd.read_csv(file_path)
coords_2 = (4.620881605,101.119911)
marker['Distance'] = round(geopy.distance.geodesic((marker['Latitude'].values,marker['Longitude'].values), (coords_2)).m,2)

Previously, I used apply, which is extremely slow:

marker['Distance2'] = marker.apply(lambda x: round(geopy.distance.geodesic((x.Latitude,x.Longitude), (coords_2)).m,2), axis = 1)

Then I tried to vectorize the call by passing the pandas Series values directly:

marker['Distance'] = round(geopy.distance.geodesic((marker['Latitude'].values,marker['Longitude'].values), (coords_2)).m,2)

I get this error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I added all() and any() to test (i.e. marker['Latitude'].values.all(), marker['Longitude'].values.all(), and likewise with any()). However, both any() and all() produced entirely wrong results.

This is my result:

    Latitude    Longitude   Distance    Distance2
0   4.620882    101.119911  11132307.42 0.00
1   4.620125    101.120399  11132307.42 99.72
2   4.619368    101.120885  11132307.42 199.26

where Distance is the vectorized result, which is INCORRECT, whereas Distance2 is the result from apply, which is CORRECT. In short, Distance2 is my expected outcome.
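
A likely reason every row gets the same wrong value, sketched with NumPy only: all() and any() reduce the whole column to a single boolean, so the coordinates themselves are lost. The step where that boolean becomes 1.0 assumes geopy coerces each coordinate with float(), which I have not checked for every version. The sample arrays are just the three rows shown above.

```python
import numpy as np

# The three sample rows from the result table
lat = np.array([4.620882, 4.620125, 4.619368])
lon = np.array([101.119911, 101.120399, 101.120885])

# all() (and any()) collapse the whole array into ONE boolean
lat_flag = lat.all()
lon_flag = lon.all()
print(lat_flag, lon_flag)   # True True

# Assumption: the distance code converts each coordinate with float(),
# so the boolean silently becomes the coordinate 1.0
print(float(lat_flag), float(lon_flag))   # 1.0 1.0
```

If that assumption holds, the call measures from the single point (1.0, 1.0) to coords_2, which is roughly consistent with the constant value of about 11,132 km in the Distance column.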

I want a faster way to produce the correct output WITHOUT USING apply.
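
For reference, one possible way to avoid apply is to compute the distance with NumPy array operations instead of calling geopy per row. The sketch below swaps in the haversine formula, which treats the Earth as a sphere, so it is not the same calculation as geopy's geodesic and the results can differ by a fraction of a percent. The function name haversine_m and the mean-radius constant are my own choices for illustration.

```python
import numpy as np

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical approximation)

def haversine_m(lat, lon, lat0, lon0):
    """Great-circle distance in metres from each (lat, lon) to (lat0, lon0).

    lat and lon may be NumPy arrays; lat0 and lon0 are scalars in degrees.
    """
    lat, lon = np.radians(lat), np.radians(lon)
    lat0, lon0 = np.radians(lat0), np.radians(lon0)
    dlat = lat - lat0
    dlon = lon - lon0
    a = np.sin(dlat / 2) ** 2 + np.cos(lat) * np.cos(lat0) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))

# The three sample rows from the result table
lat = np.array([4.620882, 4.620125, 4.619368])
lon = np.array([101.119911, 101.120399, 101.120885])
coords_2 = (4.620881605, 101.119911)

# One call handles every row at once
distance = np.round(haversine_m(lat, lon, *coords_2), 2)
print(distance)
```

With the DataFrame from the question, the arrays would come from marker['Latitude'].to_numpy() and marker['Longitude'].to_numpy(), and the result could be assigned to marker['Distance']. If exact ellipsoidal distances are required, this approximation may not be accurate enough.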

Martin Gergov
dee
  • try swifter or modin pandas with apply function https://github.com/jmcarpenter2/swifter https://github.com/modin-project/modin – Mohit Sharma Feb 26 '20 at 08:53
  • I found the other question with a search for *geopy + numpy*. Even if it is now old, I still think it relevant, specifically the `geopandas` part. Ping me in a comment if it is not enough to solve your problem and just tell me why if you want me to reopen the question. – Serge Ballesta Feb 26 '20 at 09:03
  • Can you share the other one as well? I'm looking through it now @SergeBallesta – dee Feb 26 '20 at 09:05
  • It seems impossible to install geopandas! – dee Feb 26 '20 at 09:24
  • @SergeBallesta your suggestion didn't work. Sorry – dee Feb 26 '20 at 09:48
  • I have reopened the question, but I am afraid that you will not find better answers than the ones from [How to use Vectorization with NumPy arrays to calculate geodesic distance using Geopy library for a large dataset?](https://stackoverflow.com/q/50275057/3545273) – Serge Ballesta Feb 26 '20 at 10:02
  • I'm referring to the one you suggested at the moment. – dee Feb 27 '20 at 00:28

0 Answers