How do you get the distance in kilometers using the haversine pairwise function in the sklearn library? Compared with the example at https://stackoverflow.com/a/38685263/8378399, the numbers returned by scikit-learn are not correct, which leads me to believe I'm not calling it correctly.

Sample code:

from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r

from sklearn.neighbors import DistanceMetric
dist = DistanceMetric.get_metric('haversine')

paris = (48.8566, 2.3522)
lyon = (45.7640, 4.8357)

hdist = haversine(paris[1],paris[0], lyon[1], lyon[0])
skdist = dist.pairwise([paris], [lyon]) * 6371

# Returns: The distance between Paris and Lyon is 391km. sklearn=17766km
"The distance between Paris and Lyon is {0:.3g}km. sklearn={1:.5g}km".format(hdist, skdist[0][0])

1 Answer

From the sklearn docs:

Note that the haversine distance metric requires data in the form of [latitude, longitude] and both inputs and outputs are in units of radians.
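
To see what that note means for the numbers in the question, here is a minimal standard-library sketch. The helper `unit_haversine` is a hypothetical name, not part of sklearn; it uses the same formula as the question's `haversine` function but skips the degree conversion and the Earth radius, which is assumed to mirror what the sklearn metric does with whatever values it is handed.

```python
from math import asin, cos, radians, sin, sqrt

def unit_haversine(lat1, lon1, lat2, lon2):
    """Great-circle angle on the unit sphere; arguments are treated as radians."""
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * asin(sqrt(a))

# (latitude, longitude) in decimal degrees, as in the question
paris = (48.8566, 2.3522)
lyon = (45.7640, 4.8357)

# Degrees passed straight through, so they are misread as radians
wrong_km = unit_haversine(*paris, *lyon) * 6371

# Degrees converted to radians first
right_km = unit_haversine(*map(radians, paris + lyon)) * 6371

print(f"degrees misread as radians: {wrong_km:.5g} km")
print(f"converted to radians first: {right_km:.3g} km")
```

The first figure should land near the 17766 km reported in the question and the second near the expected 391 km, which suggests the only problem is the missing conversion.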

So, convert latitude and longitude to radians before applying the function:

 import numpy as np

 skdist = dist.pairwise(np.radians([paris]), np.radians([lyon])) * 6371
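
As a self-contained variant, the sketch below uses `sklearn.metrics.pairwise.haversine_distances`, which takes the same [latitude, longitude] radian input. It assumes a scikit-learn version recent enough to provide that function, and the 6371 km radius is the same approximation used in the question.

```python
import numpy as np
from sklearn.metrics.pairwise import haversine_distances

EARTH_RADIUS_KM = 6371  # mean Earth radius, same value as in the question

# (latitude, longitude) in decimal degrees, as in the question
paris = (48.8566, 2.3522)
lyon = (45.7640, 4.8357)

# Convert every coordinate to radians before calling sklearn
points_rad = np.radians([paris, lyon])

# The result is a matrix of angles on the unit sphere; scale by the radius for km
dist_km = haversine_distances(points_rad) * EARTH_RADIUS_KM

paris_lyon_km = dist_km[0, 1]
print(f"Paris to Lyon: {paris_lyon_km:.3g} km")
```

Passing a single array returns the full pairwise matrix, so the diagonal is zero and the off-diagonal entries hold the Paris to Lyon distance.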
Viktoriya Malyasova