Why is my code calculating an incorrect distance between transformed coordinates?

Question

I have a large file of points and I am trying to find the distance between those points and another set of points. Initially I used the to_crs function of geopandas to convert the crs so that I can get an accurate distance measure in terms of meters when I do df.distance(point). However, since the file is very large, it took too long just to convert the crs of the file. The code was running for 2 hours and it still did not finish converting. Therefore, I used this code instead.

inProj = Proj(init='epsg:4326')
outProj = Proj(init='epsg:4808')

for index, row in demand_indo_gdf.iterrows():
    o = Point(row['origin_longitude'], row['origin_latitude'])
    o_proj = Point(transform(inProj, outProj, o.x, o.y))

    for i, r in bus_indo_gdf.iterrows():
        stop = r['geometry']
        stop_proj = Point(transform(inProj, outProj, stop.x, stop.y))
        print ('distance:', o_proj.distance(stop_proj), '\n\n')

I thought it may be faster to individually convert the crs and carry out my analysis. For this set of points:

o = (106.901024 -6.229162)
stop = (106.804 -6.21861)

I converted this EPSG 4326 coordinates to the local projection, EPSG 4808, and got this:

o_proj = (0.09183386384156803 -6.229330112968891)
stop_proj = (-0.005201753272169649 -6.218776788266844)

This gave a distance measure of 0.09760780527657992. Google maps gave me a distance measure, for coordinates o and stop, of 10.79km. It looks like the distance measure of my code gives an answer that is 10^-3 times smaller than the actual distance. Why is that so? Is my code correct?

score 1 · Accepted Answer · answered Jun 19 '19 at 07:03

The coordinates you are using are in degrees, not in meters.

The Points class that you are using probably does not care about this and calculates the cartesian distance between them.

Please use the Haversine equation, e.g. the one from here:

from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles
    return c * r

Then your code returns the correct result:

from pyproj import Proj, transform

inProj = Proj(init='epsg:4326')
outProj = Proj(init='epsg:4808')

o = (106.901024, -6.229162)
stop = (106.804, -6.21861)

o_proj = transform(inProj, outProj, o[0], o[1])
stop_proj = transform(inProj, outProj, stop[0], stop[1])

print(haversine(o_proj[0], o_proj[1], stop_proj[0], stop_proj[1]))

Why is my code calculating an incorrect distance between transformed coordinates?

1 Answers1