I have a large file of points, and I am trying to find the distance between those points and another set of points. Initially I used the to_crs function of geopandas to convert the CRS so that df.distance(point) would give an accurate distance in meters. However, because the file is very large, converting its CRS took too long: the code ran for 2 hours and still had not finished. Therefore, I used this code instead:
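from pyproj import Proj, transform
from shapely.geometry import Point

# Source CRS (EPSG:4326) and target CRS (EPSG:4808)
inProj = Proj(init='epsg:4326')
outProj = Proj(init='epsg:4808')

for index, row in demand_indo_gdf.iterrows():
    # Build the origin point and convert it to the target CRS
    o = Point(row['origin_longitude'], row['origin_latitude'])
    o_proj = Point(transform(inProj, outProj, o.x, o.y))
    for i, r in bus_indo_gdf.iterrows():
        # Convert each bus stop and print its distance to the origin
        stop = r['geometry']
        stop_proj = Point(transform(inProj, outProj, stop.x, stop.y))
        print('distance:', o_proj.distance(stop_proj), '\n\n')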
I thought it might be faster to convert the CRS of each point individually and then carry out my analysis. For this pair of points:
o = (106.901024 -6.229162)
stop = (106.804 -6.21861)
I converted these EPSG:4326 coordinates to the local projection, EPSG:4808, and got this:
o_proj = (0.09183386384156803 -6.229330112968891)
stop_proj = (-0.005201753272169649 -6.218776788266844)
This gave a distance of 0.09760780527657992. Google Maps gave a distance of 10.79 km between o and stop. It looks like my code gives an answer that is roughly 10^-3 times the actual distance. Why is that? Is my code correct?
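As a sanity check on the numbers above, here is a minimal sketch that uses only the Python standard library. It recomputes the planar distance between the two converted coordinates and, separately, a great-circle (haversine) distance in meters from the original longitude/latitude values. The haversine_m helper and the mean Earth radius constant are my own assumptions for illustration and are not part of the code above; a spherical model only approximates the true ellipsoidal distance.

```python
import math

# Coordinates copied from the question, as (longitude, latitude) in degrees.
o = (106.901024, -6.229162)
stop = (106.804, -6.21861)

# Converted coordinates reported in the question.
o_proj = (0.09183386384156803, -6.229330112968891)
stop_proj = (-0.005201753272169649, -6.218776788266844)

# Assumption: mean Earth radius in meters for a spherical approximation.
EARTH_RADIUS_M = 6371008.8


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters on a sphere (hypothetical helper)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Straight-line distance between the converted coordinates, in their own units.
planar = math.hypot(o_proj[0] - stop_proj[0], o_proj[1] - stop_proj[1])

# Spherical distance between the original coordinates, in meters.
great_circle = haversine_m(o[0], o[1], stop[0], stop[1])

print('planar distance of converted coordinates:', planar)
print('great-circle distance in meters:', great_circle)
```

The planar figure reproduces the 0.0976 value, and the great-circle figure comes out close to the Google Maps distance. That would be consistent with the converted coordinates still being expressed in degrees rather than meters, although I have not confirmed this against the CRS definition.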