
This is quite a simple case, but I have not found an easy way to do it so far. The idea is to get the set of distances between all the points defined in one GeoDataFrame and those defined in another GeoDataFrame.

import geopandas as gpd
import pandas as pd

# random coordinates
gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))
print(gdf_1)
print(gdf_2)

#  distances are calculated elementwise
print(gdf_1.distance(gdf_2))

This produces the element-wise distance between the points in gdf_1 and gdf_2 that share the same index (along with a warning because the two GeoSeries do not have the same index, which will be the case for my data).

                geometry
0    POINT (0.000 0.000)
1   POINT (0.000 90.000)
2  POINT (0.000 120.000)
                    geometry
0    POINT (0.00000 0.00000)
1  POINT (0.00000 -90.00000)
/home/seydoux/anaconda3/envs/chelyabinsk/lib/python3.8/site-packages/geopandas/base.py:39: UserWarning: The indices of the two GeoSeries are different.
  warn("The indices of the two GeoSeries are different.")
0      0.0
1    180.0
2      NaN

The question is: how can I get a series of all point-to-point distances (or at least of the unique combinations of the indices of gdf_1 and gdf_2, since distance is symmetric)?

EDIT

  • In this post, the solution is given for a single pair of points, but I cannot find a straightforward way to combine all points in two datasets.

  • In this post only element-wise operations are proposed.

  • An analogous question was also raised on the GitHub repo of geopandas. One of the proposed solutions is to use the apply method, but no detailed answer is given.

Leonard
  • Did you search? I recall many questions/answers regarding calculating distances between all combinations of two sets of coordinates (geodetic or otherwise) that reside in arrays, lists, DataFrames. Your question is either too broad or probably a duplicate; and maybe off topic with the request for other libraries. – wwii Nov 09 '20 at 15:04
  • Yes, I did. I will put all related posts in the question. No answer for the combination case I raise here. – Leonard Nov 09 '20 at 15:15
  • The problem you are trying to solve is applying a function to all *combinations* of coordinates between two dataframes? And the part you are stuck on is *getting* the combinations? – wwii Nov 09 '20 at 15:28
  • That is correct @wwii. I am wondering (1) if such function would exist already or (2) how to combine all combinations of coordinates between two dataframes. – Leonard Nov 09 '20 at 15:35
  • Related: [Distance matrix between two point layers](https://stackoverflow.com/questions/58713739/distance-matrix-between-two-point-layers) – wwii Nov 09 '20 at 19:56
  • I don't have geopandas and I can't tell if the distance method will handle broadcasting but try this: `gdf_1['geometry'].distance(gdf_2['geometry'].values[:,None])` – wwii Nov 09 '20 at 19:59
  • Thanks for your help. I tried but it does not work (returns `ValueError: 'data' should be a 1-dimensional array of geometry objects.`). As in the answer provided by @martinfleis, a neat solution is to use the `apply` method. – Leonard Nov 09 '20 at 20:05

1 Answer


You have to apply a function over each geometry in the first gdf to get its distance to all geometries in the second gdf.

import geopandas as gpd
import pandas as pd

# random coordinates
gdf_1 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0, 0], [0, 90, 120]))
gdf_2 = gpd.GeoDataFrame(geometry=gpd.points_from_xy([0, 0], [0, -90]))

gdf_1.geometry.apply(lambda g: gdf_2.distance(g))
      0      1
0    0.0   90.0
1   90.0  180.0
2  120.0  210.0
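
For point geometries only, the same matrix can probably also be built without apply, using NumPy broadcasting on the raw coordinates. This is a minimal sketch of a different technique, not part of the apply approach above. It assumes planar (Euclidean) distances in the units of the coordinates, which is also what the distance method computes. The helper name pairwise_planar_distances is hypothetical, and the coordinates are hard-coded here; with real GeoDataFrames they could be taken from gdf_1.geometry.x, gdf_1.geometry.y, and so on.

```python
import numpy as np
import pandas as pd


def pairwise_planar_distances(x1, y1, x2, y2):
    """Return an (n, m) array of Euclidean distances between two point sets.

    Hypothetical helper: row i, column j holds the distance between
    point i of the first set and point j of the second set.
    """
    x1, y1, x2, y2 = (np.asarray(a, dtype=float) for a in (x1, y1, x2, y2))
    # Broadcasting an (n, 1) column against a (1, m) row gives every pair.
    dx = x1[:, None] - x2[None, :]
    dy = y1[:, None] - y2[None, :]
    return np.hypot(dx, dy)


# Same coordinates as gdf_1 and gdf_2 in the question.
x1, y1 = [0, 0, 0], [0, 90, 120]
x2, y2 = [0, 0], [0, -90]

dist = pairwise_planar_distances(x1, y1, x2, y2)

# Long format: one value per (gdf_1 index, gdf_2 index) combination.
# ravel() is row-major, which matches the ordering of from_product.
pair_index = pd.MultiIndex.from_product(
    [range(len(x1)), range(len(x2))], names=["gdf_1", "gdf_2"]
)
pairs = pd.Series(dist.ravel(), index=pair_index, name="distance")

print(pd.DataFrame(dist))
print(pairs)
```

The long-format Series is the "series of all point-to-point distances" asked for in the question; pairs.loc[(2, 1)] looks up the distance between point 2 of the first set and point 1 of the second. For non-point geometries the broadcasting shortcut does not apply, and the apply version remains the simpler choice.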
martinfleis