I have two GeoDataFrames. The first contains a polygon column, and the other contains points (latitude and longitude). I want to check whether the coordinates I was given are inside the polygon of the corresponding city. Please see the images below for both dataframes.

GDF_1 contains polygon/multi-polygon

GDF_2 contains city points (coordinate)

When an id in gdf_1.id equals an id in gdf_2.id, I want to use the within function to check whether the coordinate in gdf_2 is inside the polygon in gdf_1. For example, the code below returns True because the coordinate is within the polygon.

poly = gdf_1[gdf_1.id == '17085']['geometry'] 
p1 = gdf_2[gdf_2.id == '17085']['geometry']
p1.within(poly, align=False)

I've been having a hard time iterating over both dataframes and comparing them to each other. Is there any way for me to compare the two dataframes to each other?

Desired output: (This is just an example)

id gdf_2.geometry bool
17085 POINT(19.82092 41.32791) True
4505 POINT(153.02560 -27.46738) True
18526 POINT(145.12103 -37.85048) True
5049 POINT(146.36182 -41.18134) False
4484 POINT(150.84249 -33.80261) False
iplaygenji15

2 Answers

A similar question was answered some years ago. That solution matches points in a list to polygons in a series. I am copying a modified version of that solution which will work with two data frames.

In this code sample polydf is equivalent to your GDF_1 and pointdf is equivalent to your GDF_2. The "any" column in the example output is equivalent to your "bool" column.

from shapely.geometry import Point, Polygon
import geopandas

polydf = geopandas.GeoDataFrame({
    "polygons": ["A","B"],
    "geometry":[Polygon([(5, 5), (5, 13), (13, 13), (13, 5)]),
                Polygon([(10, 10), (10, 15), (15, 15), (15, 10)])]
})

pointdf = geopandas.GeoDataFrame({
    "points": ["a","b","c"],
    "geometry":[Point(3, 3), Point(8, 8), Point(11, 11)]
})

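# Add one boolean column per polygon: True where the point lies within that polygon
pointdf = pointdf.assign(**{row["polygons"]:pointdf.within(row["geometry"]) for index, row in polydf.iterrows()})

# True if the point lies within at least one polygon (only boolean columns are considered)
pointdf["any"] = pointdf.any(axis=1,bool_only=True).values

# to_markdown requires the optional tabulate package
print(pointdf.to_markdown())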

Output:

    points  geometry       A      B      any
0   a       POINT (3 3)    False  False  False
1   b       POINT (8 8)    True   False  True
2   c       POINT (11 11)  True   True   True

This solution iterates over the polygon table row by row. That practice is not optimal, so a faster solution may well exist.
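The example above tests every point against every polygon. For the question's case, where each point should only be tested against the polygon that shares its id, one possible sketch is to merge the two frames on id first and then compare the paired geometries row by row. The sample data, the id values, the `_poly` suffix, and the `inside` column name are all assumptions made for illustration.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical sample data: polygons and points share an "id" column.
polydf = gpd.GeoDataFrame(
    {"id": ["A", "B"]},
    geometry=[
        Polygon([(5, 5), (5, 13), (13, 13), (13, 5)]),
        Polygon([(10, 10), (10, 15), (15, 15), (15, 10)]),
    ],
)
pointdf = gpd.GeoDataFrame(
    {"id": ["B", "A", "B"]},
    geometry=[Point(3, 3), Point(8, 8), Point(11, 11)],
)

# Attach the polygon with the same id to every point row.
# The point geometry stays in "geometry"; the polygon lands in "geometry_poly".
merged = pointdf.merge(polydf, on="id", how="left", suffixes=("", "_poly"))

# Compare each point with the polygon on the same row (no index alignment needed).
merged["inside"] = merged.geometry.within(
    gpd.GeoSeries(merged["geometry_poly"]), align=False
)

print(merged[["id", "geometry", "inside"]])
```

Because the merge pairs the geometries up front, the within check runs once over the whole column instead of once per polygon.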

Juancheeto

The GeoPandas API docs on geopandas.GeoSeries.within are really great, and I recommend giving them a close read. If the two dataframes have indexes that can be aligned, passing align=True to many of the spatial operations tells GeoPandas to join on the index, essentially like a normal pandas join, and then perform the spatial operation (in this case, within) on each pair of geometries.

So I think the following should do exactly what you’re looking for:

# Test each point in gdf_2 against the polygon in gdf_1 that has the same id
gdf_2.set_index("id").within(
    gdf_1.set_index("id"),
    align=True,
)

This will be significantly faster than iterating over all rows.
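As a minimal, self-contained sketch of the index-aligned approach, assuming each id appears exactly once in both frames: the polygons, points, and ids below are made up for illustration, and the `bool` column name simply follows the desired output in the question.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical sample data: one polygon and one point per id,
# deliberately listed in a different order in each frame.
gdf_1 = gpd.GeoDataFrame(
    {"id": ["17085", "5049"]},
    geometry=[
        Polygon([(19, 41), (19, 42), (20, 42), (20, 41)]),
        Polygon([(0, 0), (0, 1), (1, 1), (1, 0)]),
    ],
)
gdf_2 = gpd.GeoDataFrame(
    {"id": ["5049", "17085"]},
    geometry=[Point(146.36182, -41.18134), Point(19.82092, 41.32791)],
)

points = gdf_2.set_index("id")
polygons = gdf_1.set_index("id")

# align=True pairs geometries by index label (the id), not by row position.
inside = points.geometry.within(polygons.geometry, align=True)

# Attach the result to the points; assign matches on the id index as well.
result = points.assign(**{"bool": inside}).reset_index()
print(result)
```

Depending on the GeoPandas version, a warning about differing indices may be emitted when the two frames list their ids in a different order; the pairing by id is unaffected.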

Michael Delgado