My two DataFrames look like this:
spatial_wi_df (contains POINTS)
root
|-- timestamp: timestamp (nullable = true)
|-- wagonnumber: long (nullable = true)
|-- latitude: double (nullable = true)
|-- longitude: double (nullable = true)
|-- speed: double (nullable = true)
|-- geometry: geometry (nullable = false)
|-- wi_geometry_meter: geometry (nullable = true)
spatial_station_groups_gdf (contains POLYGONS)
root
|-- geo_name: string (nullable = true)
|-- polygon: geometry (nullable = false)
Ultimately, I want to find which points from spatial_wi_df are contained in polygons from spatial_station_groups_gdf:
spatial_wi_df.createOrReplaceTempView("points")
spatial_station_groups_gdf.createOrReplaceTempView("geofences")
spatial_join_result = spark_sedona.sql("SELECT g.geo_name, p.wagonnumber FROM points AS p, geofences AS g WHERE ST_Contains(g.polygon, p.geometry)")
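To narrow down which polygons are actually broken, a validity check could be run before the join. This is only a minimal sketch: the helper name `invalid_polygons_query` is hypothetical, and it assumes the `geofences` temp view registered above and that `ST_IsValid` and `ST_AsText` are available in the installed Sedona version.

```python
def invalid_polygons_query(view="geofences", geom_col="polygon"):
    """Build a Sedona SQL query listing rows whose geometry is invalid.

    `view` and `geom_col` default to the names used in this question;
    the helper itself is hypothetical, not part of Sedona.
    """
    return (
        f"SELECT geo_name, ST_AsText({geom_col}) AS wkt "
        f"FROM {view} WHERE NOT ST_IsValid({geom_col})"
    )


# Usage, assuming the `spark_sedona` session and temp view from above:
# spark_sedona.sql(invalid_polygons_query()).show(truncate=False)
```

If this returns no rows, the polygons themselves may not be the cause of the error.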
I have already tried three approaches to fix the polygons, but every time I get the same error message:
Buffer Option
spatial_station_groups_gdf.createOrReplaceTempView("spatial_station_gdf_buffer")
spatial_station_groups_gdf = spark_sedona.sql("SELECT *, ST_Buffer(spatial_station_gdf_buffer.polygon, 0) AS polygon_buffered FROM spatial_station_gdf_buffer")
MakeValid
spatial_station_groups_gdf.createOrReplaceTempView("spatial_station_gdf_valid")
spatial_station_groups_gdf = spark_sedona.sql("SELECT *, spatial_station_gdf_valid.polygon FROM spatial_station_gdf_valid LATERAL VIEW ST_MakeValid(polygon, false) spatial_station_gdf_valid AS polygon_valid")
Draw ConvexHull
spatial_station_groups_gdf.createOrReplaceTempView("spatial_station_gdf_hull")
spatial_station_groups_gdf = spark_sedona.sql("SELECT *, ST_ConvexHull(spatial_station_gdf_hull.polygon) AS polygon_hull FROM spatial_station_gdf_hull")
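Note that each attempt above writes the repaired geometry to a new column (polygon_buffered, polygon_valid, polygon_hull), while the join query still reads g.polygon. A minimal sketch of pointing the join at a repaired column follows. The helper names `contains_join_query` and `join_on_repaired` are hypothetical, and the sketch assumes the views are re-registered after the repair step so that the new column is visible to the join.

```python
def contains_join_query(polygon_col="polygon"):
    """Build the point-in-polygon join, parameterised by the polygon column."""
    return (
        "SELECT g.geo_name, p.wagonnumber "
        "FROM points AS p, geofences AS g "
        f"WHERE ST_Contains(g.{polygon_col}, p.geometry)"
    )


def join_on_repaired(spark, points_df, geofences_df, polygon_col):
    """Re-register both views, then join on the chosen polygon column.

    `spark` is assumed to be a Sedona-enabled SparkSession such as
    `spark_sedona`; `geofences_df` must already contain `polygon_col`.
    """
    points_df.createOrReplaceTempView("points")
    geofences_df.createOrReplaceTempView("geofences")
    return spark.sql(contains_join_query(polygon_col))


# Usage, assuming the buffered DataFrame from the first approach:
# result = join_on_repaired(spark_sedona, spatial_wi_df,
#                           spatial_station_groups_gdf, "polygon_buffered")
```

Whether this changes the error depends on where it is raised, so it is only a way to make sure the repaired geometries are the ones being tested.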
Does anyone have experience with, or a solution for, this broken-polygon issue?