
My two DataFrames look like this:

spatial_wi_df (contains POINTS)

root
|-- timestamp: timestamp (nullable = true)
|-- wagonnumber: long (nullable = true)
|-- latitude: double (nullable = true)
|-- longitude: double (nullable = true)
|-- speed: double (nullable = true)
|-- geometry: geometry (nullable = false)
|-- wi_geometry_meter: geometry (nullable = true)

spatial_station_groups_gdf (contains POLYGONS)

root
|-- geo_name: string (nullable = true)
|-- polygon: geometry (nullable = false)

In the end, I want to check whether any points from spatial_wi_df are contained in polygons from spatial_station_groups_gdf:

spatial_wi_df.createOrReplaceTempView("points")
spatial_station_groups_gdf.createOrReplaceTempView("geofences")
spatial_join_result = spark_sedona.sql("SELECT g.geo_name, p.wagonnumber FROM points AS p, geofences AS g WHERE ST_Contains(g.polygon, p.geometry)")
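To make clear what the join predicate is expected to decide for each point/polygon pair, here is a minimal sketch of a point-in-polygon test in plain Python using ray casting. It is only an illustration of the idea, not Sedona's implementation (Sedona delegates geometry predicates to JTS). The function name `point_in_ring` and the sample square are hypothetical, and points lying exactly on the boundary are not handled the way ST_Contains handles them.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_ring(point: Point, ring: List[Point]) -> bool:
    """Ray casting: toggle 'inside' each time a horizontal ray
    going right from the point crosses an edge of the ring."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical geofence: a 4 x 4 square
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]

print(point_in_ring((2.0, 2.0), square))  # point inside the square
print(point_in_ring((5.0, 2.0), square))  # point outside the square
```

A test like this only gives a meaningful answer when the ring is a simple, non-self-intersecting polygon, which is why invalid polygons matter for the join.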

But I get an error message (originally attached as a screenshot).

I already tried three approaches to fix the polygons, but I get the same error message every time:

Buffer Option

spatial_station_groups_gdf.createOrReplaceTempView("spatial_station_gdf_buffer")
spatial_station_groups_gdf = spark_sedona.sql("SELECT *, ST_Buffer(spatial_station_gdf_buffer.polygon, 0) AS polygon_buffered FROM spatial_station_gdf_buffer")

MakeValid

spatial_station_groups_gdf.createOrReplaceTempView("spatial_station_gdf_valid")
spatial_station_groups_gdf = spark_sedona.sql("SELECT *, spatial_station_gdf_valid.polygon FROM spatial_station_gdf_valid LATERAL VIEW ST_MakeValid(polygon, false) spatial_station_gdf_valid AS polygon_valid")

Draw ConvexHull

spatial_station_groups_gdf.createOrReplaceTempView("spatial_station_gdf_hull")
spatial_station_groups_gdf = spark_sedona.sql("SELECT *, ST_ConvexHull(spatial_station_gdf_hull.polygon) AS polygon_hull FROM spatial_station_gdf_hull")
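For context on what "broken" can mean here, a common kind of invalid polygon is a ring whose edges cross each other (a "bowtie"). The following plain-Python sketch checks a ring for that one defect. It is an assumption-laden illustration, not what Sedona or JTS does internally: the helper names are hypothetical, rings are given without a repeated closing point, and other defects (too few points, touching or overlapping edges, bad holes) are not detected.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: > 0 if c is left of a->b, < 0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1: Point, p2: Point, p3: Point, p4: Point) -> bool:
    """True if segment p1-p2 and segment p3-p4 properly cross each other."""
    d1 = orient(p3, p4, p1)
    d2 = orient(p3, p4, p2)
    d3 = orient(p1, p2, p3)
    d4 = orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


def ring_self_intersects(ring: List[Point]) -> bool:
    """Check every pair of non-adjacent edges for a proper crossing."""
    n = len(ring)
    edges = [(ring[i], ring[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue  # first and last edge share a vertex
            if segments_cross(edges[i][0], edges[i][1], edges[j][0], edges[j][1]):
                return True
    return False


# Hypothetical rings: a simple square and a self-crossing "bowtie"
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
bowtie = [(0.0, 0.0), (4.0, 4.0), (4.0, 0.0), (0.0, 4.0)]

print(ring_self_intersects(square))
print(ring_self_intersects(bowtie))
```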

Does anyone have experience with, or a solution for, this broken-polygon issue?
