
I'm trying to parse a Shapefile using Sedona on EMR with PySpark, and it fails with the error shown below. A sample Polygon shapefile that I downloaded from the internet parses fine, but the Shapefile my organization uses does not.

I also tried parsing my organization's Shapefile on my local machine, in a fresh virtual environment created as described here - https://geopandas.org/en/stable/getting_started/install.html#creating-a-new-environment and that works fine. However, when I installed geopandas and shapely into my existing virtual environment, which already contains other packages, I got the same error as on EMR (see below).

I assumed some package was causing the issue, so I created an EMR cluster with the same packages and versions as the working virtual environment, but it still throws the same error. By the way, I'm creating the EMR cluster following the instructions given here - https://sedona.apache.org/latest-snapshot/setup/emr/

Please find attached the package lists for my working virtual environment, my non-working virtual environment, and the non-working EMR cluster.


GEOSException Traceback (most recent call last)

GEOSException: IllegalArgumentException: Points of LinearRing do not form a closed linestring

I'm trying to parse the Shapefile using Sedona, but it errors out. It works for a shapefile I got from the internet, but the Shapefile from our org doesn't work. Any help is appreciated.
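In case it helps with diagnosis, here is a minimal sketch of the condition the error message refers to: a linear ring is closed only when its first and last points are identical. It assumes each ring is available as a plain Python list of `(x, y)` tuples; the helper names `find_unclosed_rings` and `close_ring` are hypothetical and are not part of Sedona, GEOS, or Shapely. I have not confirmed that unclosed rings in the source data are the actual cause here.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]


def find_unclosed_rings(rings: List[Ring]) -> List[int]:
    """Return the indices of non-empty rings whose first and last points differ."""
    return [i for i, ring in enumerate(rings) if ring and ring[0] != ring[-1]]


def close_ring(ring: Ring) -> Ring:
    """Return a copy of the ring with the first point appended if it is not closed."""
    if ring and ring[0] != ring[-1]:
        return ring + [ring[0]]
    return list(ring)


# Hypothetical sample data: the first ring is closed, the second is not.
rings = [
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 0.0)],
    [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)],
]

unclosed = find_unclosed_rings(rings)
repaired = [close_ring(ring) for ring in rings]
```

Running a check like this over the rings of the failing Shapefile in both the working and non-working environments might show whether the data itself contains unclosed rings or whether the two environments read the same file differently.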
