Questions tagged [apache-sedona]

22 questions
3
votes
0 answers

Apache Sedona - Query a Non-Rectangular Raster Mask Using Spark on Databricks

I'm referring to the manual on how to query spatial data, but in my scenario, there is a pixel-wise raster dataset of climate data over 8000+ days. In simple words, the query could look something like this: select temperature data for 5000 days for…
eyeballpaul
  • 163
  • 1
  • 10
1
vote
0 answers

Pyspark Sedona - How to retain all fields from input SparkDataFrames when performing spatial join with Sedona 1.4.0

I have recently switched from using Sedona 1.3.1 to 1.4.0. Previously, the output of the spatial join included all columns from both input SparkDataFrames. However, now the output only includes the geometry columns from each SparkDataFrame. I want…
Obi
  • 51
  • 4
1
vote
0 answers

What is the cause of this error? Code was working before and randomly started to error out

I am writing pyspark code to get geo census results and the code below was working as expected. However, now I get the following error: Py4JJavaError: An error occurred while calling o137.showString. : java.util.concurrent.ExecutionException: Boxed…
1
vote
1 answer

Remove rows with invalid polygon values in a PySpark data frame?

We are using a PySpark function on a data frame, and it throws an error. The error is most likely due to a faulty row in the data frame. The schema of the data frame looks like: root |-- geo_name: string (nullable = true) |-- geo_latitude: double (nullable…
1
vote
0 answers

Pyspark Sedona: Want to Spatial Join but got Error "Points of LinearRing do not form a closed line string"

My two dataframes look like this: spatial_wi_df (contains POINTS) root |-- timestamp: timestamp (nullable = true) |-- wagonnumber: long (nullable = true) |-- latitude: double (nullable = true) |-- longitude: double (nullable = true) |-- speed:…
pi_janes
  • 63
  • 5
1
vote
1 answer

AnalysisException: need struct type but got string

I have created a table in Databricks: create table TabA (latitude float, longitude float, col1 string,col2 string) utils.executequery( """ update TabA set col1 = ST_Envelope(col2)""" ) I tried converting this output to a string, but I am getting an error as…
Vidhya
  • 13
  • 1
  • 3
0
votes
1 answer

How to do a spatial query in Spark & Scala 3.3.1 and 2.12 with Apache Sedona to find the intersection in coordinates

I'm trying to do a spatial operation on data for which I have lat/long, plus one static GeoJSON file. Now I need to load the GeoJSON and, for each row of the DataFrame, find which location its lat/long belongs to using intersection. source Data…
0
votes
2 answers

How to find the closest geospatial line to a geospatial point

Context: I have 500,000+ road objects for all the roads in the state of Illinois that have a Geoshape property for a line. I additionally have a set of objects for points across the state. Need: I would like to add to the backing dataset of the points…
0
votes
0 answers

How to run function ST_GeomFromWKT within sedona context

I am trying to execute this example to create a spatial data frame from CSV using Sedona: SparkSession sparkSession = SedonaContext.builder() .master("local[*]") // Delete this if run in cluster mode .appName("readTestScala") //…
user1298426
  • 3,467
  • 15
  • 50
  • 96
0
votes
1 answer

Unable to read shapefile using sedona and pyspark, for file in hdfs

UPDATE: Passing the shapefile folder instead of the .shp file and using the jar file sedona-spark-shaded-3.0_2.12-1.4.0.jar seemed to do the trick. Thanks to Jia Yu - Apache Sedona! I have the naturalearth_lowres shapefile stored in my HDFS, and I…
0
votes
0 answers

Passing geojson via Spark datasource

I'm trying to write a dataframe to ArangoDB where one of the columns is a GeoJSON object. I have tried passing it as a string, but the double quotes are being escaped, so ArangoDB won't interpret it as GeoJSON type. If I make it a geometry column…
0
votes
0 answers

Generating 'geometry' column based on lat/long from geoparquet set, before running spark SQL

I would greatly appreciate any comments and help on a seemingly trivial issue I have. I am running a Spark SQL geospatial query from a PySpark client. One of the input datasets later used in SQL is a set of points in GeoParquet format and stored in…
View Delft
  • 31
  • 2
0
votes
0 answers

Parsing a Shapefile using Sedona/PySpark on EMR fails

I'm trying to parse a Shapefile using Sedona on EMR with PySpark, and it fails with an error (please see the attached error message). When I try to parse a sample Polygon shapefile which I got from the internet it…
0
votes
1 answer

apache-sedona error while trying to convert to pandas

I'm trying to execute the .toPandas() command on a pyspark.sql.dataframe.DataFrame, but it throws an error: pm_table_bej_test.toPandas() Py4JJavaError Traceback (most recent call…
Amri Rasyidi
  • 172
  • 1
  • 10
0
votes
1 answer

Shapely Buffering, not working as expected

Why does buffering one of my geometries have an unexpected hole in it? from shapely import LineString from geopandas import GeoDataFrame l = LineString([ (250,447), (319,446), (325,387), (290,374), (259,378), (254,385), (240,409), …
A. West
  • 571
  • 5
  • 12