I have a Spark DataFrame with H3 hex IDs and I am trying to obtain the polygon geometries.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, expr
from pyspark.databricks.sql.functions import *

from mosaic import enable_mosaic
enable_mosaic(spark, dbutils)


# Create a Spark session
spark = SparkSession.builder.appName("Mosaic").getOrCreate()

# Create a DataFrame with hex IDs
sdf1 = spark.createDataFrame([
    (1, "87422c2a9ffffff"),
    (2, "87422c2a9000000"),
    (3, "87422c2a8ffffff")
], ("id", "h3hex_id"))



sdf2 = sdf1.withColumn("geometry", h3_boundaryaswkt(col("h3hex_id")))
sdf2.sample(fraction=0.1).show()

AnalysisException: [H3_NOT_ENABLED] h3_boundaryaswkt is disabled or unsupported. Consider enabling Photon or switch to a tier that supports H3 expressions; 


sdf2 = sdf1.withColumn("geometry", grid_boundary(col("h3hex_id"), format_name="WKT"))
sdf2.sample(fraction=0.1).show()


AnalysisException: [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `WKT` cannot be resolved. Did you mean one of the following? ..

I have installed databricks-mosaic 0.3.10 on the cluster.

How do I resolve the exceptions and apply the function to the Spark DataFrame?

https://databrickslabs.github.io/mosaic/api/spatial-indexing.html

https://docs.databricks.com/sql/language-manual/functions/h3_boundaryaswkt.html#examples

kms

1 Answer

The h3_ functions are reserved for Photon-enabled clusters. If you change your cluster configuration to enable Photon, the first error will disappear. Alternatively, Mosaic provides grid_ functions that use a slower Java implementation of H3, which you can use on either the ML Databricks Runtime or the Photon Databricks Runtime.

The second issue is that "WKT" is passed as a plain Python string, but format_name has to be a column. A bare string is interpreted as a column name, which is why Spark tries to resolve a column called WKT and fails with UNRESOLVED_COLUMN. If you replace "WKT" with F.lit("WKT"), your code will work:

from pyspark.sql import functions as F
sdf2 = sdf1.withColumn("geometry", grid_boundary(col("h3hex_id"), 
    format_name=F.lit("WKT")))
sdf2.sample(fraction=0.1).show()
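
To make the string-versus-literal distinction concrete, here is a minimal pure-Python sketch. It is a toy model only, not Mosaic's or PySpark's actual code: Lit, UnresolvedColumn, and resolve are hypothetical names made up for illustration. It just mimics the behaviour described above, where a bare string is looked up as a column name while a literal carries its own value.

```python
# Toy model (NOT Mosaic's or PySpark's real implementation) of how an API
# that accepts "a column or a column name" treats its arguments.


class Lit:
    """Hypothetical stand-in for a literal column such as F.lit("WKT")."""

    def __init__(self, value):
        self.value = value


class UnresolvedColumn(Exception):
    """Hypothetical stand-in for Spark's UNRESOLVED_COLUMN error."""


def resolve(arg, row):
    """Resolve an argument against one row (a dict of column name -> value)."""
    if isinstance(arg, Lit):
        # A literal carries its own value, so no column lookup happens.
        return arg.value
    if isinstance(arg, str):
        # A bare string is read as a column *name* and looked up in the row.
        if arg not in row:
            raise UnresolvedColumn(f"column {arg!r} cannot be resolved")
        return row[arg]
    raise TypeError(f"unsupported argument type: {type(arg).__name__}")


row = {"id": 1, "h3hex_id": "87422c2a9ffffff"}

print(resolve("h3hex_id", row))  # looked up as a column name
print(resolve(Lit("WKT"), row))  # used as a literal value

try:
    resolve("WKT", row)  # there is no column named WKT
except UnresolvedColumn as err:
    print("error:", err)
```

In the real code the same idea applies: col("h3hex_id") refers to an existing column, while F.lit("WKT") wraps the format name as a constant so nothing has to be resolved against the DataFrame.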
Alex Ott