I would like to write a PySpark dataframe to Kafka as Avro, automatically registering the schema with the Azure Schema Registry in Event Hubs.
This should be possible according to the docs, e.g. https://learn.microsoft.com/en-us/azure/databricks/structured-streaming/avro-dataframe
However, the code below fails with:
TypeError: to_avro() takes from 1 to 2 positional arguments but 3 were given
So I am probably not importing the right to_avro? My environment is Azure Databricks, DBR 11.3 LTS.
from pyspark.sql.functions import col, lit
from pyspark.sql.avro.functions import from_avro, to_avro

schema_registry_addr = "https://someeventhub.servicebus.windows.net:8081"

df \
    .select(
        to_avro(col("key"), lit("t-key"), schema_registry_addr).alias("key"),
        to_avro(col("value"), lit("t-value"), schema_registry_addr).alias("value")) \
    .writeStream \
    .format("kafka") \
    .option("kafka.sasl.mechanism", "PLAIN") \
    .option("kafka.security.protocol", "SASL_SSL") \
    .option("kafka.sasl.jaas.config", EH_SASL) \
    .option("kafka.bootstrap.servers", BOOTSTRAP_SERVERS) \
    .option("topic", TOPIC) \
    .option("checkpointLocation", "/FileStore/checkpoint") \
    .start()
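For reference, the to_avro I am importing seems to be the open-source one from pyspark.sql.avro.functions, which (in Spark 3.3, as far as I can tell) only accepts a column plus an optional JSON-format Avro schema string; that would explain the "takes from 1 to 2 positional arguments" error. Below is a rough sketch of that two-argument form with hand-written placeholder schemas instead of registry registration (the record and field names are made up and would need to match my actual key/value struct columns):

from pyspark.sql.functions import col
from pyspark.sql.avro.functions import to_avro

# Hand-written Avro schemas; the field names are placeholders for whatever
# the key/value struct columns actually contain.
key_schema = '{"type": "record", "name": "Key", "fields": [{"name": "id", "type": "long"}]}'
value_schema = '{"type": "record", "name": "Value", "fields": [{"name": "body", "type": "string"}]}'

# Same streaming dataframe as above; only the serialization step changes,
# and nothing gets registered in the schema registry.
serialized = df.select(
    to_avro(col("key"), key_schema).alias("key"),
    to_avro(col("value"), value_schema).alias("value"))

That defeats the purpose though, since what I am after is the automatic registration with the Event Hubs schema registry. Is there a different import that exposes the schema-registry-aware to_avro from Python on DBR 11.3?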