I am using the JDBC connector to stream data from a MySQL database to a Kafka topic. That part works, and I can see the data in the Kafka topic using the Avro console consumer.

Now I want to read this data and perform a few simple filtering operations on it. I am planning to use either Spark or a Confluent consumer. The problem with Spark is that I am not able to read the data using Spark's JavaInputDStream. I need to read the data from Kafka and deserialize it from Avro to JSON in order to apply the filters. I have not been able to find Java examples to refer to. Can anyone point me to some documentation or sample source code?
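For reference, this is roughly the kind of pipeline I have in mind, as a minimal sketch using Structured Streaming (since that is what the Avro docs page below builds on). The bootstrap servers and topic name are placeholders, not my real configuration:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class KafkaAvroFilter {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("KafkaAvroFilter")
            .master("local[*]")
            .getOrCreate();

        // Read the raw records from the topic the JDBC connector writes to.
        // "localhost:9092" and "mysql-topic" are placeholders.
        Dataset<Row> df = spark
            .readStream()
            .format("kafka")
            .option("kafka.bootstrap.servers", "localhost:9092")
            .option("subscribe", "mysql-topic")
            .load();

        // At this point "value" is a binary column; it still has to be
        // deserialized from Avro before I can filter on individual fields.
        df.printSchema();
    }
}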
Edit: I looked into this: https://spark.apache.org/docs/latest/sql-data-sources-avro.html
I have included the spark-avro Maven dependency in my Java project:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-avro_2.12</artifactId>
    <version>2.4.3</version>
</dependency>
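In case it matters, reading from Kafka with Structured Streaming also needs the separate Kafka source artifact; I added the one that I believe matches my Spark version:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql-kafka-0-10_2.12</artifactId>
    <version>2.4.3</version>
</dependency>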
However, I could not find the to_avro and from_avro functions. I am following this example from that page:
Dataset<Row> output = df
    .select(from_avro(col("value"), jsonFormatSchema).as("user"))
    .where("user.favorite_color == \"red\"")
    .select(to_avro(col("user.name")).as("value"));
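As far as I can tell, the docs example presumes the following imports and loads the schema as a JSON string; the avro functions import is the part that does not resolve in my project with spark-avro 2.4.3, which I assume is where my problem lies:

import static org.apache.spark.sql.functions.col;
// These static from_avro/to_avro helpers are what the docs example uses;
// they do not resolve for me with spark-avro 2.4.3.
import static org.apache.spark.sql.avro.functions.from_avro;
import static org.apache.spark.sql.avro.functions.to_avro;

import java.nio.file.Files;
import java.nio.file.Paths;

// The docs example reads the Avro schema (user.avsc) as a JSON string:
String jsonFormatSchema = new String(
    Files.readAllBytes(Paths.get("./examples/src/main/resources/user.avsc")));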