In my Spark streaming application, I'm streaming data from Azure Event Hub and writing it to a couple of directories in HDFS blob storage, depending on the data. I basically followed this link: multiple writeStream with spark streaming
Below is the code:
def writeStreamer(input: DataFrame, checkPointFolder: String, output: String): StreamingQuery = {
  input
    .writeStream
    .format("com.databricks.spark.avro")
    .partitionBy("year", "month", "day")
    .option("checkpointLocation", checkPointFolder)
    .option("path", output)
    .outputMode(OutputMode.Append)
    .start()
}
writeStreamer(dtcFinalDF, "/qmctdl/DTC_CheckPoint", "/qmctdl/DTC_DATA")
val query1 = writeStreamer(canFinalDF, "/qmctdl/CAN_CheckPoint", "/qmctdl/CAN_DATA")
query1.awaitTermination()
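For reference, one variant I was wondering about is keeping a handle on the first query as well and waiting on all active streams instead of just query1, so that if the DTC query fails its exception is not silently swallowed. (This is just a sketch; query0 is a name I made up, and spark refers to my SparkSession.)

```scala
// Sketch (untested): capture BOTH StreamingQuery handles, not just the second one
val query0 = writeStreamer(dtcFinalDF, "/qmctdl/DTC_CheckPoint", "/qmctdl/DTC_DATA")
val query1 = writeStreamer(canFinalDF, "/qmctdl/CAN_CheckPoint", "/qmctdl/CAN_DATA")

// Block until ANY of the active streams terminates; if a query failed,
// this rethrows its exception instead of letting it pass unnoticed.
spark.streams.awaitAnyTermination()
```

Would this kind of setup make a difference, or is the problem elsewhere?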
What I currently observe is that data is written successfully to the "/qmctdl/CAN_DATA" directory, but no data is written to "/qmctdl/DTC_DATA". Am I doing anything wrong here? Any help would be much appreciated.