
I have been developing with Kafka Streams for several years now. Recently, I got into a project that relies on Spark Structured Streaming.

Going through the documentation, to my surprise I could not find anything similar to what Kafka Streams offers when it comes to updating an aggregation on deletion. This is handled out of the box by KTable in Kafka Streams.

Hence I am wondering if I am missing something.

I saw this question, which is now two years old:

Kafka delete (tombstone) not updating max aggregate in Spark Structured Streaming

The answer there seems to suggest that simply filtering out messages with null values before the aggregation should do the trick.

However, I can't find anything in the official documentation to confirm that, nor was the answer accepted.

So my question is: how do you handle deletions in Spark Structured Streaming, for example when updating an aggregate?
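To make the desired behavior concrete, here is a plain-Python sketch (not Spark or Kafka Streams code; `apply_records` is a made-up helper) of the KTable-style semantics I am after, where a tombstone retracts a key's contribution from the aggregate:

```python
# Plain-Python sketch of KTable-style "update on delete" semantics.
# Records are (key, value) pairs, Kafka-style; value=None is a tombstone.

def apply_records(records):
    """Materialize the latest value per key, treating None as a deletion,
    then compute the aggregate (here: max) over the surviving values."""
    table = {}
    for key, value in records:
        if value is None:
            table.pop(key, None)   # tombstone: remove the key entirely
        else:
            table[key] = value     # upsert: keep only the latest value
    return max(table.values(), default=None)

# The max drops from 10 to 7 once key "a" is tombstoned -- something that
# "filter out nulls, then aggregate the append-only stream" cannot do,
# because an append-only max never decreases.
print(apply_records([("a", 10), ("b", 7)]))               # 10
print(apply_records([("a", 10), ("b", 7), ("a", None)]))  # 7
```

Merely filtering out the nulls before aggregating, as the linked answer suggests, would ignore the deletion rather than retract it.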
