
When a commit happens during a stream window, the window's data is pushed to the topic; once processing resumes, the same window that was interrupted by the commit is pushed to the topic again with the latest value. Why is the same window pushed twice with updated values instead of pushing the updated window value to the topic only once? Is there a way to do that?
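The behavior described above follows from Kafka Streams' continuous-refinement model: a windowed aggregation forwards an *updated* result for every input record (subject to caching and commits), so the same window key can appear downstream several times. A minimal plain-Java simulation of a tumbling-window count (no Kafka dependency; all names and the `TimeWindows.of(10000)`-style window size are illustrative) shows why:

```java
import java.util.*;

// Minimal simulation (plain Java, no Kafka dependency) of how a windowed
// count emits an *updated* result for every input record: the same window
// key appears downstream multiple times, each time with the latest count.
public class WindowedCountSim {
    static final long WINDOW_SIZE_MS = 10_000;  // like TimeWindows.of(10000)

    // Each emission is "key@windowStart=count", mimicking the changelog
    // stream that a commit (or cache flush) pushes to the output topic.
    public static List<String> process(List<long[]> records /* {timestamp, keyId} */) {
        Map<String, Long> counts = new HashMap<>();
        List<String> downstream = new ArrayList<>();
        for (long[] r : records) {
            long windowStart = (r[0] / WINDOW_SIZE_MS) * WINDOW_SIZE_MS;
            String windowedKey = "k" + r[1] + "@" + windowStart;
            long updated = counts.merge(windowedKey, 1L, Long::sum);
            // Each update is forwarded, so the same window is pushed more
            // than once, each time with a newer value.
            downstream.add(windowedKey + "=" + updated);
        }
        return downstream;
    }

    public static void main(String[] args) {
        List<long[]> input = Arrays.asList(
            new long[]{1000, 1}, new long[]{4000, 1}, new long[]{12000, 1});
        // Two records fall into window [0,10000), so that window is emitted
        // twice: once with count 1, then again with count 2.
        System.out.println(process(input));
    }
}
```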

Sai Chand
    Possible duplicate of [How to send final kafka-streams aggregation result of a time windowed KTable?](http://stackoverflow.com/questions/38935904/how-to-send-final-kafka-streams-aggregation-result-of-a-time-windowed-ktable) – Matthias J. Sax May 18 '17 at 17:56
  • This blog post, also explains it: https://www.confluent.io/blog/watermarks-tables-event-time-dataflow-model/ – Matthias J. Sax May 18 '17 at 17:57
  • The Kafka Streams commit is triggered at an interval, during which the window processing is interrupted and the window is sent to a topic. Once the commit completes, the window resumes, is processed with the latest data, and is sent to the topic again. `KStream<Windowed<String>, Long> countByGroup = newKeyStream.groupByKey().count(TimeWindows.of(10000), "countbygrouped").toStream(); --- commit is triggered during a window and interrupts it` – Sai Chand May 19 '17 at 05:51
  • The Kafka Streams commit pushes the interrupted window's data to the downstream processor or topic before its windowing period completes, then resumes the window after the commit and pushes the window with the latest values to the downstream processor or topic again. Instead of pushing the same window twice, is there a way to push the window only once? – Sai Chand May 19 '17 at 06:08
  • Possible duplicate of [Kafka Streams - Hopping windows - deduplicate keys](http://stackoverflow.com/questions/43771904/kafka-streams-hopping-windows-deduplicate-keys) – miguno May 19 '17 at 12:09
  • In addition to what Matthias J. Sax wrote above, you can also take a look at the question https://stackoverflow.com/questions/43771904/kafka-streams-hopping-windows-deduplicate-keys. – miguno May 19 '17 at 12:09
  • 1
    As explained on the other question: Streams computes the new (i.e., current) window result on each update record. There is no notion of a "final result" -- furthermore, any intermediate result might get pushed downstream even if no commit is happening. Streams follows a different philosophy that is closer to "pure stream processing". If you want to get only a single result downstream per window, you need to do this using custom code (i.e., use .transform() with a state) -- but it would be better to be able to handle "updates" downstream natively. – Matthias J. Sax May 19 '17 at 17:00
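  • The ".transform() with a state" idea can be sketched in plain Java (no Kafka dependency; class and method names are hypothetical): buffer the latest value per window in a state store, and emit a window's value exactly once, when stream time has passed the window end. This simplification tracks a single key and ignores late-arriving records:

    ```java
    import java.util.*;

    // Sketch of deduplicating window updates with state: keep only the
    // latest value per window, emit once the window has closed.
    public class WindowFinalizer {
        static final long WINDOW_SIZE_MS = 10_000;

        // Stand-in for a state store: windowStart -> latest count.
        private final Map<Long, Long> latestPerWindow = new TreeMap<>();
        private final List<String> emitted = new ArrayList<>();

        // Called per update record, like a Transformer's transform().
        public void onUpdate(long timestamp, long newCount) {
            long windowStart = (timestamp / WINDOW_SIZE_MS) * WINDOW_SIZE_MS;
            latestPerWindow.put(windowStart, newCount); // overwrite: keep latest only
            // Emit (and drop) every window that closed before current stream time.
            Iterator<Map.Entry<Long, Long>> it = latestPerWindow.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Long, Long> e = it.next();
                if (e.getKey() + WINDOW_SIZE_MS <= timestamp) {
                    emitted.add("window@" + e.getKey() + "=" + e.getValue());
                    it.remove();
                } else break; // TreeMap iterates in window-start order
            }
        }

        public List<String> emitted() { return emitted; }

        public static void main(String[] args) {
            WindowFinalizer f = new WindowFinalizer();
            f.onUpdate(1000, 1);   // window [0,10000): buffered, not emitted
            f.onUpdate(4000, 2);   // same window: newer value overwrites
            f.onUpdate(12000, 1);  // stream time passed 10000 -> emit window@0 once
            System.out.println(f.emitted());
        }
    }
    ```

    The trade-off is latency: downstream sees each window only once, but not until a later record advances stream time past the window end.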

0 Answers