
Say I need to do word-count-like processing, but over 5-minute intervals. I am using tumbling windows, but the output also contains the intermediate changelog counts. I want to see only the final count for each window in the output. Is there a way to achieve this?
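To make the requirement concrete, here is a minimal plain-Java sketch of the behaviour I am after. It does not use the Kafka Streams API at all; the class `FinalWindowCounts` and its methods are hypothetical names made up for illustration. It assumes non-negative record timestamps that arrive roughly in order, and it simply drops records that belong to an already closed window.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: counts words per tumbling window and
// emits results once per window, when that window has closed.
class FinalWindowCounts {
    private final long windowSizeMs;
    // Counts for the single window that is currently open (sorted by word).
    private final TreeMap<String, Long> counts = new TreeMap<>();
    private long currentWindowStart = Long.MIN_VALUE;

    FinalWindowCounts(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    // Adds one word. Returns the final counts of the previous window if this
    // record is the first one of a newer window, otherwise an empty list.
    List<String> add(String word, long timestampMs) {
        long windowStart = timestampMs - (timestampMs % windowSizeMs);
        List<String> finals = new ArrayList<>();
        if (windowStart < currentWindowStart) {
            return finals; // late record: dropped in this sketch (assumption)
        }
        if (windowStart > currentWindowStart) {
            finals = flush(); // the previous window is now closed
            currentWindowStart = windowStart;
        }
        counts.merge(word, 1L, Long::sum);
        return finals;
    }

    // Emits "windowStart:word=count" for the open window and clears it.
    List<String> flush() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            out.add(currentWindowStart + ":" + e.getKey() + "=" + e.getValue());
        }
        counts.clear();
        return out;
    }
}
```

With a 5-minute window (`new FinalWindowCounts(300_000L)`), no call to `add` returns anything until a record with a timestamp in the next window arrives; only then is one line per word returned for the closed window. That single emission per window is what I would like to see in the output topic instead of every intermediate update.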

Michal Borowiecki
Raghvendra Singh
  • Possible duplicate of [How to send final kafka-streams aggregation result of a time windowed KTable?](https://stackoverflow.com/questions/38935904/how-to-send-final-kafka-streams-aggregation-result-of-a-time-windowed-ktable) – Michal Borowiecki Jul 05 '17 at 09:25
  • There is also an interesting discussion on the Kafka users list: https://mail-archives.apache.org/mod_mbox/kafka-users/201706.mbox/browser – ppatierno Jul 05 '17 at 10:09
  • @ppatierno Yes, that looks interesting. Which thread exactly should I look for in the discussions? I think a future release should offer an option to output only the final counts, since the changelog is already available in the internal replicated topics, which are also stored in Kafka. – Raghvendra Singh Jul 05 '17 at 14:07
  • This is a direct link to the relevant discussion: http://search-hadoop.com/m/Kafka/uyzND1hbBpB1FPBRb2?subj=Re+Kafka+Streams+vs+Spark+Streaming+reduce+by+window – Michal Borowiecki Jul 07 '17 at 21:51

0 Answers