I am running a simple Kafka Streams program in Eclipse. It runs successfully, but the windowing does not behave the way I expect.

I want to collect all the messages received within a 5-second window and send the result to the output topic once per window. From what I have read, I need a tumbling window for this. However, the output is sent to the output topic immediately instead.

What am I doing wrong here? Below is the main method that I am running:

    public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wordcount");
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
    props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

    final StreamsBuilder builder = new StreamsBuilder();



    KStream<String, String> source = builder.stream("wc-input");

    @SuppressWarnings("deprecation")
    KTable<Windowed<String>, Long> counts = source
            .flatMapValues(new ValueMapper<String, Iterable<String>>() {
                @Override
                public Iterable<String> apply(String value) {
                    return Arrays.asList(value.toLowerCase(Locale.getDefault()).split(" "));
                }
            })
            .groupBy(new KeyValueMapper<String, String, String>() {
                @Override
                public String apply(String key, String value) {
                    return value;
                }
            })
            .count(TimeWindows.of(10000L)
                    .until(10000L),"Counts");

    // need to override value serde to Long type
    counts.to("wc-output");

    final Topology topology = builder.build();
    final KafkaStreams streams = new KafkaStreams(topology, props);
    final CountDownLatch latch = new CountDownLatch(1);

    // attach shutdown handler to catch control-c
    Runtime.getRuntime().addShutdownHook(new Thread("streams-wordcount-shutdown-hook") {
        @Override
        public void run() {
            streams.close();
            latch.countDown();
        }
    });

    try {

        streams.start();
        long windowSizeMs = TimeUnit.MINUTES.toMillis(50000); // 5 * 60 * 1000L
        TimeWindows.of(windowSizeMs);
        TimeWindows.of(windowSizeMs).advanceBy(windowSizeMs);

        latch.await();
    } catch (Throwable e) {
        System.exit(1);
    }
    System.exit(0);
    }
Bartosz Wardziński
dijeah
Check this: https://stackoverflow.com/questions/38935904/how-to-send-final-kafka-streams-aggregation-result-of-a-time-windowed-ktable/38945277#38945277 – Bartosz Wardziński, Mar 11 '19 at 08:38

1 Answer

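Windowing does not mean "one output" per window. By default, a windowed aggregation emits an updated result whenever a new record changes the count for a window, and with caching disabled (CACHE_MAX_BYTES_BUFFERING_CONFIG set to 0, as in your code) every update is forwarded downstream right away. If you want only one output per window, use suppress() on the result KTable.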

Compare this article: https://www.confluent.io/blog/watermarks-tables-event-time-dataflow-model/
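As a rough illustration, the word count from the question could be rewritten with suppress() along the following lines. This is a minimal sketch, not a drop-in fix. It assumes Kafka Streams 2.1 through 3.x, where suppress(), TimeWindows.of(Duration) and grace(Duration) are available (newer releases offer TimeWindows.ofSizeWithNoGrace(Duration) instead), and it assumes the kafka-streams library is on the classpath with a broker at localhost:9092. The class name WindowedWordCount and the choice to drop the window from the output key are illustrative assumptions.

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.Locale;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Suppressed;
import org.apache.kafka.streams.kstream.TimeWindows;

// Hypothetical class name; topic names are taken from the question.
public class WindowedWordCount {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wordcount");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("wc-input");

        source
            // Split each line into lower-case words.
            .flatMapValues(value -> Arrays.asList(value.toLowerCase(Locale.getDefault()).split(" ")))
            // Re-key the stream by word so that counts are per word.
            .groupBy((key, word) -> word)
            // 5-second tumbling windows; zero grace means a window closes
            // as soon as stream time passes its end.
            .windowedBy(TimeWindows.of(Duration.ofSeconds(5)).grace(Duration.ZERO))
            .count()
            // Hold back intermediate updates and emit one final result per window.
            .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
            .toStream()
            // Assumption: the output only needs the word, not the window bounds.
            .map((windowedKey, count) -> KeyValue.pair(windowedKey.key(), count))
            // The counts are Long values, so override the default String value serde.
            .to("wc-output", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        streams.start();
    }
}
```

Note that suppress() is driven by stream time, which only advances when new records arrive. The final result for a window is therefore emitted once a later record with a timestamp past the window end has been processed, not on a wall-clock timer.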

Matthias J. Sax