If I create a simple topology with a source and a processor, I see double the expected number of `StreamThread`s in the console.

For example, if I set the thread count to 1 and have one partition, I see 2 stream threads. If I set it to 20 and have 20 partitions, I see 40 stream threads.

Based on Kafka Streams thread number, I was expecting half of these numbers.

Am I configuring something wrong, or is this expected?

EDIT: After `stream = new KafkaStreams(topology, streamsConfig);` is called, I only see it create 20 threads.

After `stream.start()` is called, I see those 20 threads transition from CREATED to RUNNING.

It is only later in the initialization that the other 20 threads get created. It looks like `StreamsBuilderFactoryBean#start()` is then called with a topology that contains nothing. So I either need to stop that call from happening or remove my own creation process. I am not sure which is preferred.

Chris
  • "in the consoles" -- what do you mean by this? – Matthias J. Sax Jan 02 '19 at 23:07
  • Debug console terminal in intellij I see statements such as: INFO 6845 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [rater-broker-fcfa7c57-7e0c-45c2-a09c-7ab35913c346-StreamThread-1] Starting – Chris Jan 03 '19 at 03:59
  • You should only see `StreamThread-1` if you have configured one thread in the `StreamsConfig`. Not sure. Can you double-check the config? It's logged, too. Also, you can verify the number of running threads via `KafkaStreams#localThreadMetadata()`. Last, each thread corresponds to a member of the underlying consumer group, so you can check the number of members in the group, too. – Matthias J. Sax Jan 03 '19 at 11:53
  • When I see streamthread-40 I confirmed stream configs contains "num.stream.threads" -> "20" and localThreadMetadata() returns 20. I am unsure what you mean by your last point of the underlying consumer group. – Chris Jan 03 '19 at 17:38
  • A Kafka Streams application forms a consumer group using the application.id as group.id. Each thread creates one consumer that is part of this group. Thus, the consumer group size should tell you how many threads are really running. You can inspect consumer groups via the command line tool `bin/kafka-consumer-groups.sh`. I assume you see all numbers from 1 to 40 (it still puzzles me how this can happen). Because `localThreadMetadata` returns 20, it seems that only 20 threads are running, though... Btw: what version do you use? – Matthias J. Sax Jan 03 '19 at 20:55
  • It is weird. We are using 1.1.1. I even see things like stream-thread [xxx-StreamThread-40] State transition from PARTITIONS_ASSIGNED to RUNNING. However the last block of log statements on startup are threads 1-20 transition to running and I only see those threads reporting metrics. Not sure what to do or if this is really a problem? – Chris Jan 04 '19 at 15:57
  • @MatthiasJ.Sax added some more details to the question. – Chris Jan 04 '19 at 17:13
  • You should answer your own question instead of putting the solution into the question. Glad you figured it out. – Matthias J. Sax Jan 05 '19 at 15:12

1 Answer

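It turns out that the `@EnableKafkaStreams` annotation starts Kafka Streams for you (this is the `StreamsBuilderFactoryBean#start()` call mentioned in the question's edit). Because I was also creating and starting my own `KafkaStreams` instance, two instances were running, which doubled the thread count. Since I was not defining a topology bean, the instance started by Spring had an empty topology, so there was no collision.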
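
One way to check from the logs whether two `KafkaStreams` instances are running is to group the stream-thread names by their client-id prefix. The sketch below is only an illustration. It assumes the thread names follow the `<client-id>-StreamThread-<n>` form seen in the log lines quoted in the comments, and that no explicit `client.id` is configured, so each instance gets its own generated suffix. The class name, method name, and sample log lines are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StreamThreadGrouping {

    // Matches "stream-thread [<client-id>-StreamThread-<n>]" (assumed log format).
    private static final Pattern THREAD_NAME =
            Pattern.compile("stream-thread \\[([^\\]]+)-StreamThread-(\\d+)\\]");

    /** Groups the thread numbers found in the log lines by client-id prefix. */
    public static Map<String, TreeSet<Integer>> groupByClientId(List<String> logLines) {
        Map<String, TreeSet<Integer>> threadsByClient = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = THREAD_NAME.matcher(line);
            if (m.find()) {
                threadsByClient
                        .computeIfAbsent(m.group(1), k -> new TreeSet<>())
                        .add(Integer.parseInt(m.group(2)));
            }
        }
        return threadsByClient;
    }

    public static void main(String[] args) {
        // Hypothetical log lines; both client ids are made up for illustration.
        List<String> lines = Arrays.asList(
                "stream-thread [rater-broker-aaaa-StreamThread-1] Starting",
                "stream-thread [rater-broker-aaaa-StreamThread-2] Starting",
                "stream-thread [rater-broker-bbbb-StreamThread-3] Starting",
                "stream-thread [rater-broker-bbbb-StreamThread-4] Starting",
                "some unrelated log line");

        // More than one key suggests more than one KafkaStreams instance.
        groupByClientId(lines).forEach(
                (client, threads) -> System.out.println(client + " -> " + threads));
    }
}
```

If the resulting map has more than one key, the threads belong to more than one instance, which would be consistent with Spring starting a second, empty-topology instance next to the manually created one.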

Chris