
My app has N instances running. The number of instances is always greater than the number of Kafka partitions, e.g. 6 instances in one consumer group consuming from 4 Kafka partitions, so only 4 of the instances are actually consuming at any point.

In this context, can I suspend a Kafka consumer Camel route without causing Kafka to attempt to rebalance to other potential consumers? My understanding is that the suspended route would stop polling, causing the other instances to pick up the load.

burki
Darius X.

1 Answer


This is not a Camel but a Kafka question. Rebalancing is handled by Kafka and is triggered whenever a consumer explicitly leaves the consumer group or silently dies (stops sending heartbeats).

Kafka 2.3 introduced a new feature called "Static Membership" to avoid rebalancing just because of a consumer restart.
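As a minimal sketch of what enabling static membership could look like in plain consumer properties: `group.instance.id` and `session.timeout.ms` are real Kafka consumer settings, while the class name, the instance ID scheme and the timeout value are assumptions made up for illustration.

```java
import java.util.Properties;

class StaticMembershipConfig {
    // Builds plain Kafka consumer properties with static membership enabled.
    // instanceId is a hypothetical stable name per app instance (e.g. host or pod name).
    static Properties consumerProps(String groupId, String instanceId) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        // A non-empty group.instance.id makes the consumer a static member (Kafka 2.3+).
        props.setProperty("group.instance.id", instanceId);
        // A restarting static member can rejoin without a rebalance only if it
        // comes back within session.timeout.ms; 60000 is an assumed value.
        props.setProperty("session.timeout.ms", "60000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps("my-group", "consumer-1"));
    }
}
```

Each instance needs its own unique, stable ID, and the brokers must also run a version that supports static membership.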

But in your case (another consumer must take over the load of a leaving consumer), I think Kafka must trigger a rebalance across all consumers because of the protocol used.

See also this article for a fairly deep dive into rebalancing and its trade-offs between availability and fault tolerance.

Edit due to comments

If you want to avoid rebalancing, I think you would have to increase both session.timeout.ms (the heartbeat timeout) and max.poll.interval.ms (the maximum time allowed between two polls, i.e. the processing timeout).
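On the Camel side, both settings can be passed as options on the Kafka endpoint URI. This is only a sketch: `brokers`, `groupId`, `sessionTimeoutMs` and `maxPollIntervalMs` are camel-kafka endpoint options, but the helper class, topic, broker address and values are hypothetical.

```java
class KafkaEndpointUri {
    // Assembles a camel-kafka endpoint URI with both timeouts raised.
    // The concrete values passed in are assumptions, not recommendations.
    static String build(String topic, String brokers, String groupId,
                        long sessionTimeoutMs, long maxPollIntervalMs) {
        return "kafka:" + topic
                + "?brokers=" + brokers
                + "&groupId=" + groupId
                + "&sessionTimeoutMs=" + sessionTimeoutMs
                + "&maxPollIntervalMs=" + maxPollIntervalMs;
    }

    public static void main(String[] args) {
        // prints kafka:orders?brokers=localhost:9092&groupId=my-group&sessionTimeoutMs=60000&maxPollIntervalMs=900000
        System.out.println(build("orders", "localhost:9092", "my-group", 60_000, 900_000));
    }
}
```

The resulting string would go into the route's `from(...)`. Keep in mind that the broker limits the allowed session timeout range, so a very high value may be rejected.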

But even if you set them very high, I guess it would not work reliably, because the route suspension could still happen at a bad moment, for example just before a heartbeat is due.

See this q&a for the difference between session.timeout.ms and max.poll.interval.ms.
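The difference between the two timeouts can be modelled roughly as two independent checks. This is a simplified model and not Kafka code; it assumes a client that sends heartbeats independently of `poll()`, and all names and numbers are made up for illustration.

```java
class ConsumerLiveness {
    // Simplified model: a member stays in the group only while BOTH
    // the heartbeat check and the poll check pass.
    static boolean staysInGroup(long msSinceLastHeartbeat, long msSinceLastPoll,
                                long sessionTimeoutMs, long maxPollIntervalMs) {
        boolean heartbeatOk = msSinceLastHeartbeat <= sessionTimeoutMs;
        boolean pollOk = msSinceLastPoll <= maxPollIntervalMs;
        return heartbeatOk && pollOk;
    }

    public static void main(String[] args) {
        // Heartbeats still arrive, but there was no poll for 10 minutes.
        System.out.println(staysInGroup(3_000, 600_000, 10_000, 300_000)); // false: 5 min poll limit exceeded
        System.out.println(staysInGroup(3_000, 600_000, 10_000, 900_000)); // true: 15 min poll limit not reached
    }
}
```

In this model, a route that is suspended for longer than max.poll.interval.ms drops out of the group even though its heartbeats are fine.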

burki
  • Yes, maybe I should rephrase: if a consumer has no heartbeat, or if it does not poll, Kafka will try to rebalance. The implication seems to be that I cannot suspend my Camel route (in circuit-breaker fashion) unless I increase "maxPollIntervalMs" on the consumer to a high enough number. True? – Darius X. Mar 01 '20 at 21:36
  • And, @burki thanks for the links. I did not know about static membership. – Darius X. Mar 01 '20 at 21:37
  • I extended my answer a bit because there are two different "timeout" settings – burki Mar 02 '20 at 07:46