1

Let's say I have an application which consumes logs from kafka cluster. I want the application to periodically check for the availability of the cluster and based on that perform certain actions. I thought of a few approaches but was not sure which one is better or what is the best way to do this:

  1. Create a MessageProducer and MessageConsumer. The producer publishes heartbeatTopic to the cluster and the consumer looks for it. The issue that I think for this is, where the application is concerned with only consuming, healthcheck has both producing and consuming part.
  2. Create a MessageConsumer with a new groupId which continuously pools for new messages. This way the monitoring/healthcheck is doing the same thing which the application is supposed to do, which I think is good.
  3. Create a MessageConsumer which does something different from actually consuming the messages. Something like listTopics (https://stackoverflow.com/a/47477448/2094963) .

Which of these methods is more preferable and why?

SHB
  • 589
  • 1
  • 6
  • 18

1 Answers1

1

Going down a slightly different route here, you could potentially poll zookeeper (znode path - /brokers/ids) for this information by using the Apache Curator library.

Here's an idea that I tried and worked - I used the Curator's Leader Latch recipe for a similar requirement.

You could create an instance of LeaderLatch and invoke the getLeader() method. If at every invocation, you get a leader then it is safe to assume that the cluster is up and running otherwise there is something wrong with it.

I hope this helps.

EDIT: Adding the zookeeper node path where the leader information is stored.

Lalit
  • 1,944
  • 12
  • 20