I'm trying the Kafka quickstart from the main documentation. I set up and tested the multi-broker cluster example per the instructions, and it works: for example, after bringing down one broker, the producer and consumer can still send and receive.

However, as per the example, we set up 3 brokers and bring down broker 2 (broker id = 1). If I then bring all brokers back up and instead bring down broker 1 (broker id = 0), the consumer just hangs. This only happens with broker 1 (id = 0); it does not happen with broker 2 or 3. I'm testing this on Windows 7.

Is there something special about broker 1? Looking at the configs, they are exactly the same across all 3 brokers except for the id, port number and log file location.

I thought it was just a problem with the provided console consumer, which doesn't take a broker list, so I wrote a simple Java consumer as per the documentation, using the default setup but specifying the list of brokers in the "bootstrap.servers" property. No dice: I still get the same problem.
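For reference, here is a minimal sketch of the kind of consumer configuration described above. It assumes the quickstart's three local ports (9092, 9093, 9094); the group id "ha-test-group" is a hypothetical name, and the class and method names are only for illustration. It builds the properties only. The object would then be handed to a KafkaConsumer from the kafka-clients library, which is left out so the snippet runs on the JDK alone.

```java
import java.util.Properties;

public class ConsumerConfigSketch {

    // Builds consumer settings that list all three quickstart brokers,
    // so the client has more than one address to bootstrap from.
    static Properties consumerProps() {
        Properties props = new Properties();
        // Assumption: the quickstart's default local ports for the 3 brokers.
        props.setProperty("bootstrap.servers",
                "localhost:9092,localhost:9093,localhost:9094");
        // Hypothetical consumer group name, for illustration only.
        props.setProperty("group.id", "ha-test-group");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps();
        // With kafka-clients on the classpath, these properties would be
        // passed to the KafkaConsumer constructor.
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```

Note that "bootstrap.servers" is only used for the initial connection, so listing every broker there does not by itself change where a topic's partitions and replicas live.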

The moment I start broker 1 (broker id = 0) again, the consumers resume working. This isn't highly available/fault-tolerant behavior for the consumer. Any help on how to set up an HA/fault-tolerant consumer?

Producers don't seem to have this issue.

arislan

1 Answer

If you follow the quick-start, the created topic should have only one partition with a single replica, which is hosted on the first broker by default, namely broker 1 (broker id = 0). That's why the consumer failed when you brought down this broker.

Try creating a topic with multiple replicas (by specifying --replication-factor when creating the topic) and rerun your test to see whether it gives you higher availability.
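The reasoning above can be sketched as a toy model in plain Java. This is not Kafka code and makes no claim about how the broker implements it: it ignores leader election and in-sync replica tracking, and the class and method names are hypothetical. It only illustrates why a partition whose sole replica sits on broker id 0 becomes unreachable when that broker goes down, while a partition replicated to ids 0, 1 and 2 does not.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReplicaAvailabilitySketch {

    // Simplified rule: a partition is reachable if at least one of its
    // replicas lives on a broker that is still up.
    static boolean isAvailable(List<Integer> replicaBrokerIds,
                               Set<Integer> downBrokerIds) {
        for (int brokerId : replicaBrokerIds) {
            if (!downBrokerIds.contains(brokerId)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Broker 1 in the question has broker id = 0.
        Set<Integer> down = new HashSet<>(Arrays.asList(0));

        List<Integer> singleReplica = Arrays.asList(0);       // replication factor 1
        List<Integer> threeReplicas = Arrays.asList(0, 1, 2); // replication factor 3

        System.out.println("replication factor 1: " + isAvailable(singleReplica, down));
        System.out.println("replication factor 3: " + isAvailable(threeReplicas, down));
    }
}
```

In this model the first case prints false and the second prints true, which is the difference the --replication-factor suggestion is meant to test.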

amethystic
  • I followed the quick-start multi-broker cluster section, which instructed me to create the topic with a replication factor of 3, so that isn't it. However, it didn't say anything about multiple partitions. Is the number of partitions important? – arislan Jan 20 '17 at 22:19
  • What do you mean by 'important'? – amethystic Jan 21 '17 at 01:54
  • Important as in, would it be the cause of my issue? The replication setting you suggested isn't the cause, as I've already set it per the quick-start documentation. – arislan Jan 22 '17 at 22:19
  • I'm still struggling with the same problem. Did you find any relevant information on it? To my understanding, all replicas have the same 'importance', i.e. killing any one of them should mean the same to the cluster. – nahuelarjonadev Jul 02 '19 at 04:39