I'm running a Kafka cluster on 3 EC2 instances. Each instance runs Kafka (0.11.0.1) and ZooKeeper (3.4). Each of my topics is configured with 20 partitions and a replication factor of 3.
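For reference, the topics would have been created with something along these lines (a sketch; the exact invocation may have differed):

bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" --create --topic prod-decline --partitions 20 --replication-factor 3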
Today I noticed that some partitions refuse to sync to all three nodes. Here's an example:
bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" --describe --topic prod-decline
Topic:prod-decline PartitionCount:20 ReplicationFactor:3 Configs:
Topic: prod-decline Partition: 0 Leader: 2 Replicas: 1,2,0 Isr: 2
Topic: prod-decline Partition: 1 Leader: 2 Replicas: 2,0,1 Isr: 2
Topic: prod-decline Partition: 2 Leader: 0 Replicas: 0,1,2 Isr: 2,0,1
Topic: prod-decline Partition: 3 Leader: 1 Replicas: 1,0,2 Isr: 2,0,1
Topic: prod-decline Partition: 4 Leader: 2 Replicas: 2,1,0 Isr: 2
Topic: prod-decline Partition: 5 Leader: 2 Replicas: 0,2,1 Isr: 2
Topic: prod-decline Partition: 6 Leader: 2 Replicas: 1,2,0 Isr: 2
Topic: prod-decline Partition: 7 Leader: 2 Replicas: 2,0,1 Isr: 2
Topic: prod-decline Partition: 8 Leader: 0 Replicas: 0,1,2 Isr: 2,0,1
Topic: prod-decline Partition: 9 Leader: 1 Replicas: 1,0,2 Isr: 2,0,1
Topic: prod-decline Partition: 10 Leader: 2 Replicas: 2,1,0 Isr: 2
Topic: prod-decline Partition: 11 Leader: 2 Replicas: 0,2,1 Isr: 2
Topic: prod-decline Partition: 12 Leader: 2 Replicas: 1,2,0 Isr: 2
Topic: prod-decline Partition: 13 Leader: 2 Replicas: 2,0,1 Isr: 2
Topic: prod-decline Partition: 14 Leader: 0 Replicas: 0,1,2 Isr: 2,0,1
Topic: prod-decline Partition: 15 Leader: 1 Replicas: 1,0,2 Isr: 2,0,1
Topic: prod-decline Partition: 16 Leader: 2 Replicas: 2,1,0 Isr: 2
Topic: prod-decline Partition: 17 Leader: 2 Replicas: 0,2,1 Isr: 2
Topic: prod-decline Partition: 18 Leader: 2 Replicas: 1,2,0 Isr: 2
Topic: prod-decline Partition: 19 Leader: 2 Replicas: 2,0,1 Isr: 2
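To narrow it down, I believe the same describe command can be filtered to just the problem partitions (assuming --under-replicated-partitions behaves as documented):

bin/kafka-topics.sh --zookeeper "10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181" --describe --under-replicated-partitions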
Only node 2 has all the data in sync. I've tried restarting brokers 0 and 1, but that didn't improve the situation; if anything, it made it worse. I'm tempted to restart node 2, but I assume that would lead to downtime or cluster failure, so I'd like to avoid it if possible.
I'm not seeing any obvious errors in the logs, so I'm having a hard time figuring out how to debug the situation. Any tips would be greatly appreciated.
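For what it's worth, the only thing I've thought to search for so far is ISR shrink/expand messages in the broker logs, roughly like this (the log path depends on your installation, and I may be grepping for the wrong strings):

grep -i "shrinking isr" /path/to/kafka-logs/server.log
grep -i "expanding isr" /path/to/kafka-logs/server.log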
Thanks!
EDIT: Some additional info ... If I check the metrics on node 2 (the one with full data), it does report that some partitions are not correctly replicated:
$>get -d kafka.server -b kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions *
#mbean = kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions:
Value = 930;
Nodes 0 and 1 don't. They seem to think everything is fine:
$>get -d kafka.server -b kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions *
#mbean = kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions:
Value = 0;
Is this expected behaviour?
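The next thing I was planning to check is the replica fetcher lag on brokers 0 and 1, with something like the following (assuming the standard ReplicaFetcherManager MaxLag bean is the right one to look at):

$>get -d kafka.server -b kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica *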