
When I write to a Kafka topic, I get an error: Offset commit failed:

2016-10-29 14:52:56.387 INFO [nioEventLoopGroup-3-1][org.apache.kafka.common.utils.AppInfoParser$AppInfo:82] - Kafka version : 0.9.0.1
2016-10-29 14:52:56.387 INFO [nioEventLoopGroup-3-1][org.apache.kafka.common.utils.AppInfoParser$AppInfo:83] - Kafka commitId : 23c69d62a0cabf06
2016-10-29 14:52:56.409 ERROR [nioEventLoopGroup-3-1][org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$DefaultOffsetCommitCallback:489] - Offset commit failed.
org.apache.kafka.common.errors.GroupCoordinatorNotAvailableException: The group coordinator is not available.
2016-10-29 14:52:56.519 WARN [kafka-producer-network-thread | producer-1][org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater:582] - Error while fetching metadata with correlation id 0 : {0085000=LEADER_NOT_AVAILABLE}
2016-10-29 14:52:56.612 WARN [pool-6-thread-1][org.apache.kafka.clients.NetworkClient$DefaultMetadataUpdater:582] - Error while fetching metadata with correlation id 1 : {0085000=LEADER_NOT_AVAILABLE}

Creating a new topic from the command line works fine:

./kafka-topics.sh --zookeeper localhost:2181 --create --topic test1 --partitions 1 --replication-factor 1 --config max.message.bytes=64000 --config flush.messages=1

This is the producer code, written in Java:

public void create() {
        Properties props = new Properties();
        props.clear();
        String producerServer = PropertyReadHelper.properties.getProperty("kafka.producer.bootstrap.servers");
        String zookeeperConnect = PropertyReadHelper.properties.getProperty("kafka.producer.zookeeper.connect");
        String metaBrokerList = PropertyReadHelper.properties.getProperty("kafka.metadata.broker.list");
        props.put("bootstrap.servers", producerServer);
        props.put("zookeeper.connect", zookeeperConnect);//声明ZooKeeper
        props.put("metadata.broker.list", metaBrokerList);//声明kafka broker
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("batch.size", 1000);
        props.put("linger.ms", 10000);
        props.put("buffer.memory", 10000);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producer = new KafkaProducer<String, String>(props);
    }

Where am I going wrong?

– Dolphin

6 Answers


I faced a similar issue. The problem is a property the Kafka broker is started with, KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR (the `offsets.topic.replication.factor` broker setting), which controls the replication factor of the internal offsets topic.

If you are working with a single-node cluster, make sure you set this property to 1; its default value is 3. This change resolved my problem. You can check the value in the broker's properties file.

Note: I was using the Confluent Kafka base image, version 4.0.0 (confluentinc/cp-kafka:4.0.0).
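
For example, with the Confluent Docker images the broker setting maps to an environment variable. A minimal docker-compose sketch (everything beyond the variable itself is illustrative):

environment:
  KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

On a plain (non-Docker) broker, the equivalent line in server.properties would be:

offsets.topic.replication.factor=1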

– Vishal Akkalkote

  • I had the same problem and this answer solved it for me. This answer should be marked as accepted. – fgakk Jan 10 '19 at 18:18
  • Both vishal-akkalkote and fgakk are correct! Setting offsets.topic.replication.factor=1 made the GroupCoordinator available and the consumer work as expected. – nitinr708 Mar 06 '19 at 13:30
  • Seems like a dangerous thing to do? Why would you not want to have replication on offsets? – Paul Praet Aug 24 '20 at 11:37
  • @PaulPraet prototyping, testing, demos etc. – boycy Nov 09 '20 at 12:31
  • @PaulPraet easy to try a few things locally with a single node – Vishal Akkalkote Nov 09 '20 at 14:26
  • Specifically, the `offsets.topic.replication.factor` and `transaction.state.log.replication.factor` broker properties may need to be adjusted to match the number of brokers. – Gray Jan 15 '21 at 21:22
  • I think the following also need to be tweaked: `default.replication.factor` and `min.insync.replicas` – Gray Jan 15 '21 at 22:38
  • See: https://stackoverflow.com/questions/65744538/problems-with-amazon-msk-default-configuration-and-publishing-with-transactions – Gray Jan 15 '21 at 23:13

Looking at your logs, the problem is that your cluster probably has no connection to the node that is the only known replica of the given topic in ZooKeeper.

You can check this using the following command:
kafka-topics.sh --describe --zookeeper localhost:2181 --topic test1

or using kafkacat:
kafkacat -L -b localhost:9092

Example result:

Metadata for all topics (from broker 1003: localhost:9092/1003):
 1 brokers:
  broker 1003 at localhost:9092
 1 topics:
  topic "topic1" with 1 partitions:
    partition 0, leader -1, replicas: 1001, isrs: , Broker: Leader not available

If you have a single-node cluster, then the broker id (1001) should match the leader of the topic1 partition.
But as you can see, the only known replica of topic1 was 1001, which is not available now, so there is no way to recreate the topic on a different node.

The source of the problem can be automatic generation of the broker id (if you haven't specified broker.id, or it is set to -1).
Then on restarting the broker (the same single broker) you probably receive a broker id different from the previous one and from the one recorded in ZooKeeper (this is the reason why deleting the topic can help - but it is not a production solution).
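
A dev-only sketch of that deletion workaround (assuming delete.topic.enable=true is set on the broker):

kafka-topics.sh --zookeeper localhost:2181 --delete --topic topic1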

The solution may be to set the broker.id value in the node config to a fixed value - according to the documentation, this should be done in production environments:
broker.id=1

If everything is all right, you should see something like this:

Metadata for all topics (from broker 1: localhost:9092/1):
 1 brokers:
  broker 1 at localhost:9092
 1 topics:
  topic "topic1" with 1 partitions:
    partition 0, leader 1, replicas: 1, isrs: 1

Kafka Documentation: https://kafka.apache.org/documentation/#prodconfig

– Rafał Solarski

You have to keep the replication factor used by your code consistent with the number of Kafka replicas (brokers).

For me, I run 3 brokers and set the replication factor to 3.
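
For example, on a three-broker cluster (the topic name is illustrative):

./kafka-topics.sh --zookeeper localhost:2181 --create --topic mytopic --partitions 3 --replication-factor 3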

– Amit Ahuja

The solution for me was to make sure KAFKA_ADVERTISED_HOST_NAME was set to the correct IP address of the server.
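
For example, in a docker-compose file (a sketch; 192.0.2.10 stands in for your server's real IP):

environment:
  KAFKA_ADVERTISED_HOST_NAME: 192.0.2.10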

– Mauvis Ledford

We had the same issue: replicas and replication factor were both 3, and the partition count was 1. I increased the partition count to 10 and it started working.
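
For reference, the partition count of an existing topic can be increased (but never decreased) like this; the topic name is illustrative:

./kafka-topics.sh --zookeeper localhost:2181 --alter --topic mytopic --partitions 10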


We faced the same issue in production too. The code had been working fine for a long time when we suddenly got this exception.

We analyzed it and found no issue in the code, so we asked the deployment team to restart ZooKeeper. Restarting it solved the issue.
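
A sketch of that restart, using the scripts shipped with the Kafka distribution (the config path is illustrative):

./zookeeper-server-stop.sh
./zookeeper-server-start.sh ../config/zookeeper.properties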

– Kannan Msk