I have a problem running Kafka on a cluster. I will explain it step by step. When I run Kafka commands on the cluster, which I access through CSSH from my computer, I get this error:

Error while executing topic command : Replication factor: 2 larger than available brokers: 1.
[2019-01-06 15:12:36,587] ERROR org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 2 larger than available brokers: 1. (kafka.admin.TopicCommand$)

I use CSSH on my computer to access the cluster. After starting ZooKeeper and the Kafka server on the cluster, I get the error when I run the command to create a topic. On node1, server.properties contains these settings:

broker.id=1
port=9092
listeners=PLAINTEXT://150.20.11.137:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=150.20.11.134:2186,150.20.11.137:2186,150.20.11.157:2186
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0

In zookeeper.properties on each node I have these settings:

 dataDir=/tmp/zookeeper
 clientPort=2186
 maxClientCnxns=0

I run the following commands on each cluster node to start ZooKeeper and Kafka:

./bin/zookeeper-server-start.sh ./config/zookeeper.properties
./bin/kafka-server-start.sh ./config/server.properties

After that, I try to create a topic in the cluster with this command, and I get the error above on every node:

./bin/kafka-topics.sh --create --zookeeper localhost:2186 --replication-factor 2 --partitions 3 --topic testFlink

Could you please tell me what exactly the problem is, and what is wrong with my cluster settings?

Thanks in advance.

M_Gh
  • Are there any errors in the Kafka broker log during startup? – Bartosz Wardziński Jan 06 '19 at 08:51
  • Dear @wardziniak thank you for your feedback. If you mean the output of the command "./bin/kafka-server-start.sh ./config/server.properties", there is no error in it. – M_Gh Jan 06 '19 at 12:03
  • I think it is related to the cluster properties. Could you please tell me what the right cluster setup is? Thanks. – M_Gh Jan 06 '19 at 13:40

2 Answers

I doubt that you've formed a cluster successfully. When booting the cluster, make sure you start all three ZooKeeper nodes first, and then the three brokers. You can refer to this post to check whether Kafka has formed a cluster.
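
One way to check this is to list the broker ids registered in ZooKeeper with the zookeeper-shell.sh script that ships with Kafka. The following is only a sketch: count_broker_ids is a hypothetical helper, and it assumes the last line of the shell's output is a list such as [1, 2, 3]. With three brokers registered the count should be 3; a count of 1 would match the error in the question.

```shell
# count_broker_ids LIST: count the ids in a list like "[1, 2, 3]".
# Hypothetical helper; the list format is an assumption about
# what `ls /brokers/ids` prints in zookeeper-shell.sh.
count_broker_ids() {
  printf '%s\n' "$1" | sed 's/[][ ]//g' | tr ',' '\n' | grep -c .
}

# Against a live cluster (address taken from the question):
# ids="$(./bin/zookeeper-shell.sh 150.20.11.137:2186 ls /brokers/ids 2>/dev/null | tail -n 1)"
# count_broker_ids "$ids"
```

Running the same check against each ZooKeeper node is useful here: if the nodes report different broker lists, they are probably running standalone rather than as one ensemble.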

Update: I had overlooked the ZooKeeper properties you're using, which are missing the key-value pairs required to form a cluster. The following properties should be a good starting point. Since you have 3 ZooKeeper nodes, the zookeeper.properties file (or zoo.cfg, if it's a standalone ZooKeeper installation) on each node should look something like the following.

zk-1 properties

tickTime=2000
initLimit=10
syncLimit=5
clientPort=2186
dataDir=/opt/zookeeper/data
server.1=0.0.0.0:2888:3888
server.2=<zk_2-ip>:2888:3888
server.3=<zk_3-ip>:2888:3888

zk-2 properties

tickTime=2000
initLimit=10
syncLimit=5
clientPort=2186
dataDir=/opt/zookeeper/data
server.1=<zk_1-ip>:2888:3888
server.2=0.0.0.0:2888:3888
server.3=<zk_3-ip>:2888:3888

zk-3 properties

tickTime=2000
initLimit=10
syncLimit=5
clientPort=2186
dataDir=/opt/zookeeper/data
server.1=<zk_1-ip>:2888:3888
server.2=<zk_2-ip>:2888:3888
server.3=0.0.0.0:2888:3888
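
The three files above differ only in which server.N line uses 0.0.0.0, so the lines can be generated per node. A minimal sketch, assuming a hypothetical helper named zk_server_lines and the 2888:3888 ports from the example; the mapping of ids to IP addresses in the usage comment is also an assumption:

```shell
# zk_server_lines MY_ID IP...: print one server.N line per IP,
# using 0.0.0.0 for the entry whose index equals MY_ID.
# Hypothetical helper; ports 2888:3888 are the ones used in the example above.
zk_server_lines() {
  my_id="$1"; shift
  n=1
  for ip in "$@"; do
    if [ "$n" -eq "$my_id" ]; then
      echo "server.$n=0.0.0.0:2888:3888"
    else
      echo "server.$n=$ip:2888:3888"
    fi
    n=$((n + 1))
  done
}

# Example for zk-2, assuming the node order from zookeeper.connect in the question:
# zk_server_lines 2 150.20.11.134 150.20.11.137 150.20.11.157
```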

Before you start the ZooKeeper processes, you need to do one more thing. Check the dataDir property you're using; in this example it's /opt/zookeeper/data. In that directory on each ZooKeeper node, create a file called myid containing the node's id: 1 for zk-1, 2 for zk-2, and 3 for zk-3. Then start the ZooKeeper nodes, and they should form a cluster. For zk-1 you can use a shell command like echo "1" > /opt/zookeeper/data/myid; the others are similar.
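
The myid step can be wrapped in a small function so the same script works on every node. This is a sketch only; write_myid is a hypothetical name, and the directory in the usage comment is the dataDir from the example above.

```shell
# write_myid ID DATA_DIR: create DATA_DIR if needed and write ID to DATA_DIR/myid.
# Hypothetical helper; DATA_DIR must match dataDir in that node's properties file.
write_myid() {
  mkdir -p "$2"
  printf '%s\n' "$1" > "$2/myid"
}

# On zk-1:
# write_myid 1 /opt/zookeeper/data
# On zk-2 and zk-3, pass 2 and 3 instead.
```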

Bitswazsky
  • Thank you for your answer. In fact, I started the three ZooKeeper nodes first and then the three brokers without any problems, but I have a problem creating a topic. I can create a topic on one broker. Could you please suggest the best source for setting up a cluster on physically separate nodes? Thanks. – M_Gh Jan 06 '19 at 14:33
  • @M_Gh You can check out the Confluent Ansible playbooks, and my fork that shows how to do it in VMs using Vagrant. In a few commands, you can get a cluster running locally. https://github.com/cricket007/cp-ansible/tree/addVagrant/vagrant – OneCricketeer Jan 07 '19 at 03:48
  • Dear @cicket_007 thanks, but I have to set up my cluster on physically separate nodes. – M_Gh Jan 07 '19 at 04:46
  • Dear @Bitswazsky thank you very much for your complete answer. I will set up my cluster again and post the result here. Also, I think I have a problem in the "server.properties" setup for the brokers. – M_Gh Jan 07 '19 at 04:51
  • Dear @Bitswazsky I set up "zookeeper.properties" according to your guidance and the topic was created without any errors. But I cannot run the producer and consumer commands. `./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic testFlink --from-beginning` I got this error: **[2019-01-07 09:17:16,861] WARN [Consumer clientId=consumer-1, groupId=console-consumer-36276] Connection to node -1 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)**. – M_Gh Jan 07 '19 at 05:48
  • Your broker is exposed through the IP, so localhost is not supposed to work. If you are submitting from broker-1, try this command: `./bin/kafka-console-consumer.sh --bootstrap-server 150.20.11.137:9092 --topic testFlink --from-beginning` – Bitswazsky Jan 07 '19 at 05:50
  • Dear @Bitswazsky thanks a lot. It solved the issue. Could you please tell me: if I want a special topic for each node, do I have to use the IP address of that node when creating the topic on it? – M_Gh Jan 07 '19 at 06:37
  • In a cluster, there's no such thing as a `special topic for each node`. If a topic has multiple partitions, they are distributed across the brokers. A topic is not tied to any particular broker. – Bitswazsky Jan 07 '19 at 06:41
  • Dear @Bitswazsky thank you for your thorough guidance. So I do not need to use the IP address of a node when creating a topic on that node. Thanks a lot. – M_Gh Jan 07 '19 at 07:13
  • When you create a topic, the command takes the ZooKeeper host:port as an argument. ZooKeeper nodes talk to each other internally in the cluster, so when querying topic metadata you don't need to specify the same ZooKeeper host:port that you used to create the topic. You can refer to the Kafka documentation: https://kafka.apache.org/documentation/ – Bitswazsky Jan 07 '19 at 07:38
The most likely reason for this error is that your brokers are not all connected to the same ZooKeeper cluster.
How are you running your Kafka broker? Are you running it as a service or as a daemon process? Can you share the logs?

You can check whether a broker is running with this command:

nc -vz 150.20.11.137 9092

This fails if the broker is not running or the port is unreachable.
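
To run the same check against every node, the nc call can be put in a loop. A minimal sketch, assuming nc supports the -z and -w options; probe and report_endpoints are hypothetical names, and the usage comments assume the brokers run on the same three hosts as the ZooKeeper nodes listed in the question.

```shell
# probe HOST PORT: succeed if a TCP connection to HOST:PORT can be opened.
# Assumes an nc build that supports -z (scan only) and -w (timeout in seconds).
probe() {
  nc -z -w 2 "$1" "$2" >/dev/null 2>&1
}

# report_endpoints PORT HOST...: print "host:port up" or "host:port down" per host.
report_endpoints() {
  port="$1"; shift
  for host in "$@"; do
    if probe "$host" "$port"; then
      echo "$host:$port up"
    else
      echo "$host:$port down"
    fi
  done
}

# Brokers (assumed to run on the same three hosts as ZooKeeper):
# report_endpoints 9092 150.20.11.134 150.20.11.137 150.20.11.157
# ZooKeeper client port from the question:
# report_endpoints 2186 150.20.11.134 150.20.11.137 150.20.11.157
```

An open port only shows that something is listening; it does not show that the broker has registered with ZooKeeper.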

I would suggest using a ZooKeeper web UI such as Zookeeper Navigator. Running it in Docker is the easiest way. Here's the docker compose file for it. It will help you see whether there's any problem with ZooKeeper and whether all brokers are actually connected to it.

I would also suggest using Yahoo Kafka Manager. It's a web UI for your cluster and a convenient way to manage it. Here's Docker for it.

And yes, your ZooKeeper properties need to include the server IPs of all the other ZooKeeper nodes, as explained by Bitswazsky.

Ankur rana
  • Dear @Ankur rana thank you for your attention. I run my Kafka broker as a service now. The first time I ran it as a daemon process, and I get this error either way. – M_Gh Jan 07 '19 at 04:56
  • If you are still facing the problem, can you share the Kafka logs? – Ankur rana Jan 07 '19 at 05:32
  • Dear @Ankur rana I set up the ZooKeeper cluster without any problem, but I cannot run the consumer and producer, which I think is related to the Kafka cluster. – M_Gh Jan 07 '19 at 05:50
  • How do you know that your cluster is actually set up? – Ankur rana Jan 07 '19 at 05:54
  • The ZooKeeper server runs without any problem, and so does the Kafka server. But I do not know whether there is any problem on the Kafka or ZooKeeper side. Do I have to set up **producer.properties** and **consumer.properties** too? – M_Gh Jan 07 '19 at 06:29
  • Yes, of course; you'll have to set at least the bootstrap.servers property. How are you producing messages to the cluster? – Ankur rana Jan 07 '19 at 06:40
  • I have to receive my data from a socket, so I send the data from a file to the socket via `cat file.csv | nc -lk myport`. Now I have to distribute this data evenly across the Kafka cluster. Therefore I want to use **Apache Nifi**; however, I have never worked with **Nifi** and I have no idea what problems I will face. – M_Gh Jan 07 '19 at 06:47
  • Dear @Ankur rana, I posted the code that I use now at this link. Please take a look: [https://stackoverflow.com/q/54006023/6640504] – M_Gh Jan 07 '19 at 07:10
  • This seems correct. Have you tried producing data using kafka-console-producer.sh? – Ankur rana Jan 07 '19 at 07:20
  • If you mean running the producer with **kafka-console-producer.sh** and then writing the data to it line by line, I did that. But I have a stream of data, so this approach does not help me. – M_Gh Jan 07 '19 at 07:53
  • I don't understand. Are you still facing the original problem? – Ankur rana Jan 07 '19 at 10:02
  • No, I set up ZooKeeper again according to @Bitswazsky's explanation and the Kafka cluster works just fine. – M_Gh Jan 07 '19 at 11:02