I'm trying to set up ZooKeeper and Kafka as separate Kubernetes deployments/pods in a shared namespace. I've bootstrapped a local K8s 1.8 cluster with Calico via kubeadm on my Ubuntu sandbox...
For ZooKeeper, I'm using the zookeeper:3.4 image from hub.docker.com, and I created a Kubernetes deployment and service that expose ports 2181, 2888 and 3888. The service name is zookeeper, and I assume I should be able to reach it by this hostname from the pods in the namespace.
For Kafka 1.0, I've created my own container image that I can control with environment variables... I'm setting zookeeper.connect to zookeeper:2181. I assume the Kubernetes DNS will resolve this name so that the connection to the service can be opened.
Unfortunately I get:
[2018-01-03 15:48:26,292] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-01-03 15:48:32,293] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2018-01-03 15:48:46,286] INFO Opening socket connection to server zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,299] INFO Socket connection established to zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,319] INFO Session establishment complete on server zookeeper.sandbox.svc.cluster.local/10.107.41.148:2181, sessionid = 0x10000603c560001, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:48:46,331] INFO Session: 0x10000603c560001 closed (org.apache.zookeeper.ZooKeeper)
[2018-01-03 15:48:46,333] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server 'zookeeper:2181' with timeout of 6000 ms
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1233)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:157)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:131)
at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:115)
at kafka.utils.ZkUtils$.withMetrics(ZkUtils.scala:92)
at kafka.server.KafkaServer.initZk(KafkaServer.scala:346)
at kafka.server.KafkaServer.startup(KafkaServer.scala:194)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:92)
at kafka.Kafka.main(Kafka.scala)
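The timestamps in this log may already contain a clue, so here is a minimal sketch that computes how much time passed between the key lines. The timestamps are copied from the log above; the labels and the helper name `elapsed_since_start` are only illustrative.

```python
from datetime import datetime

# Timestamps copied from the log lines above; labels are informal summaries.
FMT = "%Y-%m-%d %H:%M:%S,%f"
events = [
    ("waiting for SyncConnected", "2018-01-03 15:48:26,292"),
    ("ZkClient event thread terminated", "2018-01-03 15:48:32,293"),
    ("socket connection opened", "2018-01-03 15:48:46,286"),
    ("session established", "2018-01-03 15:48:46,319"),
]

def elapsed_since_start(events):
    """Return (label, seconds since the first event) for each event."""
    start = datetime.strptime(events[0][1], FMT)
    return [
        (label, (datetime.strptime(ts, FMT) - start).total_seconds())
        for label, ts in events
    ]

for label, seconds in elapsed_since_start(events):
    print(f"{seconds:7.3f} s  {label}")
```

The event thread is terminated about 6.0 s after the wait begins, which matches the 6000 ms timeout in the exception, while the first socket connection is only opened about 20 s in. One possible reading (an assumption, not something the log proves) is that a step before the socket is opened, such as resolving the name zookeeper, takes far longer than the timeout allows.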
So I assumed I had a generic networking issue in my cluster, but then I noticed something even more confusing: if I set zookeeper.connect to 10.107.41.148:2181 (the current address of the zookeeper service), the connection works (at least from Kafka to ZooKeeper).
[2018-01-03 15:51:31,092] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-01-03 15:51:31,094] INFO Opening socket connection to server 10.107.41.148/10.107.41.148:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:51:31,105] INFO Socket connection established to 10.107.41.148/10.107.41.148:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-01-03 15:51:31,134] INFO Session establishment complete on server 10.107.41.148/10.107.41.148:2181, sessionid = 0x10000603c560005, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
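Since the IP address works and the service name does not, one way to narrow this down would be to time the DNS lookup and the TCP connect separately from inside the Kafka pod. The following is only a sketch: it assumes a Python interpreter is available in the container, and `probe` is a hypothetical helper name, not part of Kafka or Kubernetes.

```python
import socket
import time

def probe(host, port, timeout=6.0):
    """Time the DNS lookup and the TCP connect to host:port separately."""
    t0 = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Name resolution itself failed; no connect attempt is made.
        return {"resolved": None, "dns_seconds": time.monotonic() - t0,
                "connected": False, "error": str(exc)}
    dns_seconds = time.monotonic() - t0
    address = infos[0][4][0]  # first address returned by the resolver

    t1 = time.monotonic()
    try:
        with socket.create_connection((address, port), timeout=timeout):
            connected, error = True, None
    except OSError as exc:
        connected, error = False, str(exc)
    return {"resolved": address, "dns_seconds": dns_seconds,
            "connect_seconds": time.monotonic() - t1,
            "connected": connected, "error": error}

# Example calls, to be run inside the Kafka pod:
#   probe("zookeeper", 2181)
#   probe("zookeeper.sandbox.svc.cluster.local", 2181)
#   probe("10.107.41.148", 2181)
```

If `dns_seconds` for the service name turns out to be much larger than the 6 s session timeout while the connect itself is fast, that would point at name resolution in the cluster rather than at pod-to-service routing.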
With this setup I'm able to use the zookeeper service from the host of the Kubernetes cluster, for example with "bin/kafka-topics.sh --list --zookeeper 10.107.41.148:2181". Producing a message does not work though... I assume that once the network is working properly, I need to add the Kafka advertised address...
kafka-console-producer.sh --broker-list 10.100.117.196:9092 --topic test1
>test-msg1
>[2018-01-03 17:05:35,689] WARN [Producer clientId=console-producer] Connection to node 0 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
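The warning mentions node 0, which looks like a broker id taken from the cluster metadata rather than the bootstrap address, so the address the broker advertises is a plausible suspect. As a rough sketch of what the advertised address setting could look like, the snippet below renders the two broker properties involved. `listeners` and `advertised.listeners` are real Kafka broker settings; the helper name `listener_properties`, the choice of 10.100.117.196 as the advertised host (taken from the --broker-list argument above), and the way my image would map this into its configuration are assumptions.

```python
def listener_properties(advertised_host, port=9092, protocol="PLAINTEXT"):
    """Render the broker settings that decide which address clients are told to use."""
    return "\n".join([
        # Address the broker binds to inside the pod.
        f"listeners={protocol}://0.0.0.0:{port}",
        # Address handed back to clients in metadata; it must be reachable by them.
        f"advertised.listeners={protocol}://{advertised_host}:{port}",
    ])

# Assumption: clients reach the broker through 10.100.117.196.
print(listener_properties("10.100.117.196"))
```

Whatever address ends up in advertised.listeners has to be resolvable and reachable from wherever the producer runs, which here is the host of the cluster.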
Any hints as to what is wrong with my Kubernetes network setup, or at least where to start troubleshooting?
Thank you and best regards, Pavel