36

I am using dockerized Kafka and have written a Kafka consumer program. It works perfectly when I run Kafka in Docker and the application on my local machine. But when I run the application in Docker as well, I run into issues. The cause may be that the topic has not been created by the time the application starts.

docker-compose.yml

version: '3'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ADVERTISED_HOST_NAME: localhost
      KAFKA_CREATE_TOPICS: "test:1:1"
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
  parse-engine:
    build: .
    depends_on:
      - "kafka"
    command: python parse-engine.py
    ports:
     - "5000:5000"

parse-engine.py

from kafka import KafkaConsumer
import json

try:
    print('Welcome to parse engine')
    consumer = KafkaConsumer('test', bootstrap_servers='localhost:9092')
    for message in consumer:
        print(message)
except Exception as e:
    print(e)
    # Logs the error appropriately. 
    pass

Error log

kafka_1         | [2018-09-21 06:27:17,400] INFO [SocketServer brokerId=1001] Started processors for 1 acceptors (kafka.network.SocketServer)
kafka_1         | [2018-09-21 06:27:17,404] INFO Kafka version : 2.0.0 (org.apache.kafka.common.utils.AppInfoParser)
kafka_1         | [2018-09-21 06:27:17,404] INFO Kafka commitId : 3402a8361b734732 (org.apache.kafka.common.utils.AppInfoParser)
kafka_1         | [2018-09-21 06:27:17,431] INFO [KafkaServer id=1001] started (kafka.server.KafkaServer)
parse-engine_1  | Welcome to parse engine
parse-engine_1  | NoBrokersAvailable
parseengine_parse-engine_1 exited with code 0
kafka_1         | creating topics: test:1:1

I already added the depends_on property in docker-compose, but the application tries to connect before the topic has been created, so the error occurs.

I read that it is possible to add a script to the docker-compose file, but I am looking for an easier way.

Thanks for help

OneCricketeer
Pankaj Saboo

4 Answers

97

Your problem is the networking. In your Kafka config you're setting

KAFKA_ADVERTISED_HOST_NAME: localhost

but this means that any client (including your python app) will connect to the broker, and then be told by the broker to use localhost for any connections. Since localhost from your client machine (e.g. your python container) is not where the broker is, requests will fail.
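One way to see this from inside the client container is a plain TCP reachability check. The following is a minimal sketch using only the Python standard library; the helper name `can_connect` is an illustration and not part of kafka-python. It only shows whether something is listening at the bootstrap address, not what address the broker will advertise afterwards.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and opens the socket;
        # DNS failures, refusals and timeouts are all subclasses of OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run inside the parse-engine container, `can_connect('localhost', 9092)` would be expected to return False, because localhost there is the Python container itself, while `can_connect('kafka', 9092)` should return True once the broker is up, assuming the Compose service is named kafka as in the question.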

You can read more about fixing problems with your Kafka listeners in detail here

So to fix your issue, you can do one of two things:

  1. Simply change your compose to use the internal hostname for Kafka (KAFKA_ADVERTISED_HOST_NAME: kafka). This means any clients within the docker network will be able to access it fine, but no external clients will be able to (e.g. from your host machine):

     version: '3'
     services:
       zookeeper:
         image: wurstmeister/zookeeper
         ports:
           - "2181:2181"
       kafka:
         image: wurstmeister/kafka
         ports:
           - "9092:9092"
         environment:
           KAFKA_ADVERTISED_HOST_NAME: kafka
           KAFKA_CREATE_TOPICS: "test:1:1"
           KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
         volumes:
           - /var/run/docker.sock:/var/run/docker.sock
       parse-engine:
         build: .
         depends_on:
           - "kafka"
         command: python parse-engine.py
         ports:
           - "5000:5000"
    

    Your clients would then access the broker at kafka:9092, so your python app would change to

     consumer = KafkaConsumer('test', bootstrap_servers='kafka:9092')
    
  2. Add a new listener to Kafka. This enables it to be accessed both internally and externally to the docker network. Port 29092 would be for access external to the docker network (e.g. from your host), and 9092 for internal access.

    You would still need to change your python program to access Kafka at the correct address. In this case since it's internal to the Docker network, you'd use:

     consumer = KafkaConsumer('test', bootstrap_servers='kafka:9092')
    

    Since I'm not familiar with the wurstmeister images, this docker-compose is based on the Confluent images which I do know:

    (editor has mangled my yaml, you can find it here)

     ---
     version: '2'
     services:
       zookeeper:
         image: confluentinc/cp-zookeeper:latest
         environment:
           ZOOKEEPER_CLIENT_PORT: 2181
           ZOOKEEPER_TICK_TIME: 2000
    
       kafka:
         # "`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-
         # An important note about accessing Kafka from clients on other machines: 
         # -----------------------------------------------------------------------
         #
         # The config used here exposes port 29092 for _external_ connections to the broker
         # i.e. those from _outside_ the docker network. This could be from the host machine
         # running docker, or maybe further afield if you've got a more complicated setup. 
         # If the latter is true, you will need to change the value 'localhost' in 
         # KAFKA_ADVERTISED_LISTENERS to one that is resolvable to the docker host from those 
         # remote clients
         #
         # For connections _internal_ to the docker network, such as from other services
         # and components, use kafka:9092.
         #
         # See https://rmoff.net/2018/08/02/kafka-listeners-explained/ for details
         # "`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-'"`-._,-
         #
         image: confluentinc/cp-kafka:latest
         depends_on:
           - zookeeper
         ports:
           - 29092:29092
         environment:
           KAFKA_BROKER_ID: 1
           KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
           KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
           KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
           KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
           KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
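
With two listeners, the address a client should use depends on where it runs: kafka:9092 inside the docker network, localhost:29092 from the host. One way to keep a single code base for both cases is to read the address from the environment. This is only a sketch: the variable name `KAFKA_BOOTSTRAP_SERVERS` and the helper `bootstrap_servers` are assumptions made for illustration, not something Kafka or kafka-python defines.

```python
import os
from typing import List

# Hypothetical variable name chosen for this example; set it yourself,
# e.g. under the client service's `environment:` key in the compose file.
ENV_VAR = "KAFKA_BOOTSTRAP_SERVERS"


def bootstrap_servers(default: str = "localhost:29092") -> List[str]:
    """Return the broker list from the environment, or the host-side default."""
    raw = os.environ.get(ENV_VAR, default)
    # Accept a comma-separated list and ignore stray whitespace or empty items.
    return [addr.strip() for addr in raw.split(",") if addr.strip()]
```

The container would then be started with `KAFKA_BOOTSTRAP_SERVERS=kafka:9092`, and the consumer created as `KafkaConsumer('test', bootstrap_servers=bootstrap_servers())`. When the variable is unset, as on the host, the function falls back to localhost:29092.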
    

Disclaimer: I work for Confluent

Robin Moffatt
  • getting error after trying first solution "KafkaUnavailableError: All servers failed to process request: [('kafka', 9092, )]" in application – Pankaj Saboo Sep 25 '18 at 06:07
  • The yaml provided in the link : https://gist.github.com/rmoff/fb7c39cc189fc6082a5fbd390ec92b3d has a typo and did not work for me – Ojasvi Monga Aug 31 '19 at 16:02
  • @joe what's the typo? If you let me know I will update the gist – Robin Moffatt Sep 02 '19 at 08:48
  • In https://gist.github.com/rmoff/fb7c39cc189fc6082a5fbd390ec92b3d#file-docker-compose-yml-L36 , 9092 and 29092 should be inverted: KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092 – Ojasvi Monga Sep 02 '19 at 08:53
  • No, that's not correct. `kafka:29092` is how you reference it for hosts on the Docker network. `localhost:9092` is for clients connecting on the Docker host machine, which is why `9092` is exposed in the `ports` statement. Maybe start a new question with details of the problem that you're having using it? – Robin Moffatt Sep 02 '19 at 11:42
  • But in your answer, you've written: "Port 29092 would be for access external to the docker network" – Ojasvi Monga Sep 02 '19 at 12:04
  • Anyway, I should've mentioned that it worked after I switched 9092 and 29092 – Ojasvi Monga Sep 02 '19 at 12:05
  • Ah right, I see the confusion now. My gist has yaml different from the answer. I'll fix this now. Thanks for pointing it out. – Robin Moffatt Sep 03 '19 at 08:09
  • Wouldn't it be better to use the Linux hostname (i.e. the value of `$(hostname)` in bash) of the docker host instead of `localhost`, to also enable clients from machines other than the docker host? (as indicated in your referenced blog post) – Holger Brandl Oct 01 '19 at 13:15
  • @RobinMoffatt If I am accessing the kafka server from a different network, would I have to change "KAFKA_ADVERTISED_HOST_NAME: PLAINTEXT://ipaddress:9092, BROKER://127.0.0.1:9092" ? That way the external app would be redirected to the server ip address? – Danielle Paquette-Harvey Mar 08 '21 at 19:39
  • And why is Kafka such a pain when creating deployments and services in k8s, even when I'm following the same structure? – Игор Ташевски Apr 22 '21 at 00:11
  • If you're on a mac and running into `Connection to node -1 (localhost/127.0.0.1:29092) could not be established. Broker may not be available.`, replace `localhost` with `host.docker.internal`. Now `KAFKA_ADVERTISED_LISTENERS` should look like `KAFKA_ADVERTISED_LISTENERS:PLAINTEXT://kafka:9092,PLAINTEXT_HOST://host.docker.internal:29092` Note that if you're connecting to kafka from another container, you'll need to replace `localhost` in the client's bootstrap servers with `host.docker.internal` as well. – Samuel Ridicur Nov 22 '22 at 00:43
  • "My Python/Java/Spring/Go/Whatever Client Won’t Connect to My Apache Kafka Cluster in Docker/AWS/My Brother’s Laptop. Please Help!" The following article is much better than the other one. THANK YOU! – Mark Jan 16 '23 at 23:31
  • I may be mistaken, but I believe there has been some update to the Kafka + Docker configuration. It could be that the Bitnami packaged Kafka environment variables are different. The relevant variables, using Bitnami, are `KAFKA_ADVERTISED_LISTENERS` and `KAFKA_LISTENERS`. (Actually, for **Bitnami** only, a recent update changed these to be `KAFKA_CFG_LISTENERS` and `KAFKA_CFG_ADVERTISED_LISTENERS`.) Please leave a comment if you know differently. However, for anyone trying to use `KAFKA_HOST_NAMES` and finding this doesn't work, try this instead. – FreelanceConsultant May 02 '23 at 11:45
6

This line

KAFKA_ADVERTISED_HOST_NAME: localhost

says the broker is advertising itself as being available only on localhost, which means every Kafka client is told to connect to localhost rather than receiving the broker's real address. This would be fine if your clients are only located on your host: requests always go to localhost, which is forwarded to the container.

But apps in other containers need to point at the Kafka container, so it should say KAFKA_ADVERTISED_HOST_NAME: kafka, where kafka is the name of the Docker Compose service. Clients in other containers would then try to connect to that one.


That being said, with this line

consumer = KafkaConsumer('test', bootstrap_servers='localhost:9092')

you are pointing the Python container at itself, not the kafka container.

It should say kafka:9092 instead
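
Separately from the address fix, depends_on in the question's compose file only controls the order in which containers are started; it does not wait for the broker to accept connections. A retry loop in the client is one simple way to cope with that. The sketch below is an illustration, not the only approach: `connect_with_retry` and `make_consumer` are hypothetical helper names, and the attempt count and delay are arbitrary values.

```python
import time


def connect_with_retry(factory, retry_on=(Exception,), attempts=10, delay=3.0):
    """Call factory() until it succeeds or the attempts are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return factory()
        except retry_on as exc:
            last_error = exc
            print(f"attempt {attempt}/{attempts} failed: {exc!r}")
            if attempt < attempts:
                time.sleep(delay)  # Give the broker time to come up.
    raise last_error


def make_consumer():
    # Imported lazily so the retry helper itself does not require kafka-python.
    from kafka import KafkaConsumer
    return KafkaConsumer('test', bootstrap_servers='kafka:9092')
```

In parse-engine.py this could be used as `consumer = connect_with_retry(make_consumer, retry_on=(NoBrokersAvailable,))`, with `NoBrokersAvailable` imported from `kafka.errors`, so that only the start-up error seen in the log is retried and other exceptions still surface immediately.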

OneCricketeer
  • If I am accessing the kafka server from a different network, would I have to change "KAFKA_ADVERTISED_HOST_NAME: PLAINTEXT://ipaddress:9092, BROKER://127.0.0.1:9092" ? That way the external app would be redirected to the server ip address? – Danielle Paquette-Harvey Mar 08 '21 at 19:40
  • @Danielle That's correct, except not using the hostname config, but advertised listeners if you're advertising multiple addresses – OneCricketeer Mar 08 '21 at 20:36
  • @OneCricketeer, what do you mean with "...not using the hostname config, but advertised listeners..."? Do you mean we should use IP address instead of hostname? – Iwan Satria Jun 24 '21 at 14:44
  • Ah, okay... I think you meant that Danielle should've used 'KAFKA_ADVERTISED_LISTENERS' instead of 'KAFKA_ADVERTISED_HOST_NAME' here since the values he's using include not just the hostnames, but also the port numbers. – Iwan Satria Jun 24 '21 at 14:56
  • @IwanSatria That is correct - IMO, always use advertised listeners. Worth mentioning that the advertised host name (and port) properties themselves are deprecated in favor of the `advertised.listeners` broker config – OneCricketeer Jun 24 '21 at 17:06
1

In my case I wanted to access the Kafka container from an external Python client running locally (as a producer). Here is the combination of containers and Python code that worked for me (platform: macOS, Docker version 2.4.0):

zookeeper container:

docker run -d \
-p 2181:2181 \
--name=zookeeper \
-e ZOOKEEPER_CLIENT_PORT=2181 \
confluentinc/cp-zookeeper:5.2.3

kafka container:

docker run -d \
-p 29092:29092 \
-p 9092:9092 \
--name=kafka \
-e KAFKA_ZOOKEEPER_CONNECT=host.docker.internal:2181 \
-e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=BROKER:PLAINTEXT,PLAINTEXT:PLAINTEXT \
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:29092,BROKER://localhost:9092 \
-e KAFKA_INTER_BROKER_LISTENER_NAME=BROKER \
-e KAFKA_BROKER_ID=1 \
-e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
-e KAFKA_CREATE_TOPICS="test:1:1" \
confluentinc/cp-enterprise-kafka:5.2.3

python client:

from kafka import KafkaProducer
import json

# Connect through port 29092, which the kafka container publishes to the host.
producer = KafkaProducer(
    bootstrap_servers=['localhost:29092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8'),
    security_protocol='PLAINTEXT',
)
acc_ini = 523416
print("Sending message")
producer.send('test', {'model_id': '1', 'acc': str(acc_ini), 'content': 'test'})
producer.flush()  # Block until buffered messages have been sent.
Aroca
  • `host.docker.internal` is incorrect as you should be using a `--network` and connecting to `zookeeper:2181` (which is what using a Compose file would do). Also, Confluent images don't use `KAFKA_CREATE_TOPICS` env-var, and your internal listener should really be `PLAINTEXT` to be accurately configured inside the docker network – OneCricketeer Jan 21 '21 at 05:08
  • Plus, this doesn't answer the actual question about the Python client in the Docker container – OneCricketeer Jan 21 '21 at 05:10
  • Thanks for your comments @OneCricketeer. Sure your suggestions also work but mine worked too, at least for my purpose, which was connecting an external python consumer with a Kafka container. I explained that the first thing before posting the code... Since I ended at this thread looking for solutions to my problem (external python consumer), I thought that there might be other people with my same problem ending also here. If you find any other thread where this comment is more appropriate, let me know. I did not find it. – Aroca Jan 22 '21 at 07:28
  • I frequently close posts for this one https://stackoverflow.com/questions/51630260/connect-to-kafka-running-in-docker Client language doesn't matter. There's many blogs on it, too. https://www.confluent.io/blog/kafka-client-cannot-connect-to-broker-on-aws-on-docker-etc/ – OneCricketeer Jan 22 '21 at 14:37
0

In my local setup, I had the same issue: all the containers were working fine inside Docker, but the connectors were unable to connect to Kafka.

My Kafka configuration inside docker-compose.yml:

  broker:
    image: confluentinc/cp-server:7.0.1
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: INTERNAL://:29092,EXTERNAL://:9092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://:29092,EXTERNAL://:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL

Based on search results, I tried localhost:port, 127.0.0.1:port, and 0.0.0.0:port as the URL in my connector config, but nothing worked.

Eventually, I had to pass the actual IP address of my local system and it worked. I'm using Docker on Windows 10.

ActiveMQ connector config:

{
  "connector.class": "io.confluent.connect.activemq.ActiveMQSourceConnector",
  "activemq.url": "tcp://<my-system-ip>:61616", <-- my system's IP address
  "max.poll.duration": "60000",
  "tasks.max": "1",
  "batch.size": "1",
  "name": "activemq-jms-connector",
  "jms.destination.name": "jms-test",
  "kafka.topic": "topic-1",
  "activemq.password": "password",
  "jms.destination.type": "topic",
  "use.permissive.schema": "false",
  "activemq.username": "username"
}

I hope this will help anyone struggling with the Kafka-connect setup on Windows and Docker.

Sandeep Kumar