50

I set up a single-node Kafka Docker container on my local machine as described in the Confluent documentation (steps 2-3).

In addition, I also exposed Zookeeper's port 2181 and Kafka's port 9092 so that I can connect to them from a client running on the local machine:

$ docker run -d \
    -p 2181:2181 \
    --net=confluent \
    --name=zookeeper \
    -e ZOOKEEPER_CLIENT_PORT=2181 \
    confluentinc/cp-zookeeper:4.1.0

$ docker run -d \
    --net=confluent \
    --name=kafka \
    -p 9092:9092 \
    -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
    -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092 \
    -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
    confluentinc/cp-kafka:4.1.0

Problem: When I try to connect to Kafka from the host machine, the connection fails because the client can't resolve the address kafka:9092.

Here is my Java code:

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("client.id", "KafkaExampleProducer");
props.put("key.serializer", LongSerializer.class.getName());
props.put("value.serializer", StringSerializer.class.getName());
KafkaProducer<Long, String> producer = new KafkaProducer<>(props);
ProducerRecord<Long, String> record = new ProducerRecord<>("foo", 1L, "Test 1");
producer.send(record).get();
producer.flush();

The exception:

java.io.IOException: Can't resolve address: kafka:9092
    at org.apache.kafka.common.network.Selector.doConnect(Selector.java:235) ~[kafka-clients-2.0.0.jar:na]
    at org.apache.kafka.common.network.Selector.connect(Selector.java:214) ~[kafka-clients-2.0.0.jar:na]
    at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:864) [kafka-clients-2.0.0.jar:na]
    at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:265) [kafka-clients-2.0.0.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.sendProducerData(Sender.java:266) [kafka-clients-2.0.0.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:238) [kafka-clients-2.0.0.jar:na]
    at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:176) [kafka-clients-2.0.0.jar:na]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_144]
Caused by: java.nio.channels.UnresolvedAddressException: null
    at sun.nio.ch.Net.checkAddress(Net.java:101) ~[na:1.8.0_144]
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622) ~[na:1.8.0_144]
    at org.apache.kafka.common.network.Selector.doConnect(Selector.java:233) ~[kafka-clients-2.0.0.jar:na]
    ... 7 common frames omitted

Question: How do I connect to Kafka running in Docker? My code runs on the host machine, not in Docker.

Note: I know that I could theoretically play around with DNS settings and /etc/hosts, but that is a workaround - it shouldn't have to be like that.

There is also a similar question here; however, it is based on the ches/kafka image. I use the confluentinc-based image, which is not the same.

– Sasha Shpota
  • Pretty sure this only works between Docker containers that are all set up with networking like this. You're essentially creating a separate network (confluent) here, and the two containers (zookeeper and kafka) can talk to each other, but you cannot access it from outside directly with localhost. I think it works if you use /etc/hosts, but I'm not sure. It wouldn't be a workaround, though, because the containers are not running on localhost; they're running on the confluent network. Does it work if you specify the IP address instead of localhost? – Marius Waldal Aug 01 '18 at 12:00

6 Answers

82

tl;dr - A simple port forward from the container to the host will not work... Hosts files (e.g. /etc/hosts on *NIX systems) should not be modified to work around Kafka networking, as this solution is not portable.

1) What exact IP/hostname + port do you want to connect to? Make sure that value is set as advertised.listeners (not advertised.host.name and advertised.port, as these are deprecated) on the broker.

2) Make sure that the server(s) listed in bootstrap.servers are actually resolvable. E.g. ping the IP/hostname, use netcat to check ports... If your clients are in a container, you need to do this from inside the container, not only from your host.

3) To verify the ports are mapped correctly on the host, ensure that docker ps shows the kafka container mapped from 0.0.0.0:<host_port> -> <advertised_listener_port>/tcp. The ports must match if you are trying to run a client from outside the Docker network (a hedged sketch of these checks follows this list).
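For example, a few quick checks (a hedged sketch; the kafka container name and port 29092 are assumptions taken from the setup described further down):

$ docker ps --format '{{.Names}}\t{{.Ports}}'   # expect something like 0.0.0.0:29092->29092/tcp for the kafka container
$ nc -vz localhost 29092                        # from the host: is the forwarded port reachable?
$ docker exec kafka nc -vz kafka 9092           # from inside the container (only works if nc is installed in the image)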


The below answer uses confluentinc Docker images to address the question that was asked, not wurstmeister/kafka. More specifically, the latter images are not well maintained despite being one of the most popular Kafka Docker images. If you have the KAFKA_ADVERTISED_HOST_NAME variable set, remove it (it's a deprecated property).

The following sections try to aggregate all the details needed to use another image. For other commonly used Kafka images, it's all the same Apache Kafka running in a container; what differs is how each image is configured, and which variables control that.

wurstmeister/kafka

Refer to their README section on listener configuration; also read their Connectivity wiki.

bitnami/kafka

If you want a small container, try these. The images are much smaller than the Confluent ones and much better maintained than wurstmeister's. Refer to their README for listener configuration.

debezium/kafka

Docs on it are mentioned here.

Note: advertised host and port settings are deprecated. Advertised listeners cover both. Similar to the Confluent containers, Debezium can use KAFKA_-prefixed broker settings to update its properties.

Others

  • ubuntu/kafka requires you to add --override advertised.listeners=kafka:9092 via Docker image args... I find that less portable than environment variables, so it's not recommended.
  • spotify/kafka is deprecated and outdated.
  • fast-data-dev or lensesio/box are great for an all-in-one solution, with Schema Registry, Kafka Connect, etc., but are bloated if you only want Kafka. Plus, it's a Docker anti-pattern to run many services in one container.
  • Your own Dockerfile - Why? Is something incomplete with these others? Start with a pull request rather than starting from scratch.

For supplemental reading, a fully-functional docker-compose, and network diagrams, see this blog by @rmoff

Answer

The Confluent quickstart (Docker) document assumes all produce and consume requests will be within the Docker network.

You could fix the problem of connecting to kafka:9092 by running your Kafka client code within its own container on the Docker network, since that name is resolvable there; otherwise you'll need to add some more environment variables to expose the container externally, while still having it work within the Docker network.
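For example, a throwaway client container on the same Docker network can reach the broker by its container name. This is a hedged sketch reusing the confluent network and kafka container names from the question, and it assumes the foo topic exists or topic auto-creation is enabled:

$ docker run --rm -it --net=confluent confluentinc/cp-kafka:4.1.0 \
    kafka-console-producer --broker-list kafka:9092 --topic foo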

First, add a protocol mapping of PLAINTEXT_HOST:PLAINTEXT that maps the new listener name to the PLAINTEXT Kafka protocol:

Key: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
Value: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT

Then set up two advertised listeners on different ports (kafka here refers to the Docker container name; it might also be named broker, so double-check your service and hostnames):

Key: KAFKA_ADVERTISED_LISTENERS
Value: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092

Notice that the listener names here match the left-side values of the protocol mapping setting above.

When running the container, add -p 29092:29092 for the host port mapping of the advertised PLAINTEXT_HOST listener.


So... (with the above settings)

If something still doesn't work, KAFKA_LISTENERS can be set to include <PROTOCOL>://0.0.0.0:<PORT>, where both values match the advertised setting and the Docker-forwarded port.
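Putting the pieces together, a docker run along these lines should work. This is a hedged sketch, not the exact quickstart command; it reuses the confluent network and zookeeper/kafka container names from the question:

$ docker run -d \
    --net=confluent \
    --name=kafka \
    -p 29092:29092 \
    -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
    -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT \
    -e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9092,PLAINTEXT_HOST://0.0.0.0:29092 \
    -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092 \
    -e KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT \
    -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
    confluentinc/cp-kafka:4.1.0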

Client on same machine, not in a container

Advertising localhost and the associated port will let you connect outside of the container, as you'd expect.

In other words, when running any Kafka Client outside the Docker network (including CLI tools you might have installed locally), use localhost:29092 for bootstrap servers and localhost:2181 for Zookeeper (requires Docker port forwarding)
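For example, with the Kafka CLI tools installed on the host (a hedged sketch; the topic name is illustrative, and newer CLI versions accept --bootstrap-server for the producer as well):

$ kafka-console-producer --broker-list localhost:29092 --topic foo
$ kafka-console-consumer --bootstrap-server localhost:29092 --topic foo --from-beginning

In the question's Java producer, this simply means changing bootstrap.servers to localhost:29092.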

Client on another machine

If trying to connect from an external server, you'll need to advertise the external hostname/IP (e.g. 192.168.x.y) of the host as well as, or in place of, localhost.
Simply advertising localhost with a port forward will not work, because the Kafka protocol will still continue to advertise the listeners you've configured.

This setup requires Docker port forwarding and router port forwarding (and firewall / security group changes) if the client is not in the same local network, for example when your container is running in the cloud and you want to interact with it from your local machine.
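For example, if the Docker host's LAN address were 192.168.1.10 (a placeholder used only for illustration), the external listener would be advertised with that address while the internal one stays unchanged; in docker run terms this is the same -e KAFKA_ADVERTISED_LISTENERS=... flag as above:

Key: KAFKA_ADVERTISED_LISTENERS
Value: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://192.168.1.10:29092

Clients on other machines would then use 192.168.1.10:29092 as their bootstrap server.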

Client (or another broker) in a container, on the same host

This is the least error-prone configuration; you can use DNS service names directly.

When running an app in the Docker network, use kafka:9092 (see advertised PLAINTEXT listener config above) for bootstrap servers and zookeeper:2181 for Zookeeper, just like any other Docker service communication (doesn't require any port forwarding)


If you use separate docker run commands or Compose files, you need to define a shared network manually, using a Compose networks section or docker network create.
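For example, with plain docker run commands (the network name confluent matches the question; my-already-running-client is a hypothetical container name, and docker network connect is only needed for containers that are already running):

$ docker network create confluent
$ docker network connect confluent my-already-running-client   # hypothetical container name

With Compose, the equivalent is declaring the same network under the top-level networks: section of each file.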


See the example Compose file for the full Confluent stack, or a more minimal one for a single broker.

If using multiple brokers, they need unique hostnames and advertised listeners. See example.
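A hedged sketch of what a second broker could look like next to the single-broker command above (the kafka2 name, broker id 2, and host port 39092 are illustrative; each broker needs its own hostname, broker id, advertised listeners, and host port):

$ docker run -d \
    --net=confluent \
    --name=kafka2 \
    -p 39092:39092 \
    -e KAFKA_BROKER_ID=2 \
    -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
    -e KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT \
    -e KAFKA_LISTENERS=PLAINTEXT://0.0.0.0:9092,PLAINTEXT_HOST://0.0.0.0:39092 \
    -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka2:9092,PLAINTEXT_HOST://localhost:39092 \
    -e KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT \
    -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
    confluentinc/cp-kafka:4.1.0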

Related question

Connect to Kafka on host from Docker (ksqlDB)

Appendix

For anyone interested in Kubernetes deployments:

– OneCricketeer
13

When you first connect to a Kafka node, it gives you back all the Kafka nodes and the URLs to connect to. Then your application will try to connect to every broker directly.

The issue is always: what URL will Kafka give you? That's why there is KAFKA_ADVERTISED_LISTENERS, which Kafka uses to tell the world how it can be accessed.

Now for your use case, there are multiple small things to think about:

Let's say you set plaintext://kafka:9092

  • This is OK if you have an application in your Docker Compose setup that uses Kafka. That application will get from Kafka the kafka URL, which is resolvable through the Docker network.
  • If you try to connect from your main system, or from another container that is not on the same Docker network, this will fail, as the kafka name cannot be resolved.

==> To fix this, you would need a dedicated DNS server, such as a service-discovery one, which is a lot of trouble for something small. Or you manually map the kafka name to the container IP in each /etc/hosts file.

If you set plaintext://localhost:9092

  • This will be OK on your system if you have a port mapping (-p 9092:9092 when launching Kafka).
  • This will fail if you test from an application in a container (on the same Docker network or not), because localhost is the container itself, not the Kafka one.

==> If you have this and wish to use a Kafka client in another container, one way to fix it is to share the network namespace between the two containers (same IP).
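A hedged sketch of that network sharing with docker run (it assumes the broker container is named kafka, as in the question, and that a topic foo exists):

$ docker run --rm -it --network container:kafka confluentinc/cp-kafka:4.1.0 \
    kafka-console-consumer --bootstrap-server localhost:9092 --topic foo --from-beginning

Because the client container joins the broker container's network namespace, localhost:9092 now points at the broker.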

Last option: set an IP in the name: plaintext://x.y.z.a:9092 (the Kafka advertised URL cannot be 0.0.0.0, as stated in the docs: https://kafka.apache.org/documentation/#brokerconfigs_advertised.listeners).

This will be OK for everybody... BUT how can you get the x.y.z.a address?

The only way is to hardcode this IP when you launch the container: docker run .... --net confluent --ip 10.x.y.z .... Note that you need to adapt the IP to a valid IP in the confluent subnet.

– wargre
  • Thank you (+1). The second option worked for me, and I'll probably stick with it. I'll accept the answer later if none appears with a better solution (as you already mentioned, we lose the ability to establish a connection from within the container). – Sasha Shpota Aug 01 '18 at 13:11
1

First, start Zookeeper:

  1. docker container run --name zookeeper -p 2181:2181 zookeeper

Then start Kafka:

  1. docker container run --name kafka -p 9092:9092 -e KAFKA_ZOOKEEPER_CONNECT=192.168.8.128:2181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://ip_address_of_your_computer_but_not_localhost!!!:9092 -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 confluentinc/cp-kafka

In the Kafka consumer and producer config:

// These beans belong inside a Spring @Configuration class
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.context.annotation.Bean;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.ProducerFactory;

@Bean
public ProducerFactory<String, String> producerFactory() {
    Map<String, Object> configProps = new HashMap<>();
    configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.8.128:9092");
    configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    return new DefaultKafkaProducerFactory<>(configProps);
}

@Bean
public ConsumerFactory<String, String> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.8.128:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "group_id");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return new DefaultKafkaConsumerFactory<>(props);
}

I run my project with these settings. Good luck, dude.

  • `ip_address_of_your_computer_but_not_localhost` ... Localhost works fine, if you refer to my answer... And Compose would be better than `docker run` – OneCricketeer Aug 06 '20 at 18:14
  • 1
    It should not be localhost, because you have to think of your containers as an external system. That's why you should point it at the computer's IP address, not localhost. – İbrahim Ersin Yavaş Aug 10 '20 at 13:44
  • You can port forward from the container, then access from your host on `localhost`. Did you try the settings listed in my answer? Or read https://rmoff.net/2018/08/02/kafka-listeners-explained/ ? – OneCricketeer Aug 10 '20 at 15:52
0

This is what allows me to access Kafka at localhost:9092 from applications on my M1 Mac:

Key: KAFKA_ADVERTISED_LISTENERS
Value: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092

plus port forwarding:

ports:
   - "9092:9092"

Finally, again for my setup, I have to set the listeners key this way:

Key: KAFKA_LISTENERS
Value: PLAINTEXT://0.0.0.0:29092,PLAINTEXT_HOST://0.0.0.0:9092
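As a quick check from the host (assuming the Kafka CLI tools are installed locally):

$ kafka-topics --bootstrap-server localhost:9092 --list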
– Ton
  • This doesn't really add anything to the accepted answer. "M1 Mac" doesn't matter since all Docker hosts would act the same. Also port 9092 is being used "from the docker" here... – OneCricketeer Jul 21 '22 at 18:19
0

The simplest way to solve this is to add a custom hostname to your broker using the -h option:

docker run -d \
    --net=confluent \
    --name=kafka \
    -h broker-1 \
    -p 9092:9092 \
    -e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
    -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092 \
    -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
    confluentinc/cp-kafka:4.1.0

and edit your /etc/hosts

127.0.0.1   broker-1

and use:

props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092");
– Adán Escobar
  • 1
    No. Please do not edit `/etc/hosts`. With this code, you still get an error that `UnknownHost: kafka`... Just use `127.0.0.1:9092` here, and set `KAFKA_ADVERTISED_LISTENERS` accordingly to `PLAINTEXT://localhost:9092` – OneCricketeer Oct 18 '22 at 07:35
0
      KAFKA_BROKER_ID: 1
      KAFKA_ADVERTISED_HOST_NAME: kafka:9092
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

This configuration works fine for me.

Make sure that from inside Docker you connect with kafka:29092, and from outside the container with localhost:9092.

Full working Docker Compose config (a couple of verification commands are sketched after the file):

    version: "3.3"

    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:6.2.0
        container_name: zookeeper
        networks:
          - broker-kafka
        ports:
          - "2181:2181"
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
          ZOOKEEPER_TICK_TIME: 2000
          ALLOW_ANONYMOUS_LOGIN: yes
        volumes:
          - ./bitnami/zookeeper:/bitnami/zookeeper
    
      kafka:
        image: confluentinc/cp-kafka:6.2.0
        container_name: kafka
        networks:
          - broker-kafka
        depends_on:
          - zookeeper
        ports:
          - "9092:9092"
        expose:
          - "9092"
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ADVERTISED_HOST_NAME: kafka:9092
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
          KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
          # KAFKA_AUTO_CREATE_TOPICS_ENABLE: "false"
          # KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
          # KAFKA_CONFLUENT_LICENSE_TOPIC_REPLICATION_FACTOR: 1
          KAFKA_CONFLUENT_BALANCER_TOPIC_REPLICATION_FACTOR: 1
          # KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
        volumes:
          - ./bitnami/kafka:/bitnami/kafka
    
      kafdrop:
        image: obsidiandynamics/kafdrop
        container_name: kafdrop
        ports:
          - "9000:9000"
        expose:
          - "9000"
        networks:
          - broker-kafka
        environment:
          KAFKA_BROKERCONNECT: "PLAINTEXT://kafka:29092"
          JVM_OPTS: "-Xms16M -Xmx48M -Xss180K -XX:-TieredCompilation -XX:+UseStringDeduplication -noverify"
          SPRING_PROFILES_ACTIVE: "dev"
        depends_on:
          - kafka
          - zookeeper
    
      consumer:
        container_name: consumer
        build:
          context: ./consumer
          dockerfile: Dockerfile
        environment:
          - KAFKA_TOPIC_NAME=app
          - KAFKA_SERVER=kafka
          - KAFKA_PORT=29092
        ports:
          - 8001:8001
        restart: "always"
        depends_on:
          - zookeeper
          - kafka
          - publisher
          - kafdrop
        networks:
          - broker-kafka
    
      publisher:
        container_name: publisher
        build:
          context: ./producer
          dockerfile: Dockerfile
        environment:
          - KAFKA_TOPIC_NAME=app
          - KAFKA_SERVER=kafka
          - KAFKA_PORT=29092
        ports:
          - 8000:8000
        restart: "always"
        depends_on:
          - zookeeper
          - kafka
          - kafdrop
        networks:
          - broker-kafka
        volumes:
          - ./testproducer:/producer
    
    networks:
      broker-kafka:
        driver: bridge
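A couple of hedged verification commands for the file above (the first assumes the Kafka CLI tools are installed on the host; the second runs inside the kafka service container):

$ kafka-topics --bootstrap-server localhost:9092 --list
$ docker-compose exec kafka kafka-topics --bootstrap-server kafka:29092 --list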
– Tanjin Alam
  • Your container data isn't being preserved here, by the way. Also, have you tested this with connecting from a different machine? – OneCricketeer May 22 '23 at 13:46
  • @OneCricketeer I am trying to install/set up Kafka on my Mac using the above docker-compose.yml, but I'm getting error: unable to prepare context: path "./producer" not found. I also had to remove 4 spaces to get the yml to start working. – kamal Jun 10 '23 at 17:24
  • @OneCricketeer Created a directory ./producer but now getting error: failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to read dockerfile: open /var/lib/docker/tmp/buildkit-mount238317051/Dockerfile: no such file or directory – kamal Jun 11 '23 at 02:34
  • @kamal This isn't my answer. You can find plenty of existing Kafka compose files on the web, but as the error says, you have no producer folder next to your own compose file – OneCricketeer Jun 11 '23 at 11:54