I am learning Kafka and trying to use it with docker. I'm confused looking at the docker-compose files, so I wanted to ask my questions here.

In most examples, I see a config like this:

broker:
  image: confluentinc/cp-enterprise-kafka:5.3.1
  ...
  ports:
    - "29192:29092"
    - "9192:9092"
  environment:
    ...
    KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092

I wanted to understand a few things related to this:

  1. Is there a particular convention for using 5-digit (ex: 29092) port vs 4-digit (ex: 9092) ports?
  2. Is there a specific relation between the number of configured brokers to the number of listeners mentioned in KAFKA_ADVERTISED_LISTENERS? In some places, like confluent's own example, I see only one broker, but multiple KAFKA_ADVERTISED_LISTENERS. What does that mean?
  3. Also, in the same confluent example mentioned above, there is a listener: PLAINTEXT://broker:29092 in KAFKA_ADVERTISED_LISTENERS, but we do not have any broker or zookeeper configured at port 29092, so what is that doing?
Somjit

2 Answers

KAFKA_ADVERTISED_LISTENERS is a list of addresses for one particular broker. When a client first contacts the broker acting as "bootstrap server", it gets back the addresses of the brokers responsible for each partition. Clients need different addresses depending on which network they connect from. The "bootstrap server", knowing the advertised listeners of all the brokers, chooses the appropriate addresses to return based on which of its own listeners the client used to connect.

TL;DR

Source: https://rmoff.net/2018/08/02/kafka-listeners-explained/


  1. Is there a particular convention for using 5-digit (ex: 29092) port vs 4-digit (ex: 9092) ports?

I believe the only convention is 9092 as the broker port. When multiple ports are needed people seem to use a variation of it. I've seen 9093, and now 29092.

  2. Is there a specific relation between the number of configured brokers to the number of listeners mentioned in KAFKA_ADVERTISED_LISTENERS? In some places, like confluent's own example, I see only one broker, but multiple KAFKA_ADVERTISED_LISTENERS. What does that mean?

There is no relation. Each broker has its own KAFKA_ADVERTISED_LISTENERS list. It refers to that particular broker only, not the other brokers.

When a client connects to Kafka, it first talks to a broker acting as "bootstrap server". That broker then responds with metadata, including the addresses of the brokers the client should contact for particular partitions. These addresses depend on which network the client is coming from, e.g. localhost:9092 for clients on the Docker host vs. broker:29092 for clients within the Docker network. That's why one broker can have multiple KAFKA_ADVERTISED_LISTENERS.
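To make the lookup concrete, here is a toy model in Python. It is only a sketch of the idea, not how Kafka implements it: the function names parse_advertised_listeners and address_for are made up for illustration, and the listener string is the one from the question.

```python
# Toy model: one broker, several advertised addresses, one per listener name.
# The helper names below are hypothetical and not part of any Kafka API.

ADVERTISED = "PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092"


def parse_advertised_listeners(value):
    """Map each listener name to the (host, port) advertised for it."""
    listeners = {}
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        host, _, port = address.rpartition(":")
        listeners[name] = (host, int(port))
    return listeners


def address_for(listener_name, advertised=ADVERTISED):
    """Return the address handed to a client that connected on this listener."""
    return parse_advertised_listeners(advertised)[listener_name]


if __name__ == "__main__":
    # A client inside the Docker network connects on the PLAINTEXT listener.
    print(address_for("PLAINTEXT"))
    # A client on the Docker host connects on the PLAINTEXT_HOST listener.
    print(address_for("PLAINTEXT_HOST"))
```

The point is that the listener a client arrives on selects the address it is told to use afterwards, so a single broker needs one advertised entry per network it is reached from.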

  3. Also, in the same confluent example mentioned above, there is a listener: PLAINTEXT://broker:29092 in KAFKA_ADVERTISED_LISTENERS, but we do not have any broker or zookeeper configured at port 29092, so what is that doing?

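In the Confluent example, that KAFKA_ADVERTISED_LISTENERS line is effectively what configures the broker to listen on 29092 (strictly speaking, KAFKA_LISTENERS controls what the broker binds to, but as far as I know the Confluent image derives it from the advertised listeners when it is not set). All clients internal to Docker use this port to reach the broker.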

Btw, 29092 doesn't need to be in the ports: mapping, as it's only used by clients inside the Docker network (it doesn't need to be exposed on a host port). Only 9092 needs to be exposed outside of Docker in this example.

Victor Basso

Is there a particular convention for using 5-digit (ex: 29092) port vs 4-digit (ex: 9092) ports?

To avoid port conflicts at run time, configure application ports outside the server's ephemeral port range. Refer to this question and answer to see how to find or modify the ephemeral port range. So the convention is: pick a port outside the ephemeral port range and use it as the application port.
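As a rough illustration of that check, here is a small Python sketch. It assumes a Linux host, where the range is exposed at /proc/sys/net/ipv4/ip_local_port_range, and falls back to the common Linux default of 32768-60999 if that file cannot be read. The function names are made up for this example.

```python
from pathlib import Path

# Common Linux default, used only when the kernel setting cannot be read.
DEFAULT_RANGE = (32768, 60999)


def ephemeral_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the (low, high) ephemeral port range, or fall back to the default."""
    try:
        low, high = Path(path).read_text().split()
        return int(low), int(high)
    except (OSError, ValueError):
        return DEFAULT_RANGE


def is_outside_ephemeral_range(port, port_range):
    """True if the port does not fall inside the inclusive ephemeral range."""
    low, high = port_range
    return not (low <= port <= high)


if __name__ == "__main__":
    current = ephemeral_port_range()
    for port in (9092, 29092):
        print(port, is_outside_ephemeral_range(port, current))
```

With the default Linux range, both 9092 and 29092 fall outside it, so by this criterion either is a reasonable choice.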

Is there a specific relation between the number of configured brokers to the number of listeners mentioned in KAFKA_ADVERTISED_LISTENERS? In some places, like confluent's own example, I see only one broker, but multiple KAFKA_ADVERTISED_LISTENERS. What does that mean?

Kafka can be deployed as a cluster across multiple nodes. Each node has its own IP and port number in its config/server.properties file. The number of listeners and advertised listeners depends on the number of nodes in your Kafka deployment.

Also, in the same confluent example mentioned above, there is a listener: PLAINTEXT://broker:29092 in KAFKA_ADVERTISED_LISTENERS, but we do not have any broker or zookeeper configured at port 29092, so what is that doing?

Kafka needs a ZooKeeper deployment for cluster management, regardless of whether there is one Kafka node or several. But it can use any port, and there is no hard and fast rule that the port must be 29092.

Steephen
    What you answered is a bit vague, even though I think the questions were pretty to-the-point. – Somjit Oct 08 '20 at 15:02