
I am writing end-to-end tests for an application. I start an instance of an application, a Kafka instance, and a Zookeeper (all Dockerized) and then I interact with the application API to test its functionality. I need to test an event consumer's functionality in this application. I publish events from my tests and the application is expected to handle them.

Problem: If I run the application locally (not in Docker) and run tests that would produce events, the consumer in the application code handles events correctly. In this case, the consumer and the test have bootstrapServers set to localhost:9092. But if the application is run as a Dockerized instance it doesn't see the events. In this case bootstrapServers are set to kafka:9092 in the application and localhost:9092 in the test where kafka is a Docker container name. The kafka container exposes its 9092 port to the host so that the same instance of Kafka can be accessed from inside a Docker container and from the host (running my tests).

The only difference in the code is localhost vs kafka set as bootstrap servers. In both scenarios consumers and producers start successfully; events are published without errors. It is just that in one case the consumer doesn't receive events.

Question: How to make Dockerized consumers see events posted from the host machine?

Note: I have a properly configured Docker network which includes the application instance, Zookeeper, and Kafka. They all "see" each other. The corresponding ports of kafka and zookeeper are exposed to the host. Kafka ports: 0.0.0.0:9092->9092/tcp. Zookeeper ports: 22/tcp, 2888/tcp, 3888/tcp, 0.0.0.0:2181->2181/tcp.

I am using wurstmeister/kafka and wurstmeister/zookeeper Docker images (I cannot replace them).
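For context, the setup is roughly equivalent to the following compose file (simplified; the service names and values here are illustrative, not my exact config):

```yaml
version: '2'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka:
    image: wurstmeister/kafka
    ports:
      - "9092:9092"
    environment:
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
```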

Any ideas/thoughts are appreciated. How would you debug it?

UPDATE: The issue was with KAFKA_ADVERTISED_LISTENERS and KAFKA_LISTENERS env variables that were set to different ports for INSIDE and OUTSIDE communications. The solution was to use a correct port in the application code when it is run inside a Docker container.
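For anyone hitting the same issue: the broker configuration in question looked roughly like this (the listener names and exact ports here are illustrative):

```yaml
environment:
  KAFKA_LISTENERS: INSIDE://0.0.0.0:9093,OUTSIDE://0.0.0.0:9092
  KAFKA_ADVERTISED_LISTENERS: INSIDE://kafka:9093,OUTSIDE://localhost:9092
  KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INSIDE:PLAINTEXT,OUTSIDE:PLAINTEXT
  KAFKA_INTER_BROKER_LISTENER_NAME: INSIDE
```

With a configuration like this, the application running inside Docker must connect to kafka:9093 (the INSIDE listener), while tests on the host connect to localhost:9092 (the OUTSIDE listener). My application was using the container name with the wrong port.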

Sasha Shpota
  • Can you post a minimal bash script / docker-compose file that sets up all the containers just like you have them so we can reproduce? – Omer Tuchfeld Oct 09 '20 at 16:12
  • @Omer the setup is more complex but I'll try to prepare an example. In the meantime, if you have any ideas/thoughts I would be happy to hear them. – Sasha Shpota Oct 09 '20 at 16:18
  • What is the OS of your host machine? And what version of Docker (*Client and Server/Docker Engine*) is installed on your host? Some things are simply pointless to even *think about* trying to troubleshoot without an [*MRE*](https://stackoverflow.com/help/minimal-reproducible-example). Docker containers is one of those things. Docker Compose, even more so. Any answer you get prior to sharing an MRE will just be a wild guess. If you want something better'n a wild guess, it behooves you to *do the needful* — „*…a properly configured docker network…*“ — Please elaborate on the details of that? TIA – deduper Oct 11 '20 at 22:00
  • Try replacing "localhost" with the actual ip address assigned to the container. 127.0.0.1, or 127.0.0.2...etc – z atef Oct 11 '20 at 23:26

4 Answers


These kinds of issues are usually related to the way Kafka advertises the broker's address.

When a Kafka broker starts, it binds to 0.0.0.0:9092 and registers itself in Zookeeper under the address <hostname>:9092. When you connect with a client, Zookeeper is contacted to fetch the address of the specific broker.

This means that when you start a Kafka container you have a situation like the following:

  • container name: kafka
  • network name: kafkanet
  • hostname: kafka
  • registration on zookeeper: kafka:9092

Now if you connect a client to your Kafka from a container inside the kafkanet network, the address you get back from Zookeeper is kafka:9092 which is resolvable through the kafkanet network.

However, if you connect to Kafka from outside docker (i.e. using the localhost:9092 endpoint mapped by docker), you still get back the kafka:9092 address, which is not resolvable from the host.

In order to address this issue you can specify the advertised.host.name and advertised.port in the broker configuration in such a way that the address is resolvable by all clients (see documentation).

What is usually done is to set advertised.host.name as <container-name>.<network> (in your case something like kafka.kafkanet) so that any container connected to the network is able to correctly resolve the IP of the Kafka broker.

In your case however you have a mixed network configuration, as some components live inside docker (hence able to resolve the kafkanet network) while others live outside it. If it were a production system my suggestion would be to set the advertised.host.name to the DNS/IP of the host machine and always rely on docker port mapping to reach the Kafka broker.

From my understanding, however, you only need this setup to test things out, so the easiest thing would be to "trick" the system living outside docker. Using the naming specified above, this simply means adding the line 127.0.0.1 kafka.kafkanet to your /etc/hosts (or the Windows equivalent).
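Concretely, assuming the kafka.kafkanet naming above, on Linux/macOS that would be:

```shell
# Map the advertised hostname to loopback (requires root);
# on Windows edit C:\Windows\System32\drivers\etc\hosts instead
echo "127.0.0.1 kafka.kafkanet" | sudo tee -a /etc/hosts

# Verify the name now resolves to loopback
getent hosts kafka.kafkanet
```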

This way when your client living outside docker connects to Kafka the following should happen:

  1. client -> Kafka via localhost:9092
  2. Kafka returns the advertised host kafka.kafkanet
  3. client resolves kafka.kafkanet to 127.0.0.1
  4. client -> Kafka via 127.0.0.1:9092

EDIT

As pointed out in a comment, newer Kafka versions use the concepts of listeners and advertised.listeners, which replace host.name and advertised.host.name (both deprecated and only used when the newer options are not specified). The general idea is the same, however:

  • host.name: specifies the host to which the Kafka broker should bind itself (works in conjunction with port)
  • listeners: specifies all the endpoints to which the Kafka broker should bind (for instance PLAINTEXT://0.0.0.0:9092,SSL://0.0.0.0:9091)
  • advertised.host.name: specifies how the broker is advertised to clients (i.e. which address clients should use to connect to it)
  • advertised.listeners: specifies all the advertised endpoints (for instance PLAINTEXT://kafka.example.com:9092,SSL://kafka.example.com:9091)

In both cases, for clients to communicate with Kafka successfully, they need to be able to resolve and connect to the advertised hostname and port. And in both cases, if the values are not specified, the broker derives them automatically from the hostname of the machine it is running on.
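Putting it together, a typical dual-listener broker configuration looks like this in server.properties terms (the listener names and hostnames are examples):

```properties
listeners=INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093
advertised.listeners=INTERNAL://kafka.kafkanet:9092,EXTERNAL://localhost:9093
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
inter.broker.listener.name=INTERNAL
```

Clients inside the Docker network connect via the INTERNAL listener and get kafka.kafkanet:9092 back; clients on the host connect via the EXTERNAL listener and get localhost:9093.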

nivox

You kept referencing 8092. Was that intentional? Kafka runs on 9092 by default. The easiest test is to download the same version of Kafka and manually run its kafka-console-consumer and kafka-console-producer scripts to see if you can pub-sub from your host machine.
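Something along these lines, run from the host against the mapped port (adjust the Kafka install path and topic name to your setup):

```shell
# Create a throwaway topic through the host-mapped port
bin/kafka-topics.sh --create --topic smoke-test \
  --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1

# Produce one message...
echo "hello" | bin/kafka-console-producer.sh \
  --topic smoke-test --bootstrap-server localhost:9092

# ...and try to consume it back. If this works from the host but the
# Dockerized app still sees nothing, suspect the advertised listeners.
bin/kafka-console-consumer.sh --topic smoke-test \
  --bootstrap-server localhost:9092 --from-beginning --max-messages 1
```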

Mike Thomsen
    This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - [From Review](/review/low-quality-posts/27365747) – jnovack Oct 11 '20 at 23:52
  • Thank you (+1). The `8092` port is a mistake in the question. I have them all as `9092`. – Sasha Shpota Oct 12 '20 at 07:50
  • @jnovack, in your zeal to criticize my post you neglected to mention that I provided the requester with a solid method for debugging their particular setup. – Mike Thomsen Oct 12 '20 at 22:41
  • And I'm proud of you. But debugging is not an "answer", debugging is a comment. https://meta.stackoverflow.com/questions/316781/are-go-fix-debug-the-code-answers-not-an-answer – jnovack Oct 13 '20 at 12:34

Did you try "host.docker.internal" as the bootstrap server in the Dockerized application?
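That is, pointing the containerized application at the host's mapped port instead of the container name. An untested sketch (on Docker Desktop host.docker.internal is built in; on Linux it needs the extra --add-host flag):

```shell
docker run --add-host=host.docker.internal:host-gateway \
  -e BOOTSTRAP_SERVERS=host.docker.internal:9092 my-app
```

Here my-app and the BOOTSTRAP_SERVERS variable are placeholders for your own image and configuration.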

Keaz

You could create a docker network for your containers; the containers will then be able to resolve each other's hostnames and communicate.

Note: this works with docker-compose as well as with standalone containers.
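For example, with standalone containers (the container names mirror the question's setup; the wurstmeister images need additional environment variables, omitted here for brevity):

```shell
docker network create kafkanet
docker run -d --network kafkanet --name zookeeper wurstmeister/zookeeper
docker run -d --network kafkanet --name kafka -p 9092:9092 wurstmeister/kafka
```

Containers on kafkanet can then reach each other by the names zookeeper and kafka.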

user2851729