I have been struggling to deploy Kafka as a service in a GitLab job, especially when it comes to exposing ports. I have a Node app whose tests need a Kafka broker as well as a schema registry. When running locally, I have no issue with the following docker compose file:

version: "3"
services:
  zookeeper:
    image: 'bitnami/zookeeper:latest'
    ports:
      - '2181:2181'
    environment:
      - ALLOW_ANONYMOUS_LOGIN=yes
    networks:
      - app-tier
  kafka:
    image: 'bitnami/kafka:latest'
    ports:
      - '9092:9092'
    environment:
      - KAFKA_BROKER_ID=1
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
      - ALLOW_PLAINTEXT_LISTENER=yes
    depends_on:
      - zookeeper
    networks:
      - app-tier

  kafka-shema-registry:
    networks:
      - app-tier
    container_name: kafka-schema-registry
    image: bitnami/schema-registry:7.2.5-debian-11-r23
    ports:
      - 8081:8081
    environment:
      - SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081
      - SCHEMA_REGISTRY_KAFKA_BROKERS=PLAINTEXT://kafka:9092

networks:
  app-tier:
    driver: bridge

ZooKeeper, Kafka and the registry all communicate with each other on the Docker network app-tier while exposing the ports I need on my localhost. In my app, I then connect to the broker and the registry using localhost:9092 and http://localhost:8081.

Now I would like to run my tests in CI, and to do that I would like to deploy Kafka as a service. My understanding of this article https://docs.gitlab.com/ee/ci/services/ is that ports are exposed from the container by default, but I could not get Node to connect to the broker.

Here is my CI file:

---
image: node:18.2.0

stages:
  - test

unit-test:
  allow_failure: true
  stage: test
  script:
    - apt-get update
    - apt-get install -y lsof
    - cat /etc/hosts
    - netstat
    - echo waiting for kafka broker to start
    - sleep 10
    - npm run test
  variables:
    CI_DEBUG_SERVICES: "true"
    KAFKA_BROKER_ID: "1"
    KAFKA_CFG_LISTENERS: "PLAINTEXT://:9092,PLAINTEXT://:9094,CONTROLLER://:9093"
    KAFKA_CFG_ADVERTISED_LISTENERS: "PLAINTEXT://127.0.0.1:9092,PLAINTEXT://kafka:9094,CONTROLLER://kafka:9093"
    KAFKA_CFG_ZOOKEEPER_CONNECT: "zookeeper:2181"
    ALLOW_PLAINTEXT_LISTENER: "yes"
    ALLOW_ANONYMOUS_LOGIN: "yes"
    SCHEMA_REGISTRY_LISTENERS: "http://0.0.0.0:8081"
    SCHEMA_REGISTRY_KAFKA_BROKERS: "PLAINTEXT://kafka:9094"
    KAFKA_BROKER_URL: "kafka:9092"
    REGISTRY_HOST_URL: "http://kafka:8081"
    DEBUG: "intent:kafka"
  services:
    - alias: 'zookeeper'
      name: 'bitnami/zookeeper:latest'
    - alias: 'kafka'
      name: 'bitnami/kafka:latest'
    - alias: 'kafka-shema-registry'
      name: bitnami/schema-registry:7.2.5-debian-11-r23
  tags:
    - powerfull

before_script:
  - npm ci --cache .npm --prefer-offline

I have tried connecting using kafka:9094, kafka:9092 and localhost:9092, but I always get an output like this one:

Connection error: connect ECONNREFUSED 127.0.0.1:9092

This is the output from npm run test:

    Running with gitlab-runner 15.11.0 (436955cb)
      on gitlab-runner-ram-fbfcb6b7d-d424g KkQkDykg, system ID: r_LUPMRVDAr2J9
    section_start:1689752965:prepare_executor
    Preparing the "kubernetes" executor
    Using Kubernetes namespace: gitlab-runner
    Using Kubernetes executor with image node:18.2.0 ...
    Using attach strategy to execute scripts...
    section_end:1689752965:prepare_executor
    section_start:1689752965:prepare_script
    Preparing environment
    Waiting for pod gitlab-runner/runner-kkqkdykg-project-25355260-concurrent-08szm2 to be running, status is Pending
    Waiting for pod gitlab-runner/runner-kkqkdykg-project-25355260-concurrent-08szm2 to be running, status is Pending
        ContainersNotReady: "containers with unready status: [build helper svc-0 svc-1 svc-2]"
        ContainersNotReady: "containers with unready status: [build helper svc-0 svc-1 svc-2]"
    [service:bitnami/zookeeper-zookeeper] 2023-07-19T07:49:31.236228815Z zookeeper 07:49:31.23
    [service:bitnami/zookeeper-zookeeper] 2023-07-19T07:49:31.238723714Z zookeeper 07:49:31.23 Welcome to the Bitnami zookeeper container
...
[service:bitnami/zookeeper-zookeeper] 2023-07-19T07:49:34.728653661Z 2023-07-19 07:49:34,721 [myid:1] - INFO  [main:o.a.z.s.RequestThrottler@75] - zookeeper.request_throttler.shutdownTimeout = 10000 ms
[service:bitnami/zookeeper-zookeeper] 2023-07-19T07:49:34.812561076Z 2023-07-19 07:49:34,811 [myid:1] - INFO  [main:o.a.z.s.ContainerManager@84] - Using checkIntervalMs=60000 maxPerMinute=10000 maxNeverUsedIntervalMs=0
[service:bitnami/zookeeper-zookeeper] 2023-07-19T07:49:34.815225626Z 2023-07-19 07:49:34,814 [myid:1] - INFO  [main:o.a.z.a.ZKAuditProvider@42] - ZooKeeper audit is disabled.
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.029867092Z kafka 07:49:32.02
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.033027585Z kafka 07:49:32.03 Welcome to the Bitnami kafka container
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.035939317Z kafka 07:49:32.03 Subscribe to project updates by watching https://github.com/bitnami/containers
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.039900140Z kafka 07:49:32.03 Submit issues and feature requests at https://github.com/bitnami/containers/issues
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.043292084Z kafka 07:49:32.04
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.046177659Z kafka 07:49:32.04 INFO  ==> ** Starting Kafka setup **
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.180679312Z kafka 07:49:32.17 WARN  ==> You set the environment variable ALLOW_PLAINTEXT_LISTENER=yes. For safety reasons, do not use this flag in a production environment.
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.194714228Z kafka 07:49:32.19 INFO  ==> Initializing Kafka...
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.227169967Z kafka 07:49:32.21 INFO  ==> No injected configuration files found, creating default config files
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.561988494Z kafka 07:49:32.56 INFO  ==> Initializing KRaft...
[service:bitnami/kafka-kafka] 2023-07-19T07:49:32.564819286Z kafka 07:49:32.56 WARN  ==> KAFKA_KRAFT_CLUSTER_ID not set - If using multiple nodes then you must use the same Cluster ID for each one
[service:bitnami/kafka-kafka] 2023-07-19T07:49:35.772983910Z kafka 07:49:35.77 INFO  ==> Generated Kafka cluster ID 'KNim1LQkQZSeg1uhzPG3XA'
[service:bitnami/kafka-kafka] 2023-07-19T07:49:35.779592721Z kafka 07:49:35.77 INFO  ==> Formatting storage directories to add metadata...
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:32.905123007Z schema-registry 07:49:32.90
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:32.908382287Z schema-registry 07:49:32.90 Welcome to the Bitnami schema-registry container
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:32.912092156Z schema-registry 07:49:32.91 Subscribe to project updates by watching https://github.com/bitnami/containers
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:32.913591468Z schema-registry 07:49:32.91 Submit issues and feature requests at https://github.com/bitnami/containers/issues
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:32.919497289Z schema-registry 07:49:32.91
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:32.922864592Z schema-registry 07:49:32.92 INFO  ==> ** Starting Schema Registry setup **
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:32.948460379Z schema-registry 07:49:32.94 INFO  ==> Validating settings in SCHEMA_REGISTRY_* env vars
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:33.018414124Z schema-registry 07:49:33.01 INFO  ==> Initializing Schema Registry
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:33.021856480Z realpath: /bitnami/schema-registry/etc: No such file or directory
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:33.023845679Z schema-registry 07:49:33.02 INFO  ==> No injected configuration files found, creating config file based on SCHEMA_REGISTRY_* env vars
[service:bitnami/schema-registry-kafka-shema-registry] 2023-07-19T07:49:33.075834475Z schema-registry 07:49:33.07 INFO  ==> Waiting for Kafka brokers to be up
Running on runner-kkqkdykg-project-25355260-concurrent-08szm2 via gitlab-runner-ram-fbfcb6b7d-d424g...

section_end:1689752977:prepare_script
section_start:1689752977:get_sources
Getting source from Git repository
Fetching changes with git depth set to 50...
Initialized empty Git repository in /builds/intent-technologies/back/npm_modules/kafka/.git/
Created fresh repository.
Checking out 531c8b60 as detached HEAD (ref is 3-migration-kafka-js)...

Skipping Git submodules setup

section_end:1689752980:get_sources
section_start:1689752980:step_script
Executing "step_script" stage of the job script
$ npm ci --cache .npm --prefer-offline
npm WARN deprecated stringify-package@1.0.1: This module is not used anymore, and has been replaced by @npmcli/package-json

> @intent/kafka@3.0.0 prepare
> npx husky install

npm WARN exec The following package was not found and will be installed: husky
husky - Git hooks installed

added 501 packages, and audited 502 packages in 14s

67 packages are looking for funding
  run `npm fund` for details

2 high severity vulnerabilities

To address all issues (including breaking changes), run:
  npm audit fix --force

Run `npm audit` for details.
$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1   localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.2.141.112    runner-kkqkdykg-project-25355260-concurrent-08szm2

# Entries added by HostAliases.
127.0.0.1   bitnami-zookeeper   zookeeper   bitnami-kafka   kafka   bitnami-schema-registry kafka-shema-registry
$ echo waiting for kafka broker to start
waiting for kafka broker to start
$ sleep 10
$ npm run test

> @intent/kafka@3.0.0 test
> node -r tsconfig-paths/register -r ts-node/register bin/test/test.ts


Local Schema (src/__tests__/local.spec.ts)
BROKER HOST is kafka:9092
  ✖ it should produce and consume a message with a valid payload (708ms)
BROKER HOST is kafka:9092
  ✖ it should fail when producing a message with an invalid payload (903ms)

Registry (src/__tests__/registry.spec.ts)
Connecting to registry
2023-07-19T07:50:09.583Z intent:kafka Registering kafka schema my-topic
  ✖ it should produce and consume a message with a valid payload (121ms)
Connecting to registry
2023-07-19T07:50:09.699Z intent:kafka Registering kafka schema my-topic
  ✖ it should fail when producing a message with an invalid payload (31ms)

Timeouts (src/__tests__/timeout.spec.ts)
  ✖ it should timeout while consuming a message (988ms)
  ✖ it should send heartbeat while consuming a message (1s)
  ✖ it should automatically send heartbeat while consuming a message (714ms)

 FAILED 

Tests   : 7 failed (7)
Time    : 5s


✖ it should produce and consume a message with a valid payload
  (Setup hook)

   KafkaJSNumberOfRetriesExceeded: Connection error: connect ECONNREFUSED 127.0.0.1:9092

   ⁃ Socket.onError
     node_modules/kafkajs/src/network/connection.js:210


✖ it should fail when producing a message with an invalid payload
  (Setup hook)

   KafkaJSNumberOfRetriesExceeded: Connection error: connect ECONNREFUSED 127.0.0.1:9092

   ⁃ Socket.onError
     node_modules/kafkajs/src/network/connection.js:210


✖ it should produce and consume a message with a valid payload
  (Setup hook)

   ResponseError: Confluent_Schema_Registry - Error, status 400: connect ECONNREFUSED 127.0.0.1:8081

   ⁃ anonymous
     node_modules/@kafkajs/confluent-schema-registry/src/api/middleware/errorMiddleware.ts:33


✖ it should fail when producing a message with an invalid payload
  (Setup hook)

   ResponseError: Confluent_Schema_Registry - Error, status 400: connect ECONNREFUSED 127.0.0.1:8081

   ⁃ anonymous
     node_modules/@kafkajs/confluent-schema-registry/src/api/middleware/errorMiddleware.ts:33


✖ it should timeout while consuming a message
  (Setup hook)

   KafkaJSNumberOfRetriesExceeded: Connection error: connect ECONNREFUSED 127.0.0.1:9092

   ⁃ Socket.onError
     node_modules/kafkajs/src/network/connection.js:210


✖ it should send heartbeat while consuming a message
  (Setup hook)

   KafkaJSNumberOfRetriesExceeded: Connection error: connect ECONNREFUSED 127.0.0.1:9092

   ⁃ Socket.onError
     node_modules/kafkajs/src/network/connection.js:210


✖ it should automatically send heartbeat while consuming a message
  (Setup hook)

   KafkaJSNumberOfRetriesExceeded: Connection error: connect ECONNREFUSED 127.0.0.1:9092

   ⁃ Socket.onError
     node_modules/kafkajs/src/network/connection.js:210



section_end:1689753013:step_script
section_start:1689753013:cleanup_file_variables
Cleaning up project directory and file based variables

section_end:1689753014:cleanup_file_variables
ERROR: Job failed: command terminated with exit code 1

One potential issue I see is that kafka resolves to 127.0.0.1 on the network (see the result of cat /etc/hosts), and I am not sure what to put in ADVERTISED_LISTENERS. Another is that my runner is in Kubernetes, and I have no idea whether this could have an impact. Thanks for your help.

Zillon
  • Another potential issue: since there is no way to check whether a service has started properly, the job may be starting too soon after the services are launched. – Zillon Jul 19 '23 at 15:24

1 Answer

  1. Never edit /etc/hosts on your own. It also shouldn't be where you debug the issue.

  2. When KAFKA_CFG_ADVERTISED_LISTENERS contains PLAINTEXT://127.0.0.1:9092 and you connect to kafka:9092, this will "work" for the initial connection, but the Kafka service will then return 127.0.0.1, causing the Kafka client (the NodeJS container) to connect to itself, not the broker. Therefore, kafka:9094 is correct in all places, and you should remove the 127.0.0.1 address entirely, since nothing inside the broker service needs a connection that cannot also use kafka:9094.

https://www.confluent.io/blog/kafka-listeners-explained/

Connect to Kafka running in Docker
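To make the two-step connection described above concrete, here is a minimal sketch of how a broker picks the address it hands back to a client. It is a simplified model, not Kafka's actual code: the function names `parseListeners` and `advertisedFor` and the listener names `INTERNAL` and `EXTERNAL` are hypothetical and only serve the illustration. It assumes each listener has its own name and port.

```typescript
// Simplified model of Kafka's bootstrap-then-metadata handshake.
// All names here are illustrative assumptions, not part of any Kafka API.
type Endpoint = { host: string; port: number };

// Parse "NAME://host:port,NAME2://host:port" into a name -> endpoint lookup.
function parseListeners(spec: string): Record<string, Endpoint> {
  const result: Record<string, Endpoint> = {};
  for (const entry of spec.split(",")) {
    const [name, address] = entry.trim().split("://");
    const idx = address.lastIndexOf(":");
    result[name] = {
      host: address.slice(0, idx),
      port: Number(address.slice(idx + 1)),
    };
  }
  return result;
}

// Return the address a client is told to use after bootstrapping on
// `bootstrapPort`: the advertised endpoint of the listener bound to that port.
function advertisedFor(
  listeners: string,
  advertised: string,
  bootstrapPort: number
): string | undefined {
  const bound = parseListeners(listeners);
  const adv = parseListeners(advertised);
  for (const name of Object.keys(bound)) {
    if (bound[name].port === bootstrapPort && adv[name]) {
      return `${adv[name].host}:${adv[name].port}`;
    }
  }
  return undefined;
}

const listeners = "INTERNAL://:9094,EXTERNAL://:9092";
const advertised = "INTERNAL://kafka:9094,EXTERNAL://127.0.0.1:9092";
console.log(advertisedFor(listeners, advertised, 9092)); // 127.0.0.1:9092
console.log(advertisedFor(listeners, advertised, 9094)); // kafka:9094
```

In this model, a client that bootstraps on port 9092 is redirected to 127.0.0.1:9092, which is its own loopback address when the client runs in a different container from the broker.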

Regarding "my runner is in kube":

You should use a fully qualified service name, if possible - https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/

But you could also try Strimzi to run Kafka, rather than manually configuring Bitnami containers (also, that image no longer uses ZooKeeper by default).

Also, bitnami/kafka:latest has several bugs. Look at its issue tracker on GitHub.

Other note: http://kafka:8081 should use the registry's service name, not the broker's.
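One way to keep the broker and registry addresses from getting mixed up is to derive both from the CI variables in a single place. The sketch below is only an illustration: the helper `clientSettings` is hypothetical, the variable names `KAFKA_BROKER_URL` and `REGISTRY_HOST_URL` and the alias `kafka-shema-registry` are taken from the question's CI file, and the default values are assumptions.

```typescript
// Hypothetical helper: collect the test client's endpoints from CI variables.
type ClientSettings = { brokers: string[]; registryUrl: string };

function clientSettings(env: Record<string, string | undefined>): ClientSettings {
  // Defaults are assumptions: the broker alias with the port discussed above,
  // and the registry's own service alias (spelled as in the CI file).
  const broker = env.KAFKA_BROKER_URL || "kafka:9094";
  const registry = env.REGISTRY_HOST_URL || "http://kafka-shema-registry:8081";
  return {
    // Allow a comma-separated list of brokers in the variable.
    brokers: broker.split(",").map((b) => b.trim()),
    registryUrl: registry,
  };
}

// With no variables set, the defaults point at two different service aliases.
console.log(clientSettings({}));
```

The resulting `brokers` array could then be passed to kafkajs as `new Kafka({ brokers })`, and `registryUrl` to `new SchemaRegistry({ host })` from @kafkajs/confluent-schema-registry.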

OneCricketeer
  • Thanks for your answer. I did not edit /etc/hosts, I just cat it to make sure kafka properly resolves to something after the service starts in CI. In the logs you can see it resolves to 127.0.0.1. I tried using kafka:9094, but if I do, the error in node is: ECONNREFUSED 127.0.0.1:9094 (I understand this is because kafka resolves to 127.0.0.1, which is why I tried many combinations around KAFKA_CFG_ADVERTISED_LISTENERS). Can you please point out the image you advise me to use? This one seems to have been inactive for some time now: https://hub.docker.com/r/strimzi/kafka – Zillon Jul 20 '23 at 08:10
  • The more I think about it, the more it seems possible to me that the issue is that the CI does not wait for the services to initialize properly before running the script part. For instance, I do not understand why the logs from the services stop in the middle of their initialization. See here: https://gitlab.com/gitlab-org/gitlab/-/issues/30353 – Zillon Jul 20 '23 at 08:30
  • It's true the Kafka container takes up to a minute to start. But also, as mentioned, the bitnami container has its own set of issues that people keep reporting (logs come, but it just crashes/hangs without much detail and without setting `BITNAMI_DEBUG=true`, you'd not see the issue). Strimzi is used, but that's for Kubernetes Operator, and they pull from quay I think, not Dockerhub. In any case, loopback address shouldn't be used to connect to distinct containers on a network bridge – OneCricketeer Jul 20 '23 at 13:02
  • Alright thank you, I'll see if I get better results with a different image, especially one that does not rely on a running instance of zookeeper. I managed to start node in a container running on the same docker network as kafka, using kafka:9092 everywhere, but that just can't be done in the context of my CI. – Zillon Jul 20 '23 at 13:10
  • Yeah, I don't have GitLab experience, but their docs seem to show using `Host: redis` or `postgres`, for example, so that's how you need to connect... But you also need to configure Kafka to return the correct value. Zookeeper isn't the issue, but it's also no longer needed (or used in your example) – OneCricketeer Jul 20 '23 at 19:37
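
On the startup-timing concern raised in the comments, a fixed `sleep 10` could be replaced by polling the broker port until it accepts connections. This is a minimal sketch using only Node's built-in `net` module. The helper names (`parseHostPort`, `canConnect`, `waitForPort`) and the retry counts are assumptions. Note that an open TCP port only shows the process is listening, not that Kafka has finished initializing.

```typescript
import * as net from "node:net";

// Split "host:port" into its parts (hypothetical helper).
function parseHostPort(address: string): { host: string; port: number } {
  const idx = address.lastIndexOf(":");
  return { host: address.slice(0, idx), port: Number(address.slice(idx + 1)) };
}

// Try to open a TCP connection once; resolve true on success, false otherwise.
function canConnect(host: string, port: number, timeoutMs: number): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}

// Poll until the port accepts a connection, or give up after `attempts` tries.
async function waitForPort(address: string, attempts = 30, delayMs = 2000): Promise<void> {
  const { host, port } = parseHostPort(address);
  for (let i = 0; i < attempts; i++) {
    if (await canConnect(host, port, delayMs)) return;
    await new Promise<void>((r) => setTimeout(r, delayMs));
  }
  throw new Error(`${address} not reachable after ${attempts} attempts`);
}
```

A test setup hook could call `await waitForPort(process.env.KAFKA_BROKER_URL ?? "kafka:9094")` before creating the Kafka client.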