I have a Kafka cluster in Kubernetes created using Strimzi.
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
name: {{ .Values.cluster.kafka.name }}
spec:
kafka:
version: 2.7.0
replicas: 3
storage:
deleteClaim: true
size: {{ .Values.cluster.kafka.storagesize }}
type: persistent-claim
rack:
topologyKey: failure-domain.beta.kubernetes.io/zone
template:
pod:
metadata:
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '9404'
listeners:
- name: plain
port: 9092
type: internal
tls: false
- name: tls
port: 9093
type: internal
tls: true
authentication:
type: tls
- name: external
port: 9094
type: loadbalancer
tls: true
authentication:
type: tls
configuration:
bootstrap:
loadBalancerIP: {{ .Values.cluster.kafka.bootstrapipaddress }}
brokers:
{{- range $key, $value := (split "," .Values.cluster.kafka.brokersipaddress) }}
- broker: {{ (split "=" .)._0 }}
loadBalancerIP: {{ (split "=" .)._1 | quote }}
{{- end }}
authorization:
type: simple
Cluster is created and up, I am able to create topics and produce/consume to/from topic. The issue is that if I exec into one of Kafka brokers pods I see intermittent errors
INFO [SocketServer brokerId=0] Failed authentication with /10.240.0.35 (SSL handshake failed) (org.apache.kafka.common.network.Selector) [data-plane-kafka-network-thread-0-ListenerName(EXTERNAL-9094)-SSL-9]
INFO [SocketServer brokerId=0] Failed authentication with /10.240.0.159 (SSL handshake failed) (org.apache.kafka.common.network.Selector) [data-plane-kafka-network-thread-0-ListenerName(EXTERNAL-9094)-SSL-11]
INFO [SocketServer brokerId=0] Failed authentication with /10.240.0.4 (SSL handshake failed) (org.apache.kafka.common.network.Selector) [data-plane-kafka-network-thread-0-ListenerName(EXTERNAL-9094)-SSL-10]
INFO [SocketServer brokerId=0] Failed authentication with /10.240.0.128 (SSL handshake failed) (org.apache.kafka.common.network.Selector) [data-plane-kafka-network-thread-0-ListenerName(EXTERNAL-9094)-SSL-1]
After inspecting these IPs [10.240.0.35, 10.240.0.159, 10.240.0.4,10.240.0.128] I figured out the all they are related to pods from kube-system namespace which are implicitly created as part of Kafka cluster deployment.
Any idea what can be wrong?