I want to monitor my Kafka cluster, which is deployed using the Strimzi operator. I deployed the Prometheus operator alongside a few basic monitoring tools using the community chart https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack, then followed the Strimzi documentation https://strimzi.io/docs/operators/latest/deploying.html#assembly-metrics-str to configure the Kafka exporter and expose the other metrics. The Strimzi examples include the pod monitors cluster-operator-metrics, entity-operator-metrics and kafka-resources-metrics. Everything is discovered and scraped successfully except one pod monitor, cluster-operator-metrics, which does not even show up in the Prometheus instance.
I have the following architecture:
- My Strimzi operator is deployed into the strimzi-system namespace and watches namespace A
- Kafka brokers, exporters and the entity operator are deployed into namespace A
- The kube-prometheus-stack chart (with the Prometheus operator) is deployed into the monitoring namespace
- I have deployed a second Prometheus instance into namespace A to configure a different retention and collect metrics in another place. This Prometheus instance successfully discovered targets for entity-operator-metrics and kafka-resources-metrics
- The entity-operator-metrics and kafka-resources-metrics pod monitors are deployed to namespace A
- cluster-operator-metrics is deployed to the strimzi-system namespace
I want to scrape metrics for cluster-operator-metrics, deployed in the strimzi-system namespace, with the Prometheus instance deployed in the monitoring namespace.
Here are the manifests I'm using:
Prometheus instance deployed to the monitoring namespace
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  annotations:
    meta.helm.sh/release-name: prometheus
    meta.helm.sh/release-namespace: monitoring
  creationTimestamp: "2023-02-06T10:29:49Z"
  generation: 7
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
    app.kubernetes.io/version: 45.0.0
    chart: kube-prometheus-stack-45.0.0
    heritage: Helm
    release: prometheus
  name: prometheus-kube-prometheus-prometheus
  namespace: monitoring
  resourceVersion: "8888421"
  uid: 4fbc11f6-5547-45de-ad77-40fbd37d5e1d
spec:
  alerting:
    alertmanagers:
    - apiVersion: v2
      name: prometheus-kube-prometheus-alertmanager
      namespace: monitoring
      pathPrefix: /
      port: http-web
  enableAdminAPI: false
  evaluationInterval: 30s
  externalUrl: http://prometheus-kube-prometheus-prometheus.monitoring:9090
  hostNetwork: false
  image: quay.io/prometheus/prometheus:v2.42.0
  listenLocal: false
  logFormat: logfmt
  logLevel: info
  paused: false
  podMonitorNamespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: In
      values:
      - monitoring
      - strimzi-system
  podMonitorSelector: {}
  portName: http-web
  probeNamespaceSelector: {}
  probeSelector:
    matchLabels:
      release: prometheus
  replicas: 1
  retention: 10d
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector:
    matchLabels:
      release: prometheus
  scrapeInterval: 30s
  securityContext:
    fsGroup: 2000
    runAsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-kube-prometheus-prometheus
  serviceMonitorNamespaceSelector:
    matchLabels:
      kubernetes.io/metadata.name: monitoring
  serviceMonitorSelector: {}
  shards: 1
  version: v2.42.0
  walCompression: true
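The podMonitorNamespaceSelector in this manifest matches namespaces by the kubernetes.io/metadata.name label, so strimzi-system is only selected if its Namespace object actually carries that label. Recent Kubernetes versions set this label automatically. The fragment below is a sketch of what the namespace metadata is expected to look like, not output taken from this cluster:

```yaml
# Sketch only: expected shape of the Namespace object, not captured from the cluster.
apiVersion: v1
kind: Namespace
metadata:
  name: strimzi-system
  labels:
    # Label the podMonitorNamespaceSelector matchExpressions entry is keyed on
    kubernetes.io/metadata.name: strimzi-system
```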
Prometheus instance deployed to namespace A
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  creationTimestamp: "2023-02-06T11:23:09Z"
  generation: 7
  name: kafka-prometheus
  namespace: A
  resourceVersion: "8888395"
  uid: 43ab949d-4b1f-4ad2-843c-5a22e7ecc1a1
spec:
  additionalScrapeConfigs:
    key: prometheus-additional.yaml
    name: additional-scrape-configs
  enableAdminAPI: false
  evaluationInterval: 30s
  podMonitorSelector:
    matchLabels:
      app: strimzi
  replicas: 1
  resources:
    requests:
      memory: 400Mi
  retention: 30d
  scrapeInterval: 30s
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: kafka-prometheus-server
  serviceMonitorSelector: {}
  storage:
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 10Gi
        storageClassName: standard
PodMonitor for entity-operator-metrics in namespace A
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  creationTimestamp: "2023-02-06T11:56:15Z"
  generation: 1
  labels:
    app: strimzi
  name: kafka-entity-operator-metrics
  namespace: A
  resourceVersion: "4290222"
  uid: f1d14d38-e1ad-404f-94ba-7ff294923608
spec:
  namespaceSelector:
    matchNames:
    - A
  podMetricsEndpoints:
  - path: /metrics
    port: healthcheck
  selector:
    matchLabels:
      app.kubernetes.io/name: entity-operator
PodMonitor for kafka-resource-metrics in namespace A
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  creationTimestamp: "2023-02-06T11:56:15Z"
  generation: 1
  labels:
    app: strimzi
  name: kafka-resource-metrics
  namespace: A
  resourceVersion: "4290228"
  uid: 48a4e919-51ab-433d-9dbe-8f0352d7341d
spec:
  namespaceSelector:
    matchNames:
    - A
  podMetricsEndpoints:
  - path: /metrics
    port: tcp-prometheus
    relabelings:
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(strimzi_io_.+)
      replacement: $1
      separator: ;
    - action: replace
      regex: (.*)
      replacement: $1
      separator: ;
      sourceLabels:
      - __meta_kubernetes_namespace
      targetLabel: namespace
    - action: replace
      regex: (.*)
      replacement: $1
      separator: ;
      sourceLabels:
      - __meta_kubernetes_pod_name
      targetLabel: kubernetes_pod_name
    - action: replace
      regex: (.*)
      replacement: $1
      separator: ;
      sourceLabels:
      - __meta_kubernetes_pod_node_name
      targetLabel: node_name
    - action: replace
      regex: (.*)
      replacement: $1
      separator: ;
      sourceLabels:
      - __meta_kubernetes_pod_host_ip
      targetLabel: node_ip
  selector:
    matchExpressions:
    - key: strimzi.io/kind
      operator: In
      values:
      - Kafka
      - KafkaConnect
PodMonitor for cluster-operator-metrics in namespace strimzi-system
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  creationTimestamp: "2023-02-12T14:30:33Z"
  generation: 1
  labels:
    app: strimzi
  name: kafka-cluster-operator-metrics
  namespace: strimzi-system
  resourceVersion: "8255294"
  uid: 0986ba6a-dc83-4000-91b4-efc6f16462e4
spec:
  namespaceSelector:
    matchNames:
    - strimzi-system
  podMetricsEndpoints:
  - path: /metrics
    port: healthcheck
  selector:
    matchLabels:
      strimzi.io/kind: cluster-operator
What I have tried:
- I have checked that the Strimzi cluster operator pod has the target label strimzi.io/kind: cluster-operator. Here is part of the Strimzi operator pod manifest:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2023-01-31T16:22:36Z"
  generateName: strimzi-cluster-operator-77f9c84bc-
  labels:
    name: strimzi-cluster-operator
    pod-template-hash: 77f9c84bc
    strimzi.io/kind: cluster-operator
  name: strimzi-cluster-operator-77f9c84bc-h9895
  namespace: strimzi-system
- I have also checked that the Prometheus service account deployed in the monitoring namespace has a cluster role with permission to access the metrics endpoint:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    meta.helm.sh/release-name: prometheus
    meta.helm.sh/release-namespace: monitoring
  creationTimestamp: "2023-02-06T10:29:46Z"
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/instance: prometheus
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: kube-prometheus-stack
    app.kubernetes.io/version: 45.0.0
    chart: kube-prometheus-stack-45.0.0
    heritage: Helm
    release: prometheus
  name: prometheus-kube-prometheus-prometheus
  resourceVersion: "8223589"
  uid: f13d2445-b185-4b75-aa92-ba97b7393da0
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  - /metrics/cadvisor
  verbs:
  - get
- I verified that the Strimzi operator pod returns metrics on the endpoint GET /metrics
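- I tried deploying the PodMonitor into the monitoring and A namespaces
- I tried disabling all restrictions on pod monitor discovery in the Prometheus instance in the monitoring namespace. It discovered the 2 other pod monitors, which had already worked before, but still didn't find the Strimzi cluster operator pod monitor. I followed this answer: prometheus operator - enable monitoring for everything in all namespaces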
. It discovered 2 other pod monitors which is worked before, but still didn't find strimzi cluster operator pod monitor. Followed by this answer prometheus operator - enable monitoring for everything in all namespaces