Duplicate metrics with multiple instances of kube-state-metrics

Question

Problem:

Duplicate data when querying from prometheus for metrics from kube-state-metrics.

Sample query and result with 3 instances of kube-state-metrics running:

Query:

kube_pod_container_resource_requests_cpu_cores{namespace="ns-dummy"}

Metrics

kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.142:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.142:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.17:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.17:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.37.171:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.37.171:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}

Observation:

Every metric is coming up Nx when N pods are running for kube-state-metrics. If it's a single pod running, we get the correct info.

Possible solutions:

Scale down to single instance of kube-state-metrics. (Reduced availability is a concern)
Enable sharding. (Solves duplication problem, still less available)

According to the docs, for horizontal scaling we have to pass sharding arguments to the pods.

Shards are zero indexed. So we have to pass the index and total number of shards for each pod.

We are using Helm chart and it is deployed as a deployment.

Questions:

How can we pass different arguments to different pods in this scenario, if its possible?
Should we be worried about availability of the kube-state-metrics considering the self-healing nature of k8s workloads?
When should we really scale it to multiple instances and how?

I don't see an option to add sharding in the current helm chart, so you'd have to modify it. Why would you say sharding reduces availability? It's all stateless anyway, most damage you get is you lack metrics for some parts of the estate if one of the instances is down. — favoretti, Jan 14 '20 at 12:05

score 1 · Accepted Answer · answered Jan 21 '20 at 09:30

You could use a 'self-healing' deployment with only a single replica of kube-state-metric if the container down, the deployment will start a new container. Since kube-state-metric is not focused on the health of the individual kubernetes components. It only will affect you if your cluster is too big and make many objects changes per second.

It is not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods.

For small cluster there's is no problem use in this way, but you really need a high availability monitoring platform I recommend you take a look in this two articles: creating a well designed and highly available monitoring stack for kubernetes and kubernetes monitoring

Duplicate metrics with multiple instances of kube-state-metrics

1 Answers1