How to get cpu and memory usage of nodes/pods in prometheus?

Question

As a beginner, I have tried k9s and 'kubectl top nodes' for CPU and memory usage, and the values match. I then tried the Prometheus UI with 'avg(container_cpu_user_seconds_total{node="dev-node01"})' and 'avg(container_cpu_usage_seconds_total{node="dev-node01"})' for dev-node01, but I can't get matching values. Any help would be appreciated.

valyala · Answer 1 · 2023-01-20T06:35:57.103

2

The following PromQL query returns CPU usage (as the number of used CPU cores) by pod in every namespace:

sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (namespace, pod)

The container!="" filter is needed to exclude cgroup hierarchy metrics, which would otherwise be counted in addition to the per-container series - see this answer for details.

The following query returns memory usage by pod in every namespace:

sum(container_memory_usage_bytes{container!=""}) by (namespace, pod)

The following query returns CPU usage (as the number of used CPU cores) by node:

sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (node)

The following query returns memory usage by node:

sum(container_memory_usage_bytes{container!=""}) by (node)
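When comparing these results with `kubectl top`, the units differ: the CPU queries above return cores and the memory queries return bytes, while `kubectl top` prints CPU in millicores (for example `250m`) and memory in mebibytes (for example `1024Mi`). A minimal Python sketch of the conversion follows; the helper names are hypothetical, and the rounding only approximates whatever `kubectl top` does internally.

```python
def cores_to_millicores(cores: float) -> str:
    """Format a PromQL CPU result (cores) the way kubectl top shows it."""
    return f"{round(cores * 1000)}m"


def bytes_to_mebibytes(num_bytes: float) -> str:
    """Format a PromQL memory result (bytes) as mebibytes."""
    return f"{round(num_bytes / (1024 * 1024))}Mi"


# Hypothetical query results for one node.
print(cores_to_millicores(0.25))        # 0.25 cores
print(bytes_to_mebibytes(1073741824))   # 1 GiB expressed in bytes
```

Converting both sides to the same unit first makes it easier to judge whether any remaining difference comes from the query window rather than from the units.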

edited Jan 20 '23 at 06:35

answered Jun 23 '22 at 18:46

valyala

1

`sum(rate(container_cpu_usage_seconds_total{container!=""})) by (namespace, pod)` does not work because of the rate function. – Yuhang Lin Jan 20 '23 at 05:35
Thanks, fixed queries with `rate()` functions above – valyala Jan 27 '23 at 08:38

score 0 · Answer 2 · answered Feb 05 '21 at 06:37

0

If the metric 'container_cpu_user_seconds_total' returns output, then it should work. I used the same query you mentioned above and it works for me. Check both the Graph and Console tabs in Prometheus.

Please try this

avg(container_cpu_user_seconds_total{node="NODE_NAME"})

Manjul

I am getting different values – Aksahy Awate Feb 05 '21 at 14:56

score 0 · Accepted Answer · answered Feb 09 '21 at 21:32

0

container_cpu_usage_seconds_total is a counter, so you'll want to read up on what that means and how to query a counter.

In this case, you'll likely want to use the rate function, documented here.

rate(v range-vector) calculates the per-second average rate of increase of the time series in the range vector... rate should only be used with counters... Note that when combining rate() with an aggregation operator, always take a rate() first, then aggregate.

An example query for per-pod CPU usage would be:

sum(rate(container_cpu_usage_seconds_total{}[1h])) by (pod_name, namespace)
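To illustrate why a raw counter value cannot be compared with `kubectl top`, here is a simplified Python sketch of what `rate()` computes: the increase of the counter divided by the elapsed time. It is not Prometheus's actual implementation (the real function also extrapolates to the window boundaries), and the function name and sample values are made up for illustration.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Per-second average increase of a counter across the samples.

    Simplified: handles counter resets, but skips the boundary
    extrapolation that Prometheus's rate() performs.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. a container restart),
        # so the new value is treated as the increase since the reset.
        increase += cur - prev if cur >= prev else cur
    return increase / elapsed


# Hypothetical samples of container_cpu_usage_seconds_total, 60 s apart.
samples = [(0.0, 100.0), (60.0, 130.0), (120.0, 160.0)]
print(simple_rate(samples))  # 60 CPU-seconds used over 120 s
```

The counter itself only ever grows, so averaging it (as in the question) yields total CPU-seconds since start, not current usage. Dividing the increase by the elapsed time gives the number of cores in use over the window.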

Also, you may want to check out Kubecost for k8s allocation metrics and APIs.

Niko

How do I check per node? @Niko Kovacevic. I tried but could not match the values with `kubectl top nodes`. – Aksahy Awate Feb 11 '21 at 11:21
You can group by `node` instead of `pod_name, namespace`: `sum(rate(container_cpu_usage_seconds_total{}[1h])) by (node)`. Also note that this differs from running `kubectl top nodes` in that it averages the usage over the given window. But it should be close. – Niko Feb 11 '21 at 23:14
Yeah, I am getting close values. I used `avg` instead of `sum`. Meanwhile I tried Kubecost, and it looks interesting. Thanks for helping out. @Niko Kovacevic – Aksahy Awate Feb 12 '21 at 04:10
