Why is CPU utilization calculated using irate or rate in Prometheus?

Question

I know that CPU utilization is the percentage of non-idle time over the total CPU time. In Prometheus, the rate and irate functions calculate the per-second rate of change of a counter over a range vector.

People often calculate CPU utilization with the following PromQL expression:

(100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100))

I don't understand how calculating the per-second change of non-idle time is equivalent to calculating CPU usage. Can somebody explain mathematically why this makes sense?

score 21 · Accepted Answer · answered Apr 11 '19 at 09:02

There are a couple of things to unpack here.

First, rate vs. irate. Neither the linked question nor the blog post addresses this (although Eitan's answer does touch on it). The difference is that rate estimates the average rate over the requested range (1 minute, in your case), while irate computes the rate from the last 2 samples only. Leaving aside the "estimate" part (see this answer if you're curious), the practical difference between the two is that rate smooths out the result, whereas irate returns a sampling of CPU usage, which is more likely to show extremes in CPU usage but is also more prone to aliasing.
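To make the difference concrete, here is a minimal sketch of the two calculations. The helpers `simple_rate` and `simple_irate` are hypothetical simplifications, not Prometheus' actual implementation: they ignore counter resets and the boundary extrapolation that the real rate performs. The sample values are made up.

```python
# Simplified models of PromQL rate() and irate() over (timestamp, value) samples.
# Assumption: no counter resets and no boundary extrapolation, unlike real Prometheus.

def simple_rate(samples):
    """Average per-second increase between the first and last sample in the window."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)

def simple_irate(samples):
    """Per-second increase between the last two samples in the window."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)

# Hypothetical idle-seconds counter scraped every 15s over one minute:
# the CPU is fully idle for 45s, then only 20% idle during the final 15s.
window = [(0, 100.0), (15, 115.0), (30, 130.0), (45, 145.0), (60, 148.0)]

avg_idle = simple_rate(window)    # (148 - 100) / 60: smoothed over the whole minute
last_idle = simple_irate(window)  # (148 - 145) / 15: only the most recent interval
```

With these numbers the averaged idle ratio comes out around 0.8 while the last-two-samples ratio is around 0.2, which is the smoothing vs. sampling trade-off described above.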

For example, if you look at Prometheus' own CPU usage, you'll notice that it sits at a roughly constant baseline, with a spike every time a large rule group is evaluated. If you used rate over a time range at least as long as Prometheus' evaluation interval, you'd get a more or less constant CPU usage over time (i.e. a flat line). With irate (assuming a scrape interval of 5s) you'd get one of 2 things:

if your resolution (i.e. step) was not aligned with Prometheus' evaluation interval (e.g. the resolution was 1m and the evaluation interval was 13s) you'd get a random sampling of CPU usage and would hopefully see values close to both the highest and lowest CPU usage over time on a graph;
if your resolution was aligned with Prometheus' evaluation interval (e.g. 1m resolution and 15s evaluation interval) then you'd either see the baseline CPU usage everywhere (because you happen to look at 5s intervals set 1 minute apart, when no rule evaluation happens) or the peak CPU usage everywhere (because you happen to look at 5s intervals 1 minute apart that each cover a rule evaluation).
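The two cases can be reproduced with a toy model. Everything here is an assumption made for illustration: query timestamps coincide with scrape timestamps, a rule evaluation only affects the 5s scrape interval it falls into, and the usage levels `BASELINE` and `PEAK` as well as the `offset` values are invented.

```python
SCRAPE = 5                 # scrape interval in seconds (as in the example above)
STEP = 60                  # graph resolution in seconds
BASELINE, PEAK = 0.1, 0.9  # hypothetical CPU usage without / with a rule evaluation

def irate_usage(t, eval_interval, offset):
    """Usage seen by irate at query time t, i.e. over the scrape interval (t - SCRAPE, t]."""
    k = (t - offset) // eval_interval       # index of the latest evaluation at or before t
    last_eval = offset + k * eval_interval  # its timestamp
    return PEAK if k >= 0 and last_eval > t - SCRAPE else BASELINE

def graph(eval_interval, offset, points=10):
    """One irate value per graph step."""
    return [irate_usage(i * STEP, eval_interval, offset) for i in range(1, points + 1)]

aligned_low = graph(eval_interval=15, offset=2)    # evaluations never land in a sampled interval
aligned_high = graph(eval_interval=15, offset=12)  # evaluations always land in a sampled interval
unaligned = graph(eval_interval=13, offset=2)      # the phase drifts, so both levels show up
```

In this model the aligned graphs are flat (all baseline or all peak, depending only on the phase), while the unaligned one alternates between the two levels.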

Second, regarding the apparent confusion over what the node_cpu_seconds_total metric represents: it is a counter, meaning a number that only ever increases. With mode="idle", it essentially measures the amount of time the CPU has been idle since the machine booted. The absolute value is not all that useful (it depends on when the machine booted and drops to 0 on every reboot). What's interesting about it is by how much it increased over a period of time: from that you can compute, for a given period, a per-second rate of increase (averaged, with rate; instantaneous, with irate) or an absolute increase (with increase). So both rate(node_cpu_seconds_total{mode="idle"}[1m]) and irate(node_cpu_seconds_total{mode="idle"}[1m]) give you a ratio (between 0.0 and 1.0) of how much the CPU was idle (over the past minute and between the last 2 samples, respectively).
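As a rough illustration of the arithmetic behind the expression in the question, the sketch below recomputes it by hand for a made-up 2-core instance. The counter readings are hypothetical, and the rate is taken as a plain difference quotient rather than Prometheus' extrapolated estimate.

```python
# Toy re-computation of:
#   100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)
# Hypothetical idle-seconds counters for a 2-core instance, read 60s apart.
WINDOW = 60.0  # seconds between the two readings

idle_counters = {
    "cpu0": (5000.0, 5054.0),  # idle for 54 of the 60 seconds
    "cpu1": (7000.0, 7030.0),  # idle for 30 of the 60 seconds
}

# rate(): per-second increase of each counter, i.e. idle seconds per second.
idle_ratio = {cpu: (end - start) / WINDOW for cpu, (start, end) in idle_counters.items()}

# avg by (instance): mean idle ratio across the cores of the instance.
avg_idle = sum(idle_ratio.values()) / len(idle_ratio)

# 100 minus the idle percentage is the utilization percentage.
utilization_pct = 100 - avg_idle * 100
```

Here the cores are idle 90% and 50% of the time, so the average idle ratio is about 0.7 and the utilization is about 30%. The units explain why this works: seconds of idle time divided by seconds of wall-clock time is a dimensionless fraction.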

Thanks for pointing out that this value is a counter. I just realized that `node_cpu_seconds_total{mode="idle"}` is supposed to increment by 1 every second if the CPU is fully idle in that interval. So the rate of increase in the value would directly reflect the average "seconds" spent on idle per second. — lxjhk, Apr 12 '19 at 17:15
I wrote [an article](https://valyala.medium.com/why-irate-from-prometheus-doesnt-capture-spikes-45f9896d7832) explaining why `irate()` shouldn't be used in most cases, since it doesn't capture spikes. The article proposes a better solution for capturing all the spikes: the `rollup_rate()` function from [MetricsQL](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/MetricsQL). — valyala, Dec 05 '20 at 23:15

score 0 · Answer 2 · answered Apr 07 '19 at 06:59

It looks like this has already been answered here: Prometheus - Convert cpu_user_seconds to CPU Usage %? The link provided in the answers, https://www.robustperception.io/understanding-machine-cpu-usage, gives the explanation. Personally, I think irate makes more sense in this context, as it shows the rate between the most recent samples (vs. rate, which averages over the entire sampled time range).

Eitan


I saw that post but I am not sure if answers there address my question – lxjhk Apr 07 '19 at 07:39
"As these values always sum to one second per second for each cpu, the per-second rates are also the ratios of usage. We can use this to calculate the percentage of CPU used, by subtracting the idle usage from 100%:". – Eitan Apr 07 '19 at 07:55
Could you explain the logic here: these values always sum to one second per second for each cpu => the per-second rates are also the ratios of usage. I thought the value itself is the ratio of usage? – lxjhk Apr 09 '19 at 00:32
The rates are per-second, but the query averages them over 1 minute. The value itself is not a rate but a running total (as in "node_cpu_seconds_total"). The query gives you the average rate over a minute. – Eitan Apr 10 '19 at 05:51
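The "one second per second" statement quoted in this thread can be checked with a small sketch. The counter values are hypothetical and describe a single core; only four modes are shown for brevity, whereas a real node_cpu_seconds_total typically exposes more.

```python
# Hypothetical per-mode counters (in seconds) for ONE core at two scrapes 60s apart.
INTERVAL = 60.0

before = {"idle": 9000.0, "user": 600.0, "system": 300.0, "iowait": 100.0}
after = {"idle": 9045.0, "user": 609.0, "system": 304.5, "iowait": 101.5}

# Per-second rate of each mode: seconds spent in that mode per wall-clock second.
rates = {mode: (after[mode] - before[mode]) / INTERVAL for mode in before}

# Every wall-clock second is attributed to exactly one mode, so the rates sum to ~1.
total = sum(rates.values())

# Hence each rate is already a fraction of the core's time, and the busy
# fraction is simply 1 minus the idle rate.
busy = 1.0 - rates["idle"]
non_idle = sum(rate for mode, rate in rates.items() if mode != "idle")
```

In this example the idle rate is about 0.75, so the core was busy about 25% of the time, which matches the sum of the non-idle rates.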
