
I am looking at this article. It mentions the query below, together with the following description.

It is even more interesting monitoring the error rate vs the total amount of requests, this way you get the magnitude of the errors for the Kubernetes API server requests.

sum(rate(apiserver_request_total{job="kubernetes-apiservers",code=~"[45].."}[5m]))*100/sum(rate(apiserver_request_total{job="kubernetes-apiservers"}[5m]))

What I don't understand is why we can't just compute the query below instead. Moreover, what is the purpose of applying the sum function after rate? My understanding is that rate gives the average change per second, so why sum those changes up? Can I get help understanding the above query? A concrete example would be great. Thanks in advance!

count(apiserver_request_total{job="kubernetes-apiservers",code=~"[45].."})/count(apiserver_request_total{job="kubernetes-apiservers"})
bunny
1 Answer


rate calculates a per-second estimate of the increase of your counter. sum then aggregates all the time series produced by rate into one.
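For illustration (the instances and numbers below are made up, not real data), suppose the error counter from your query is exported by three apiserver instances:

rate(apiserver_request_total{job="kubernetes-apiservers",code=~"[45].."}[5m])
# hypothetical result: one per-second error rate per series, e.g.
#   {instance="10.0.0.1:6443", code="500"}  0.2
#   {instance="10.0.0.2:6443", code="503"}  0.1
#   {instance="10.0.0.3:6443", code="429"}  0.3

sum(rate(apiserver_request_total{job="kubernetes-apiservers",code=~"[45].."}[5m]))
# hypothetical result: a single series, the cluster-wide error rate
#   {}  0.6

So sum does not add up changes over time; at each evaluation timestamp it adds up the per-second rates of the individual series, giving one cluster-wide rate. Dividing that by the same sum over all requests (and multiplying by 100, as in the article's query) gives the percentage of requests that failed.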

Consider this demo. Notice how rate produces eight time series (the same as the number of "input" series), while sum produces only one.

Please also note the difference between sum and sum_over_time, as it is a frequent source of confusion.
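Roughly (this is just a sketch, not part of the original answer): sum aggregates across series at one instant, while sum_over_time aggregates across time within each series:

sum(apiserver_request_total{job="kubernetes-apiservers"})
# one output series: adds together the current sample of every matching series

sum_over_time(apiserver_request_total{job="kubernetes-apiservers"}[5m])
# one output series per input series: adds up each series' own samples from the last 5 minutes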


count returns the number of time series matched by the inner query. Its result is more or less constant over time.

For the demo linked above, count will always return vector(8), as there are always 8 modes of memory.
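That is also why the count-based expression from your question doesn't give an error percentage: it compares numbers of series, not numbers of requests. A sketch with assumed numbers:

count(apiserver_request_total{job="kubernetes-apiservers",code=~"[45].."})
# hypothetical: 12 (there happen to be 12 series whose code label is 4xx/5xx)

count(apiserver_request_total{job="kubernetes-apiservers"})
# hypothetical: 150 (one series per code/verb/resource/instance label combination)

# 12/150 = 0.08 whether those error series are growing by 1 request/s or by
# 1000 requests/s, so it tells you nothing about the actual error rate.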

markalex