
I currently have the following PromQL query, which lets me query the memory used by each of my K8S pods:

sum(container_memory_working_set_bytes{image!="",name=~"^k8s_.*"}) by (pod_name)

The pod's name is followed by a hash defined by K8S:

weave-net-kxpxc
weave-net-jjkki
weave-net-asdkk

These all belong to the same app: weave-net

What I would like is to aggregate the memory of all pods that belong to the same app.

So the query would sum the memory of all weave-net pods and report the total under a single name, weave-net, so that the result would be:

{pod_name="weave-net"}            10

instead of

{pod_name="weave-net-kxpxc"}       5
{pod_name="weave-net-jjkki"}       3
{pod_name="weave-net-asdkk"}       2

Is this possible, and if so, how?

Mornor
  • I have the exact same problem and didn't find a way to do this at query time (I guess it is not possible). However, it is possible to add an additional label and use a regex in the relabel config of Prometheus to get a label to group by. If that is a possible solution for you, I can post an answer on how to do this. – Thomas Böhm Sep 19 '18 at 07:24
  • Hi @ThomasBöhm! I would be happy to see your solution indeed. Personally, I just ended up writing a Python script that filters the results based on a regex. – Mornor Sep 23 '18 at 09:37

2 Answers


You can use label_replace:

sum(label_replace(container_memory_working_set_bytes{image!="",name=~"^k8s_.*"}, "pod_set", "$1", "pod_name", "(.*)-.{5}")) by (pod_set)

This adds a new label (pod_set) whose value is the first capture group ($1) of the regex matched against the pod_name label, i.e. the pod name without its trailing five-character suffix. You then sum over the new pod_set label.
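To make the grouping concrete, here is a minimal Python sketch that imitates what the query does with the three sample pods from the question. It is only an illustration, not how Prometheus implements it: the `label_replace` and `sum_by` helpers below are hypothetical stand-ins written for this example, and the helper only handles the `$1` placeholder. It does reflect two documented behaviours of the real function: the regex is anchored at both ends, and a series whose label does not match is returned unchanged.

```python
import re
from collections import defaultdict

# Sample values taken from the question; label names mirror the query.
samples = [
    ({"pod_name": "weave-net-kxpxc"}, 5),
    ({"pod_name": "weave-net-jjkki"}, 3),
    ({"pod_name": "weave-net-asdkk"}, 2),
]


def label_replace(labels, dst, replacement, src, regex):
    """Hypothetical stand-in for PromQL's label_replace on one series."""
    # PromQL anchors the regex at both ends, hence fullmatch.
    match = re.fullmatch(regex, labels.get(src, ""))
    if match is None:
        return dict(labels)  # no match: the series is returned unchanged
    out = dict(labels)
    # PromQL writes $1 for the first capture group; Python's expand uses \1.
    # Assumption: only $1 is supported in this sketch.
    out[dst] = match.expand(replacement.replace("$1", r"\1"))
    return out


def sum_by(series, label):
    """Hypothetical stand-in for `sum(...) by (label)`."""
    totals = defaultdict(int)
    for labels, value in series:
        # Series without the label end up in the empty-string group.
        totals[labels.get(label, "")] += value
    return dict(totals)


relabelled = [
    (label_replace(labels, "pod_set", "$1", "pod_name", "(.*)-.{5}"), value)
    for labels, value in samples
]
result = sum_by(relabelled, "pod_set")
print(result)  # {'weave-net': 10}
```

Because `.*` is greedy and the suffix must be exactly five characters after the last dash, each `weave-net-xxxxx` name collapses to `weave-net`. A pod whose name does not end in a dash plus five characters would get no pod_set label and would be summed into the empty group.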


shaym2
Iván Alegre

Iván's answer is spot on. label_replace doesn't work for range vectors (and it seems likely to stay that way), but there's a workaround using subqueries:

sum(
  rate(
    label_replace(
      container_memory_working_set_bytes{image!="",name=~"^k8s_.*"},
      "pod_set",
      "$1",
      "pod_name",
      "(.*)-.{5}"
    )[5m:]  # the trailing colon turns this into a subquery, which is what makes it work
  )
) by (pod_set)
James Mason
  • It is better to just swap `rate` and `label_replace` in the query. This eliminates the need for subqueries and improves query performance and accuracy in the general case. – valyala Sep 25 '22 at 09:45