I am trying to write a sum and group by query in Prometheus on a metric whose label values are too unique for my sum and group by requirements.

I have a metric that samples the sizes of Elasticsearch indices, where the index name is attached to the metric as a label. The indices are named like this, and the name is placed in the label "index":

project.<projectname>.<uniqueid>.<date>

With concrete values, that would look like this:

project.sample-x.ad19f880-2f16-11e7-8a64-jkzdfaskdfjk.2018.03.12

project.sample-y.jkcjdjdk-1234-11e7-kdjd-005056bf2fbf.2018.03.12

project.sample-x.ueruwuhd-dsfg-11e7-8a64-kdfjkjdjdjkk.2018.03.11

project.sample-y.jksdjkfs-2f16-11e7-3454-005056bf2fbf.2018.03.11

So if I had the short version of the values in the "index" label, I would just do:

sum(metric) by (index)

but what I am trying to do is something like this:

sum(metric) by ("project.<projectname>")

where I can group by a substring of the "index" label. How can this be done with a Prometheus query? I assume this could be solved using label_replace as part of the grouping, but I just can't see how to "truncate" the label value to achieve this.

Best regards

Lars Milland

  • see https://stackoverflow.com/a/45163502/1062129 for a possible way to do this with `label_replace` – George Oct 17 '19 at 11:21

3 Answers

While it'd be best to fix the metrics, the next best thing is metric_relabel_configs, applied with the same technique as in this blog post:

  metric_relabel_configs:
  - source_labels: [index]
    regex: 'project\.([^.]*)\..*'
    replacement: '${1}'
    target_label: project

You will then have a project label that you can use as usual.
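
To see what the rule would write into the project label before changing the scrape configuration, the regex can be tried offline. This is a minimal Python sketch, not Prometheus code: it assumes that `re.fullmatch` behaves like Prometheus's fully anchored matching for this simple pattern, and the helper name `project_label` is made up for illustration.

```python
import re

# Regex from the metric_relabel_configs rule above. Prometheus anchors it
# at both ends, which re.fullmatch imitates here.
RULE = re.compile(r"project\.([^.]*)\..*")


def project_label(index):
    """Return the value the rule would write to `project`, or None if no match."""
    match = RULE.fullmatch(index)
    return match.group(1) if match else None


print(project_label("project.sample-x.ad19f880-2f16-11e7-8a64-jkzdfaskdfjk.2018.03.12"))
print(project_label("some-other-index"))
```

For the first sample index this gives sample-x, so the label holds only the project name without the "project." prefix. An index that does not follow the naming scheme gets no project label.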

brian-brazil

It is possible to use the label_replace() function to extract the needed part of a label into a separate label, and then group by that new label when summing the results. For example, the following query extracts project.sample-y from the value project.sample-y.jksdjkfs-2f16-11e7-3454-005056bf2fbf.2018.03.11 stored in the index label and puts the extracted value into the project_name label. The sum() is then grouped by the project_name label values:

sum(
  label_replace(metric, "project_name", "$1", "index", "(project[.][^.]+).+")
) by (project_name)
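
To check what the regular expression captures before running the query, the extraction and grouping can be imitated offline. The following is a minimal Python sketch, not Prometheus code: it assumes that `re.fullmatch` behaves like Prometheus's fully anchored matching for this simple pattern, the metric values are made up, and the helper name `sum_by_project` is hypothetical.

```python
import re
from collections import defaultdict

# Same pattern as in the label_replace() call above. Prometheus anchors
# regexes at both ends, which re.fullmatch imitates here.
PATTERN = re.compile(r"(project[.][^.]+).+")

# Index label values from the question, paired with made-up metric values.
SAMPLES = {
    "project.sample-x.ad19f880-2f16-11e7-8a64-jkzdfaskdfjk.2018.03.12": 10,
    "project.sample-y.jkcjdjdk-1234-11e7-kdjd-005056bf2fbf.2018.03.12": 20,
    "project.sample-x.ueruwuhd-dsfg-11e7-8a64-kdfjkjdjdjkk.2018.03.11": 30,
    "project.sample-y.jksdjkfs-2f16-11e7-3454-005056bf2fbf.2018.03.11": 40,
}


def sum_by_project(samples):
    """Group values by the captured prefix, like sum(...) by (project_name)."""
    totals = defaultdict(int)
    for index, value in samples.items():
        match = PATTERN.fullmatch(index)
        # label_replace() returns a series unchanged when the regex does not
        # match, so such series are grouped under an empty project_name.
        project_name = match.group(1) if match else ""
        totals[project_name] += value
    return dict(totals)


print(sum_by_project(SAMPLES))
```

With these made-up values, the two sample-x indices are summed under project.sample-x and the two sample-y indices under project.sample-y.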
valyala

It looks as if the index label naming convention is wrong. These constituents should clearly be separate metric labels instead. So, instead of project.<projectname>.<uniqueid>.<date> you should be storing timeseries with labels as in:

project{projectname="xxx", uniqueid="yyy"}

As for the date part, I assume this is already covered by the sample timestamps themselves? There are plenty of functions like timestamp(), month(), day_of_month() etc. to work with the timestamp.

So you have two options:

  • fix the metrics exporter to separate the labels instead of concatenating this information into the metric name itself,
  • OR, if you can't alter the exporter, use metric relabeling to convert the index label into new time series with the label set I described above. You can then use standard Prometheus functions as you described. See this article for an example of how you could do this.
ekarak
  • I am not in control of the metric exporter, and I would also prefer not to do this as a relabeling step when Prometheus scrapes the metrics. I am looking for a solution with a Prometheus query. But as the question states, I simply can't figure out how to drop parts of the "index" label so that I can group by only the first parts and not the full label value. So if anyone knows how to trim/truncate the label as part of a query, and then use the trimmed value in a group, please advise. – Lars Milland Mar 17 '18 at 11:37
  • Try using the `__name__` hidden label. Something along the lines of: `sum({__name__=~"project.[a|b|c]*"})` – ekarak Mar 17 '18 at 12:14
  • Maybe I am explaining my problem wrong. It is not the name of the metric that I want to truncate parts of and group by. It is a label on the metric. – Lars Milland Mar 17 '18 at 12:27
  • So what I thought would work is something like: sum(metric) by (label_replace(....)). I just don't know how to use the label_replace function inside the grouping part and truncate the "index" label using some smart regular expression. – Lars Milland Mar 17 '18 at 12:35