33

I have Prometheus scraping metrics from node exporters on several machines with a config like this:

scrape_configs:
  - job_name: node_exporter
    static_configs:
      - targets:
        - 1.2.3.4:9100
        - 2.3.4.5:9100
        - 3.4.5.6:9100

When viewed in Grafana, these instances are assigned rather meaningless IP addresses; instead, I would prefer to see their hostnames. I think you should be able to relabel the instance label to match the hostname of a node, so I tried using relabelling rules like this, to no effect whatsoever:

relabel_configs:
  - source_labels: ['nodename']
    target_label: 'instance'

I can manually relabel every target, but that requires hardcoding every hostname into Prometheus, which is not really nice. I see that the node exporter provides the metric node_uname_info that contains the hostname, but how do I extract it from there?

node_uname_info{domainname="(none)",machine="x86_64",nodename="myhostname",release="4.13.0-32-generic",sysname="Linux",version="..."} 1
aSemy
Norrius

10 Answers

24

I just came across this problem, and the solution is to use group_left. You can't relabel using a value that doesn't exist at scrape time; you are limited to the parameters you gave to Prometheus, or those that exist in the service-discovery module used for the request (GCP, AWS, ...).

So the solution I used is to combine an existing value containing what we want (the hostname) with a metric from the node exporter. The answer exists inside the node_uname_info metric, which contains the nodename value.

I used the answer to this post as a model for my query: https://stackoverflow.com/a/50357418

The solution is this one:

node_memory_Active_bytes
  * on(instance) group_left(nodename)
node_uname_info

With this, the node_memory_Active_bytes metric, which by default contains only the instance and job labels, gets an additional nodename label that you can use in the description field of Grafana.
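For intuition, the join above works like a database lookup: for each node_memory_Active_bytes series, Prometheus finds the node_uname_info series with the same instance label and copies over its nodename. A rough Python sketch of that matching logic (toy sample series, not real Prometheus internals):

```python
# Toy model of `metric * on(instance) group_left(nodename) info`:
# for each sample on the left, look up the matching info series by
# `instance` and copy the requested label onto the result.
memory = [
    {"labels": {"instance": "1.2.3.4:9100", "job": "node_exporter"}, "value": 1.5e9},
    {"labels": {"instance": "2.3.4.5:9100", "job": "node_exporter"}, "value": 2.0e9},
]
uname_info = [
    {"labels": {"instance": "1.2.3.4:9100", "nodename": "web01"}, "value": 1},
    {"labels": {"instance": "2.3.4.5:9100", "nodename": "db01"}, "value": 1},
]

def group_left(left, info, on="instance", extra="nodename"):
    lookup = {s["labels"][on]: s for s in info}
    result = []
    for s in left:
        match = lookup[s["labels"][on]]
        labels = dict(s["labels"], **{extra: match["labels"][extra]})
        # The info metric's value is always 1, so multiplying is a no-op.
        result.append({"labels": labels, "value": s["value"] * match["value"]})
    return result

for s in group_left(memory, uname_info):
    print(s["labels"]["nodename"], s["value"])
```

Since node_uname_info always has the value 1, the multiplication leaves the left-hand values untouched; only the label set changes.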

Hope that this will help others.

valyala
night-gold
15

This solution stores data at scrape time with the desired labels; no need for funny PromQL queries or hardcoded hacks. It does so by rewriting the labels of scraped data with regexes via relabel_configs.

By default, instance is set to __address__, which is $host:$port.

First attempt: In order to set the instance label to $host, one can use relabel_configs to get rid of the port of your scraping target:

  - job_name: 'whatever'
    static_configs:
      - targets: [
            'yourhost.lol:9001'
        ]
    relabel_configs:
      - source_labels: [__address__]
        target_label: instance
        regex: '([^:]+)(:[0-9]+)?'
        replacement: '${1}'

But the above would also overwrite labels you wanted to set e.g. in the file_sd_configs:

[
    {
        "targets": ['yourhost.lol:9001'],
        "labels": {
            "instance": 'node42'
        }
    }
]

Solution: If you want to retain these labels, relabel_configs can rewrite the label in multiple steps, like this:

  - job_name: 'whatever'
    metrics_path: /metric/rolf
    file_sd_configs:
      - files:
        - rolf_exporter_targets.yml
    relabel_configs:
      - source_labels: [instance]
        target_label: __tmp_instance
        regex: '(.+)'
        replacement: '${1};'
      - source_labels: [__tmp_instance, __address__]
        separator: ''
        target_label: instance
        regex: '([^:;]+)((:[0-9]+)?|;(.*))'
        replacement: '${1}'

Doing it like this, the manually-set instance in sd_configs takes precedence, but if it's not set the port is still stripped away.
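To sanity-check those two rules, here is a small Python sketch (not Prometheus code; the hostnames are made up) of how they resolve. Prometheus anchors relabel regexes, so `re.fullmatch` models them:

```python
import re

# Anchored, like Prometheus relabel regexes.
STEP2_RE = re.compile(r'([^:;]+)((:[0-9]+)?|;(.*))')

def resolve_instance(address, manual_instance=""):
    # Step 1: stash a manually set `instance` as "value;" in __tmp_instance.
    # The regex '(.+)' does not match an empty string, so an unset
    # instance leaves __tmp_instance empty.
    tmp = manual_instance + ";" if re.fullmatch(r'(.+)', manual_instance) else ""
    # Step 2: concatenate __tmp_instance and __address__ (separator: '')
    # and keep everything up to the first ':' or ';'.
    return STEP2_RE.fullmatch(tmp + address).group(1)

print(resolve_instance("yourhost.lol:9001"))       # -> 'yourhost.lol'
print(resolve_instance("1.2.3.4:9100", "node42"))  # -> 'node42'
```

When a manual instance exists, the `;(.*)` branch of the alternation swallows the whole address; otherwise the `(:[0-9]+)?` branch just strips the port.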

TheJJ
  • I'm not sure if that's helpful. My target configuration was via IP addresses (`1.2.3.4`), not hostnames (`yourhost.lol`). – Norrius Aug 15 '20 at 11:02
  • it should work with hostnames and ips, since the replacement regex would split at `:` in both cases. I would rather argue that the accepted answer with a promql solution doesn't do any relabeling, but keeps all labels intact, just displays the data differently. – TheJJ Oct 03 '20 at 21:59
  • This was helpful. I was trying to use `sourceLabels: ["instance"]` but I should have been using `["__address__"]` when doing relabels for my ServiceMonitor. I guess instance get's added later? – SlyGuy Dec 06 '22 at 03:43
  • Doesn't that fail when a manually set `instance` contains a `;`? – calestyo Mar 08 '23 at 02:19
  • @TheJJ .. that, AFAICS, works in all cases: https://pastebin.com/KyrXmx9s – calestyo Mar 08 '23 at 13:06
  • wouldn't be simpler to define another scrape job that just scrapes the hostname (and whatever else you want) from the host? – mike01010 May 13 '23 at 22:10
10

I found a hardcoded solution:

global:
  scrape_interval: 5s
  scrape_timeout: 5s
  external_labels:
    monitor: 'Prometheus'

scrape_configs:

  - job_name: 'shelby'
    static_configs:
    - targets:
      - 10.100.0.01:9100
    relabel_configs:
      - source_labels: [__address__]
        regex: '.*'
        target_label: instance
        replacement: 'shelby'
  
  - job_name: 'camaro'
    static_configs:
    - targets:
      - 10.101.0.02:9100
    relabel_configs:
      - source_labels: [__address__]
        regex: '.*'
        target_label: instance
        replacement: 'camaro'
  
  - job_name: 'verona'
    static_configs:
    - targets:
      - 10.101.0.03:9100
    relabel_configs:
      - source_labels: [__address__]
        regex: '.*'
        target_label: instance
        replacement: 'verona'

Result:


    node_load15{instance="camaro",job="camaro"}    0.16
    node_load15{instance="shelby",job="shelby"}    0.4
    node_load15{instance="verona",job="verona"}    0.07

aSemy
vitams
4

You don't have to hardcode it, nor is joining two labels necessary. You can place all the logic in the targets section using some separator -- I used @ -- and then process it with a regex. Please find below an example from another exporter (blackbox), but the same logic applies to the node exporter as well. In your case, just include the list items where:

  • target_label: app_ip
  • target_label: instance
  - job_name: 'blackbox'
    metrics_path: '/probe'
    scrape_interval: 15s
    params:
      module: [ http_2xx ]
    static_configs:
      - targets:
        - "1.2.3.4:8080@JupyterHub"
        - "1.2.3.5:9995@Zeppelin"
        - "1.2.3.6:8080@Airflow UI"
    relabel_configs:
      - source_labels: [ __address__ ]
        regex: '(.*)@.*'
        replacement: $1
        target_label: __param_target
      - source_labels: [ __address__ ]
        regex: '(.*)@.*'
        replacement: $1
        target_label: app_ip
      - source_labels: [ __address__ ]
        regex: '.*@(.*)'
        replacement: $1
        target_label: instance
      - target_label: __address__
        replacement: '{{ blackbox_exporter_host }}:{{ blackbox_exporter_port }}'
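The two extraction rules can be checked in isolation; a small Python sketch of how the anchored regexes split one of the targets above:

```python
import re

# Split a target like "1.2.3.4:8080@JupyterHub" the way the relabel
# rules above do (Prometheus anchors relabel regexes, hence fullmatch).
target = "1.2.3.4:8080@JupyterHub"

# '(.*)@.*' keeps the part before the '@' (host:port).
app_ip = re.fullmatch(r'(.*)@.*', target).group(1)     # -> '1.2.3.4:8080'

# '.*@(.*)' keeps the part after the '@' (the display name).
instance = re.fullmatch(r'.*@(.*)', target).group(1)   # -> 'JupyterHub'

print(app_ip, instance)
```

Note that the host:port part becomes both __param_target (what blackbox probes) and app_ip, while the human-readable name after the @ becomes instance.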
aSemy
michalrudko
  • For redis we use targets like described in https://github.com/oliver006/redis_exporter/issues/623. Ex `- targets: ["redis://168.x.x.x:17539","redis://169.x.x.x:17539")` How do we make it work there ? – Khurshid Alam Mar 28 '22 at 15:32
4

If you use Prometheus Operator add this section to your ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
...
spec:
  endpoints:
  - relabelings:
    - sourceLabels: [__meta_kubernetes_pod_node_name]
      targetLabel: instance
3

group_left unfortunately is more of a limited workaround than a solution. I've been trying in vain for a month to find a coherent explanation of group_left, and besides, expressions aren't labels. Having to tack an incantation onto every simple expression would be annoying; figuring out how to build more complex PromQL queries with multiple metrics is another matter entirely. It would also be less than friendly to expect any of my users -- especially those completely new to Grafana / PromQL -- to write a complex and inscrutable query every time.

My first stab was something like this:

  - job_name: 'node_exporter'
    scrape_interval: 10s
    static_configs:
      - targets: ['1.2.3.4:9100']
        labels:
          cluster: 'rkv-image01'
          ceph_role: 'mon'
          instance_node: 'rkv1701'

Which is frowned on by upstream as an "antipattern" because apparently there is an expectation that instance be the only label whose value is unique across all metrics in the job. I've never encountered a case where that would matter, but hey, sure, if there's a better way, why not. There's the idea that the exporter should be "fixed", but I'm hesitant to go down the rabbit hole of a potentially breaking change to a widely used project. I'm also loath to fork it and have to maintain it in parallel with upstream; I have neither the time nor the karma.

Next I tried metric_relabel_configs, but that doesn't seem to want to copy a label from a different metric, i.e. node_uname_info{nodename} -> instance -- I get a syntax error at startup.

Next I came across something that said that Prometheus will fill in instance with the value of __address__ if the collector doesn't supply a value, and indeed for some reason it seems as though my scrapes of node_exporter aren't getting one. Which seems odd. But what I found to actually work is simple and so blindingly obvious that I didn't think to even try it:

  - job_name: 'node_exporter'
    scrape_interval: 10s
    static_configs:
      - targets: ['1.2.3.4:9100']
        labels:
          cluster: 'rkv-image01'
          ceph_role: 'mon'
          instance: 'rkv1701'
...

I.e., simply applying a target label in the scrape config. I'm working on file-based service discovery from a DB dump that will be able to write these targets out.

It may be a factor that my environment does not have DNS A or PTR records for the nodes in question. Yes, I know, trust me, I don't like it either, but it's out of my control. Still, that shouldn't matter; I don't know why node_exporter isn't supplying any instance label at all, since it does find the hostname for the info metric (where it doesn't do me any good).

$ curl http://1.2.3.4:9100/metrics | grep instance
$
$ curl http://1.2.3.4:9100/metrics | grep rkv1701
node_uname_info{domainname="(none)",machine="x86_64",nodename="rkv1701.myco.com",release="4.17.13-1.el7.elrepo.x86_64",sysname="Linux",version="#1 SMP Mon Aug 6 14:16:00 EDT 2018"} 1
$
aSemy
anthonyeleven
3

You can use a relabel rule like this one in your Prometheus job description:

- job_name: node-exporter
....
relabel_configs:
  .....
  # relabel the 'instance' label with your pod's node name
  - source_labels: [__meta_kubernetes_pod_node_name]
    target_label: instance

In the Prometheus Service Discovery page you can first check the correct name of your label; it will end with '....pod_node_name'.

Ralph
2

Another approach is to use /etc/hosts, local DNS (maybe dnsmasq), or something like service discovery (via Consul or file_sd), and then strip the port like this:

relabel_configs:
  - source_labels: ['__address__']
    separator:     ':'
    regex:         '(.*):.*'
    target_label:  'instance'
    replacement:   '${1}'
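A quick Python sketch (using a made-up hostname) of what that anchored rule does to __address__:

```python
import re

# The rule above keeps everything before the last ':' (the regex is
# anchored and '(.*)' is greedy), stripping the port from host:port.
def strip_port(address):
    m = re.fullmatch(r'(.*):.*', address)
    return m.group(1) if m else address

print(strip_port("myhost.example.org:9100"))  # -> 'myhost.example.org'
print(strip_port("1.2.3.4:9100"))             # -> '1.2.3.4'
```

This works for host:port targets; with DNS names or /etc/hosts entries in place, the stripped value is then a readable hostname rather than an IP.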
aSemy
Salehi
0

Additional config for this answer: https://stackoverflow.com/a/64623786/2043385

  - job_name: 'node-exporter'
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: 'node-exporter'
        action: keep
      - source_labels: [__meta_kubernetes_pod_node_name]
        action: replace
        target_label: node

And my service:

kind: Service
apiVersion: v1
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: node-exporter
  ports:
    - name: node-exporter
      protocol: TCP
      port: 9100
      targetPort: 9100
aSemy
vitams
0

Nobody yet provided a correct answer to the initial question.

It's not possible to use nodename for relabeling instance: https://groups.google.com/g/prometheus-developers/c/ZX42ASznnbw

Relabelling can only access the metadata of the target; metric relabelling can only access the labels of the same metric.

A more detailed explanation is here: https://www.robustperception.io/why-cant-i-use-the-nodename-of-a-machine-as-the-instance-label/

The machine knows its own name, couldn't Prometheus use it?

This is a not uncommon question about Prometheus and service discovery. If you run uname -n you'll see a machine's nodename, and this is something that'd be useful to have as part of your instance label. The node exporter even exposes it as the nodename label on the node_uname_info metric!

This is not something that Prometheus can do. The reason is that the target labels are determined by service discovery and relabelling, before Prometheus ever attempts to talk to a scrape target. You could in principle have the scrape target attach it as a label on every sample, but that goes against the top-down strategy that Prometheus uses for configuration management. Put another way, a target doesn't know how a scraper views it. For example the database team, application team, and infrastructure teams may all have different views on where a target belongs in their label taxonomy. One person's primary database is another's 4th machine in the 7th rack in the Irish datacenter.

The workaround (but not a solution!) is to use a DNS name for the target instead of the IP, and keep the hostname consistent with your DNS infrastructure.

Another way is just to hardcode labels for a target:

scrape_configs:
  - job_name: node_exporter
    static_configs:
      - targets: ['1.2.3.4:9100']
        labels:
          instance: myhost1.org
      - targets: ['2.3.4.5:9100']
        labels:
          instance: myhost2.org
Roman Shishkin