How to monitor a cluster using Prometheus?

Question

Initially we had a single-node application, and we pointed Prometheus at that node's metrics path like this:

  - job_name: 'spring-actuator'
    metrics_path: '/prometheus'
    scrape_interval: 5s

We have now moved the application to the cloud, where it runs on several nodes. If we point Prometheus at the load balancer, each scrape will hit a different node, so the collected metrics will be a mess. Is there a way to aggregate metrics from the whole cluster using Prometheus?

Prometheus can scrape multiple targets for the same service (path). You can then get instance-specific metrics, or you can aggregate them. Would that not be enough? – ernest_k, Dec 27 '19 at 15:52
@ernest_k Aggregation would be enough. Could you provide a link? – gstackoverflow, Dec 27 '19 at 15:54
This answer may make a good example: https://stackoverflow.com/a/53313702/5761558 – ernest_k, Dec 27 '19 at 15:56
From what I know (I'm not a prometheus expert): the metrics will be collected *per instance*. Each row will know the server/instance that it was pulled from. Now when you query the prometheus database (I use grafana for that), you can select metrics across instances. So you could, for example, say *average response time for all `/service/resource` calls* (assuming you're exporting that). Having individual instances' metrics allows you to isolate servers as necessary (I use that to know which particular server is going down) — ernest_k, Dec 27 '19 at 16:30
@ernest_k But what if we use auto scaling? In that case we don't know how many instances we have – gstackoverflow, Dec 28 '19 at 09:35

score 1 · Answer 1 · answered Dec 27 '19 at 16:09

You should use Prometheus to gather metrics from the individual backends, and then either aggregate at query time or pre-aggregate the data with Prometheus recording rules. Prometheus has a number of service discovery mechanisms built in, and they can be used to automatically find and scrape every endpoint your app runs on.
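As a rough illustration of the pre-aggregation option, a recording rule file might look like the sketch below. It assumes the app exposes the Micrometer-style `http_server_requests_seconds_*` metrics that Spring Boot Actuator typically publishes, and that the job is still called `spring-actuator`; the group name and the recorded series names are made up for this example, so adjust them to whatever your app actually exports.

```yaml
# rules.yml (hypothetical file name; reference it from prometheus.yml via `rule_files`)
groups:
  - name: spring_actuator_aggregates   # hypothetical group name
    rules:
      # Request rate summed over every scraped instance of the job.
      - record: job:http_server_requests_seconds_count:rate5m
        expr: sum by (job) (rate(http_server_requests_seconds_count{job="spring-actuator"}[5m]))
      # Mean response time across all instances over the last 5 minutes.
      - record: job:http_server_requests_seconds:mean5m
        expr: |
          sum by (job) (rate(http_server_requests_seconds_sum{job="spring-actuator"}[5m]))
          /
          sum by (job) (rate(http_server_requests_seconds_count{job="spring-actuator"}[5m]))
```

The same `expr` values can be run directly as queries (for example from Grafana) if you would rather aggregate at query time. Dropping `sum by (job)` gives the per-instance series again.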

For a taste of what the configuration can look like, see for example https://github.com/prometheus/prometheus/blob/release-2.15/config/testdata/conf.good.yml#L199

Depending on which cloud service you use, you'll need a different `*_sd_configs` directive. All available ones are described in the documentation: https://prometheus.io/docs/prometheus/latest/configuration/configuration/
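To make this more concrete, here is a minimal sketch of the question's scrape job rewritten to use service discovery, assuming the app runs on AWS EC2 instances that carry a tag identifying it. The region, port, tag name, and tag value are placeholders chosen for this example and are not taken from the question; on another platform you would swap in the matching directive (for instance `kubernetes_sd_configs` or `gce_sd_configs`).

```yaml
scrape_configs:
  - job_name: 'spring-actuator'
    metrics_path: '/prometheus'
    scrape_interval: 5s
    ec2_sd_configs:
      - region: eu-west-1          # assumption: use your own region
        port: 8080                 # assumption: port the app serves metrics on
        filters:
          # Only discover instances tagged app=my-spring-app (hypothetical tag).
          - name: tag:app
            values:
              - my-spring-app
    relabel_configs:
      # Label each series with the EC2 instance ID so individual nodes stay distinguishable.
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance
```

With a setup along these lines, instances added or removed by auto scaling should be picked up on the next discovery refresh without editing the config, and each node's metrics stay separate until you aggregate them in a query.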

answered Dec 27 '19 at 16:09

bjakubski


But what if we use auto scaling? In that case we don't know the number of individual backends or their addresses – gstackoverflow Dec 28 '19 at 09:37
That is exactly what service discovery in Prometheus is for. Instead of hardcoding individual backends, you configure Prometheus to discover all relevant backends itself (using, for example, the AWS/GCP/K8s API). New targets will appear automatically and old ones will be removed. See the example I linked and note how, instead of providing IP addresses etc. for the machines to scrape, one configures Prometheus by (in this case) telling it to look dynamically for VMs that have specific tags set on them – bjakubski Dec 29 '19 at 11:01
