
I have a service that currently exposes metrics via a Prometheus client on port 8000.

I want to change this process into a controlling parent process that spawns one (independent) child process per "thing", i.e. each child process is parameterised by a thing_id and the parent calls systemctl start child@thing_id.

The metrics exposed are independent per "thing", so I still want to expose these metrics per process, using a thing label to identify and analyse them accordingly.

However, it seems I must run start_http_server(8000) within each child service, which fails with "port already in use". I have tried starting the server only in the parent process, but then none of the metrics from the child processes are observed.

I could define a port range and run one HTTP server per process, increasing my number of scrape targets accordingly, but that seems rather clunky.
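For comparison, that per-process-port variant would look roughly like this (the base port and the numeric child index are assumptions for illustration):

```python
from prometheus_client import start_http_server

BASE_PORT = 8000  # assumed base port

def metrics_port(child_index):
    # each child derives a unique port from a (hypothetical) numeric index
    return BASE_PORT + child_index

def serve_metrics(child_index):
    # called once inside each child; Prometheus then needs one target per port
    start_http_server(metrics_port(child_index))
```

Every new child means another entry in the Prometheus scrape config, which is the clunkiness in question.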

I have looked into https://github.com/prometheus/client_python/blob/master/prometheus_client/multiprocess.py which I believe may be a solution, but thought there might be an easier way using just subprocesses and systemctl ...
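For reference, a minimal parent-side sketch of that multiprocess mode, assuming the children run on the same host and can share a directory (the directory path is an assumption, and the environment variable must also be set for every child unit, e.g. in the systemd unit file, before prometheus_client is imported there): children write their metric state to files in the shared directory, and a single HTTP server in the parent aggregates them on scrape.

```python
import os

MULTIPROC_DIR = "/tmp/prom_multiproc"  # hypothetical shared directory
os.environ["PROMETHEUS_MULTIPROC_DIR"] = MULTIPROC_DIR
os.environ["prometheus_multiproc_dir"] = MULTIPROC_DIR  # older client versions
os.makedirs(MULTIPROC_DIR, exist_ok=True)

from prometheus_client import CollectorRegistry, start_http_server, multiprocess

registry = CollectorRegistry()
multiprocess.MultiProcessCollector(registry)  # reads the children's files on scrape
start_http_server(8000, registry=registry)    # one server, in the parent only
```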


1 Answer


In the end I used the Prometheus "push gateway" to achieve this: each child service creates a GatewayClient that pushes metrics under a per-thing job id. The parent service cleans up with delete_from_gateway when it stops a child process.

I installed the push gateway with help from https://sysadmins.co.za/install-pushgateway-to-expose-metrics-to-prometheus/

Basic client code below:

import asyncio

from prometheus_client import Counter, CollectorRegistry, push_to_gateway

class GatewayClient:
    def __init__(self, addr, thing_id):
        self.gateway_address = addr
        self.job_id = "thing-{}".format(thing_id)
        self.registry = CollectorRegistry()
        # define metrics etc
        self.some_counter = Counter('some_counter', 'A counter', registry=self.registry)

    async def start(self):
        while True:
            # push latest data
            push_to_gateway(self.gateway_address, job=self.job_id, registry=self.registry)
            await asyncio.sleep(15)

    # methods to set metrics etc
    def inc_some_counter(self):
        self.some_counter.inc()
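The parent-side cleanup mentioned above could be sketched like this (stop_child, the child@ unit template name, and the gateway address are assumptions based on the question; delete_from_gateway is the prometheus_client call used):

```python
import subprocess

from prometheus_client import delete_from_gateway

GATEWAY_ADDR = "localhost:9091"  # assumed pushgateway address

def job_name(thing_id):
    # must match the job id scheme used by the child's GatewayClient
    return "thing-{}".format(thing_id)

def stop_child(thing_id):
    # stop the systemd child unit, then drop its now-stale series from the gateway
    subprocess.run(["systemctl", "stop", "child@{}".format(thing_id)], check=True)
    delete_from_gateway(GATEWAY_ADDR, job=job_name(thing_id))
```

Without the delete, the gateway keeps serving the last pushed values for a stopped child forever, so the cleanup step matters.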
  • Quick question: what should I pass as addr and thing_id? I have a similar situation and am trying to expose metrics from a sub-process. Here is my question: [link](https://stackoverflow.com/questions/69574908/how-to-collect-prometheus-metrics-from-multiple-python-flask-sub-process/69581436#69581436) @ifo10 – Sahil Sangani Oct 15 '21 at 15:51
  • @SahilSangani addr is the address of the push gateway instance (in my case it was localhost:9091, as the pushgateway was running on the same host as the application), and thing_id is essentially a subprocess identifier (in my example, each subprocess had an identifying parameter, and the metrics it exposed always included that as a label). Not sure if this works for your Flask use case; you need to ensure that subprocesses don't use the same identifier. – ifo20 Oct 16 '21 at 13:28