I run a build script multiple times per day. My feeling is that my colleagues and I spend a considerable amount of time waiting for this script to execute. Now I want to know: how much time do we spend per day waiting for the script to execute? I would be satisfied with an overall average, although I would really like to have day-by-day data (e.g. "last Monday we spent X minutes waiting for the script to execute, on Tuesday...").
To find the answer, I spun up Prometheus with a push gateway. In the build script, I added a REST call to the push gateway that posts a metric (type: counter), labeled with the machine name, whose sample value is the elapsed time to execute the script.
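A minimal sketch of the kind of push described above, to make the setup concrete. The gateway address, job name, metric name, and `machine` label name are all placeholders chosen for illustration, not the real values from the build script:

```python
import urllib.request
from urllib.parse import quote

# Hypothetical values: adjust to the real gateway address and naming scheme.
GATEWAY_URL = "http://localhost:9091"
JOB_NAME = "build_script"
METRIC_NAME = "build_elapsed_seconds"


def build_push_request(machine: str, elapsed_seconds: float) -> urllib.request.Request:
    """Build the HTTP request that pushes one run's elapsed time."""
    # The grouping key lives in the URL path: /metrics/job/<job>/<label>/<value>
    url = (
        f"{GATEWAY_URL}/metrics/job/{quote(JOB_NAME, safe='')}"
        f"/machine/{quote(machine, safe='')}"
    )
    # The body uses the Prometheus text exposition format and must end with a newline.
    body = (
        f"# TYPE {METRIC_NAME} counter\n"
        f"{METRIC_NAME} {elapsed_seconds}\n"
    )
    return urllib.request.Request(url, data=body.encode("utf-8"), method="POST")


def push_elapsed(machine: str, elapsed_seconds: float) -> int:
    """Send the push to the gateway and return the HTTP status code."""
    request = build_push_request(machine, elapsed_seconds)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Calling `push_elapsed("dev-box-1", 42.5)` at the end of a run would post that single run's duration; nothing in this request refers to earlier runs.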
Data is being collected, but I realize that the data I gather isn't enough to answer my question. I would need each metric I push (that is, the current run's elapsed time) to be added to the previously pushed data. Looking at the documentation, I get the feeling that this will not be supported through the push gateway:
"The Pushgateway is explicitly not an aggregator or distributed counter but rather a metrics cache"
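To illustrate the gap, here is a toy model in plain Python (not Pushgateway code, and the run durations are made up). It contrasts a cache where each push replaces the stored value with the accumulation the question actually needs:

```python
from typing import Iterable


def last_write_wins(elapsed_per_run: Iterable[float]) -> float:
    """Model a metrics cache: every push overwrites the previous value."""
    stored = 0.0
    for elapsed in elapsed_per_run:
        stored = elapsed  # the earlier value is discarded
    return stored


def accumulate(elapsed_per_run: Iterable[float]) -> float:
    """Model the desired behaviour: every push adds to a running total."""
    total = 0.0
    for elapsed in elapsed_per_run:
        total += elapsed
    return total


# Three hypothetical runs on one machine, in seconds.
runs = [120.0, 95.0, 130.0]
print(last_write_wins(runs))  # only the most recent run survives
print(accumulate(runs))       # total time spent waiting
```

With the cache behaviour, Prometheus only ever scrapes the most recent run's duration, so runs that complete between two scrapes are lost and a daily total cannot be reconstructed from the stored value alone.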
My questions are:
- Is it possible to collect the metrics I want through the Prometheus Push Gateway? If not, what are my options?
- If it is possible, which metrics should I collect, and how?