The following PromQL query can be used for calculating the application uptime in seconds:
time() - process_start_time_seconds
This query works for applications written in Go that use either the github.com/prometheus/client_golang or the github.com/VictoriaMetrics/metrics client library, both of which expose the process_start_time_seconds metric by default. This metric contains the Unix timestamp of the application start time.
Kubernetes (via the cAdvisor built into the kubelet) exposes the container_start_time_seconds metric for each started container by default. So the following query can be used for tracking container uptimes in Kubernetes:
time() - container_start_time_seconds{container!~"POD|"}
The container!~"POD|" filter is needed to drop auxiliary time series. The regex POD| matches either the value "POD" or an empty value, so the negative matcher excludes both:
- Time series with the container="POD" label reflect e.g. pause containers. See this answer for details.
- Time series without the container label correspond to e.g. the cgroups hierarchy. See this answer for details.
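The effect of the filter can be illustrated with a minimal Python sketch. It assumes that Python's re.fullmatch is a close enough stand-in for the fully anchored RE2 matching that Prometheus performs; the is_kept helper and the sample label values are hypothetical and only serve the illustration:

```python
import re

# Prometheus regex matchers are fully anchored, so re.fullmatch is used here
# as an approximation of the matching Prometheus performs (assumption).
FILTER = re.compile(r"POD|")


def is_kept(container_label: str) -> bool:
    """Return True if a series with this label value passes container!~"POD|"."""
    return FILTER.fullmatch(container_label) is None


# A missing label is treated by Prometheus as an empty label value.
for value in ["POD", "", "nginx", "PODx"]:
    print(repr(value), "kept" if is_kept(value) else "dropped")
```

Only the exact value "POD" and the empty (missing) label are dropped; a value such as "PODx" still passes because the match is anchored on both ends.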
If you need to calculate the overall per-target uptime over a given time range, you can estimate it with the up metric. Prometheus automatically generates the up metric for each scrape target, setting it to 1 for each successful scrape and to 0 otherwise. See these docs for details. So the following query can be used for estimating the total uptime in seconds for each scrape target during the last 24 hours:
avg_over_time(up[24h]) * (24*3600)
See avg_over_time docs for details.
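The arithmetic behind this estimate can be sketched in Python as follows. This is an illustration only: the estimate_uptime_seconds helper and the sample values are hypothetical, and it assumes the up samples are evenly spaced across the window, which is roughly the case with a fixed scrape interval:

```python
def estimate_uptime_seconds(up_samples, window_seconds=24 * 3600):
    """Mimic avg_over_time(up[24h]) * (24*3600) for a list of 0/1 samples."""
    if not up_samples:
        raise ValueError("no samples in the window")
    # avg_over_time(up[...]) is the fraction of scrapes that succeeded.
    availability = sum(up_samples) / len(up_samples)
    # Scale that fraction by the window length to get seconds of uptime.
    return availability * window_seconds


# Three successful scrapes out of four: 75% of 24 hours.
print(estimate_uptime_seconds([1, 1, 0, 1]))  # prints 64800.0
```

Note that the result is an estimate of how long the target was reachable by Prometheus, with a resolution limited by the scrape interval.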