I run Kubernetes on EKS in AWS. I want to alert when enough pods from a deployment are in a failed phase. I do have Prometheus, but I need the alert to be in CloudWatch, so I am exporting the metric to CloudWatch with the CloudWatch agent.

I was planning to use the metric kube_pod_status_phase and group it by phase and by another label identifying my deployment.

But I realized that the kube_pod_status_phase metric only has the namespace, pod, and phase labels, plus a couple of others that are useless in my case, so it is not enough to achieve what I need.

I see that with Prometheus PromQL I could write a join query, which seems to solve my issue. But since I am using CloudWatch metrics, I cannot use PromQL like this (or at least I do not know how).

Does anyone have a suggestion on how to solve this issue? How, with AWS CloudWatch, can I list the pods in a failed state for one specific deployment?

Djoby
  • Is using the CloudWatch agent to forward Prometheus metrics to CloudWatch an option? If so, you can create a Prometheus rule containing the join and forward the metric to CloudWatch. – Yaron Idan Jan 29 '23 at 11:56
  • Thanks. Yes, I do forward the metrics from Prometheus to CW. What you suggest seems to be exactly what I need, but I cannot find documentation on how to do it. Could you point me to some, please? – Djoby Jan 29 '23 at 20:40
  • Sure, I'll post the comment + a link to the docs as an answer so you can accept it. Editing this comment to make sure I understand your question - are you looking for documentation for how to forward logs to CW, or how to create a prometheus rule that will contain the join? – Yaron Idan Jan 29 '23 at 21:14
  • I successfully forward logs to CW by following this doc: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-PrometheusEC2.html But I do not know how to make an actual Prometheus rule with that. – Djoby Jan 29 '23 at 21:32
  • To be a bit clearer: I know how to make a query (with or without a join) if I am hitting the Prometheus API. But here I am not hitting the API; I am configuring the CW agent to scrape metrics. I do not know how to configure it so that it actually runs a query instead of "just" scraping the metrics. – Djoby Jan 29 '23 at 23:04
  • You can configure the rule in Prometheus, and it will create a new metric that should be scraped by the CW agent. See the Prometheus docs here: https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#recording-rules – Yaron Idan Feb 02 '23 at 11:09
  • So after more digging, I am not sure it will work. The CW agent basically runs its own mini Prometheus, which does not seem to support recording rules. So instead I think I am going to write a Lambda in AWS that queries Prometheus regularly and exports the metrics to CW. – Djoby Feb 03 '23 at 23:02
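
The Lambda approach from the last comment could look roughly like the sketch below. It is not a tested implementation, and it rests on several assumptions that are not in the thread: kube-state-metrics exposes `kube_pod_owner` and `kube_replicaset_owner` (used here for the pod → ReplicaSet → Deployment join), the Lambda can reach Prometheus over the network, and `PROMETHEUS_URL`, `CW_NAMESPACE`, and `METRIC_NAME` are hypothetical placeholder values.

```python
import json
import urllib.parse
import urllib.request

# Assumption: kube-state-metrics provides kube_pod_owner and kube_replicaset_owner.
# The query counts Failed pods per (namespace, deployment) by joining
# pod -> ReplicaSet -> Deployment on the owner labels.
FAILED_PODS_QUERY = """
sum by (namespace, deployment) (
  (
    kube_pod_status_phase{phase="Failed"}
    * on (namespace, pod) group_left(replicaset)
    label_replace(kube_pod_owner{owner_kind="ReplicaSet"},
                  "replicaset", "$1", "owner_name", "(.*)")
  )
  * on (namespace, replicaset) group_left(deployment)
  label_replace(kube_replicaset_owner{owner_kind="Deployment"},
                "deployment", "$1", "owner_name", "(.*)")
)
"""

# Hypothetical values; replace with your own.
PROMETHEUS_URL = "http://prometheus.example.internal:9090"
CW_NAMESPACE = "Custom/Kubernetes"
METRIC_NAME = "FailedPodsPerDeployment"


def query_prometheus(base_url, promql):
    """Run an instant query against the Prometheus HTTP API (/api/v1/query)."""
    url = base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url, timeout=10) as resp:
        body = json.load(resp)
    if body.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {body}")
    return body["data"]["result"]


def to_metric_data(result):
    """Convert an instant-vector result into CloudWatch MetricData entries."""
    metric_data = []
    for sample in result:
        labels = sample["metric"]
        _timestamp, value = sample["value"]  # Prometheus returns the value as a string
        metric_data.append({
            "MetricName": METRIC_NAME,
            "Dimensions": [
                {"Name": "Namespace", "Value": labels.get("namespace", "unknown")},
                {"Name": "Deployment", "Value": labels.get("deployment", "unknown")},
            ],
            "Value": float(value),
            "Unit": "Count",
        })
    return metric_data


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without boto3 installed.
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    metric_data = to_metric_data(query_prometheus(PROMETHEUS_URL, FAILED_PODS_QUERY))
    # Send in small, conservative batches to stay under the per-request limit.
    for i in range(0, len(metric_data), 20):
        cloudwatch.put_metric_data(Namespace=CW_NAMESPACE,
                                   MetricData=metric_data[i:i + 20])
    return {"published": len(metric_data)}
```

Run on a schedule (for example from an EventBridge rule), this would publish one data point per deployment, and a CloudWatch alarm could then be set on the `Deployment` dimension. Whether the join labels match your cluster depends on your kube-state-metrics version and configuration, so check the query in Prometheus first.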

0 Answers