I have an kubernetes cluster setup and would like to use datadog to monitor the cluster. I want to setup a monitor to alert me if a container/pod is stuck in the CarshLoopBackOff state.
How do I do this?
I have an kubernetes cluster setup and would like to use datadog to monitor the cluster. I want to setup a monitor to alert me if a container/pod is stuck in the CarshLoopBackOff state.
How do I do this?
This is a little counter intuitive, but you need to look for pods that are not flagged by pod.ready. Example find # of failing pods on cluster my-cluster in the default namespace for a monitor:
exclude_null(sum:kubernetes_state.pod.ready{condition:false,kube_cluster_name:my-cluster,kube_namespace:default,!pod_phase:succeeded})