We are doing a case study with Control-M to monitor a Kubernetes job. On successful completion of the job, Control-M is able to recognize that the job completed. However, when the job fails, Control-M never recognizes the failure and keeps showing the job as still running. I suspect the job never gets marked as completed in Kubernetes.
Below are the job status, the pod status, and the Kubernetes YAML file.
My question: is there a way for a Kubernetes job to complete with a failure status, or is this the default behavior of Kubernetes?
# kubectl -n ns-dev get job
NAME                             COMPLETIONS   DURATION   AGE
job-pod-failure-policy-example   0/1           3m39s      3m39s
# kubectl -n ns-dev get pods
NAME                                   READY   STATUS   RESTARTS   AGE
job-pod-failure-policy-example-h86bp   0/1     Error    0          82s
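As far as I understand, a job that finishes successfully gets a Complete condition in its status, and a failed job should get a Failed condition. In case it helps, this is the command I use to inspect the conditions of the job above:

# kubectl -n ns-dev get job job-pod-failure-policy-example -o jsonpath='{.status.conditions}'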
Yaml file:
apiVersion: batch/v1
kind: Job
metadata:
  name: job-pod-failure-policy-example
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: docker.io/library/bash:5
        command: ["bash"]        # example command simulating a bug which triggers the FailJob action
        args:
        - -c
        - echo "Hello world!" && sleep 5 && exit 1
  backoffLimit: 0
  podFailurePolicy:
    rules:
    - action: Terminate
      onExitCodes:
        containerName: main
        operator: In
        values: [1]
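For reference, the pod failure policy example in the Kubernetes documentation that I started from uses FailJob as the action (the comment in my command above also mentions it). Adapted to my exit code, that rule would look roughly like this:

  podFailurePolicy:
    rules:
    - action: FailJob        # marks the whole job as failed when the rule matches
      onExitCodes:
        containerName: main
        operator: In
        values: [1]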
I have gone through the link below, which helped me set the backoff limit to zero and stop the job from re-triggering multiple times:
Kubernetes job keeps spinning up pods which end up with the 'Error' status
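For completeness, after the pod errors out I also check the job's events with describe, since they show what the job controller did with the failed pod:

# kubectl -n ns-dev describe job job-pod-failure-policy-example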