162

Is there a way to automatically remove completed Jobs, other than creating a CronJob to clean them up?

The K8s Job documentation states that the intended behavior of completed Jobs is for them to remain in a completed state until manually deleted. I am running thousands of Jobs a day via CronJobs, and I don't want to keep completed Jobs around.

d4nyll
Josh Newman

9 Answers

213

You can now set history limits, or disable history altogether, so that failed or successful CronJobs are not kept around indefinitely. See the CronJob documentation on jobs history limits.

To set the history limits:

The .spec.successfulJobsHistoryLimit and .spec.failedJobsHistoryLimit fields are optional. These fields specify how many completed and failed jobs should be kept. By default, they are set to 3 and 1 respectively. Setting a limit to 0 corresponds to keeping none of the corresponding kind of jobs after they finish.

The config with 0 limits would look like:

apiVersion: batch/v1beta1  # on Kubernetes 1.21+ use batch/v1 (batch/v1beta1 was removed in 1.25)
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 0
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
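
If you already have CronJobs deployed, a quick way to apply the same limits in place is kubectl patch; a minimal sketch, assuming the CronJob is named hello as above:

# strategic merge patch; only the two history-limit fields are changed
kubectl patch cronjob hello -p '{"spec":{"successfulJobsHistoryLimit":0,"failedJobsHistoryLimit":0}}'
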
Stephen
JJC
  • 5
    Is there a way to set a time limit on history, like deleting successful jobs after a week? – Kamaraju Mar 13 '18 at 19:36
  • Not that I know of, sorry. Do post a follow-up here if you find a way. I imagine you could write a cron job that looks at old pods' timestamps and then deletes, one by one, the ones older than X days. – JJC Mar 16 '18 at 17:28
  • Yes, I created a deployment in Kubernetes (a Golang project) with a channel that lists the pods and watches for changes in their state. – Kamaraju Mar 18 '18 at 05:35
  • 18
    Note that the linked answer only applies to `CronJob` objects (which the asker mentioned) but not `Job` objects. – Cory Klein Sep 25 '18 at 16:49
  • 2
    maybe also have a look [here](https://kubernetes.io/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically); it looks like there is a possibility coming to define a ttlSecondsAfterFinished which "will delete the Job cascadingly, i.e. delete its dependent objects, such as Pods, together with the Job" – hilbert Jul 02 '20 at 06:44
  • Wouldn't "delete successful jobs after a week" simply be set by picking the appropriate `successfulJobsHistoryLimit`? For example `7` if the job is run once per day, `14` if it's run twice a day, and so on. – David Parks Aug 11 '21 at 15:57
  • Should also be noted that this only applies to jobs created automatically by the CronJob on its schedule, and not jobs created manually (i.e. via `kubectl create job --from=cronjob/cronjob-template foojob`). The latter is intentionally not supported. – chb Feb 02 '22 at 02:51
40

This is possible since version 1.12 (alpha) with ttlSecondsAfterFinished. An example from the Clean Up Finished Jobs Automatically documentation:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
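
Note that while this mechanism is alpha it also has to be enabled through the TTLAfterFinished feature gate on the control plane; a rough sketch of the flags (exact setup depends on how your cluster is deployed, and managed offerings may not expose them):

# hypothetical flags on a self-managed control plane; adjust to your setup
kube-apiserver --feature-gates=TTLAfterFinished=true ...
kube-controller-manager --feature-gates=TTLAfterFinished=true ...
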
rath
  • 1
    `Note that this TTL mechanism is alpha, with feature gate TTLAfterFinished` I did not understand this feature gate part. – technazi Nov 27 '19 at 16:43
  • 2
    [Feature gates](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/) are flags that enable or disable Kubernetes functionality. I don't know how to set them up, or even if you have the ability to do so with a hosted service like EKS. I suspect you have to configure the master nodes as well, but I'm speculating. @technazi – rath Nov 27 '19 at 19:05
  • 1
    Thanks @rath! While I am configuring the jobs and pods via Helm templates, I don't see a place where I can configure the feature gates, so I am not able to use the alpha improvements; essentially, `ttlSecondsAfterFinished` has no effect without the feature gate. – technazi Nov 27 '19 at 21:26
34

Another way using a field-selector:

kubectl delete jobs --field-selector status.successful=1 

That could be executed in a cronjob, similar to other answers.

  1. Create a service account, something like my-sa-name
  2. Create a role with list and delete permissions for the resource jobs
  3. Attach the role in the service account (rolebinding)
  4. Create a CronJob that uses the service account to check for completed jobs and delete them
# 1. Create a service account

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-sa-name
  namespace: default

---

# 2. Create a role

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: my-completed-jobs-cleaner-role
rules:
- apiGroups: ["batch"]  # Jobs live in the "batch" API group
  resources: ["jobs"]
  verbs: ["list", "delete"]

---

# 3. Attach the role to the service account

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-completed-jobs-cleaner-rolebinding
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: my-completed-jobs-cleaner-role
subjects:
- kind: ServiceAccount
  name: my-sa-name
  namespace: default

---

# 4. Create a cronjob (with a crontab schedule) using the service account to check for completed jobs

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: jobs-cleanup
spec:
  schedule: "*/30 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: my-sa-name
          containers:
          - name: kubectl-container
            image: bitnami/kubectl:latest
            # I'm using bitnami kubectl, because the suggested kubectl image didn't have the --field-selector option
            command: ["sh", "-c", "kubectl delete jobs --field-selector status.successful=1"]
          restartPolicy: Never
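
To try the setup without waiting for the schedule, you can trigger a one-off run from the CronJob and read its logs; the job name jobs-cleanup-manual below is just an example:

# create a Job from the CronJob template and inspect its output
kubectl create job jobs-cleanup-manual --from=cronjob/jobs-cleanup
kubectl logs job/jobs-cleanup-manual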

AndreDurao
23

I've found the below to work

To remove failed jobs:

kubectl delete job $(kubectl get jobs | awk '$3 ~ 0' | awk '{print $1}')

To remove completed jobs:

kubectl delete job $(kubectl get jobs | awk '$3 ~ 1' | awk '{print $1}')
justcompile
  • 5
    I had to update the command for it to work: `kubectl delete jobs $(kubectl get jobs | awk '$2 ~ 1/1' | awk '{print $1}')` – user2804197 Nov 03 '20 at 21:14
  • 1
    This one doesn't fail if there is no completed job to delete: `kubectl get jobs | awk '$2 ~ "1/1" {print $1}' | xargs kubectl delete job` – visit1985 Mar 26 '21 at 20:00
13

I'm using the kubectl image from wernight/kubectl.

I scheduled a cron job that deletes anything that is

  • completed
  • 2-9 days old (so I have 2 days to review any failed jobs)

It runs every 30 minutes, so I'm not accounting for jobs that are 10+ days old.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: cleanup
spec:
  schedule: "*/30 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: kubectl-runner
            image: wernight/kubectl
            command: ["sh", "-c", "kubectl get jobs | awk '$4 ~ /[2-9]d$/ || $3 ~ 1' | awk '{print $1}' | xargs kubectl delete job"]
          restartPolicy: Never
David Cheung
  • For your `awk` command, wouldn't you want your second condition to be `$2 ~ /^1/` instead of `$3 ~ 1` ? I assume you are looking at the completions column which is the second column, at least for me, and completions are printed like `0/1` or `1/1` so it's important to get the first character. Maybe your output for `kubectl get job` is different. – Stephen Aug 25 '20 at 12:32
  • You can also combine the two `awk` commands into one. I tested the following and it will work as a replacement for the awk component of the above: `awk '$4 ~ /^[2-9]d/ || $2 ~ /^1/ {print $1}'` – Stephen Aug 25 '20 at 12:39
  • does this require a cluster role binding to properly delete completed jobs? – Ryan Clemente Jan 25 '21 at 12:22
7

I recently built a Kubernetes operator to do this task.

After deployment, it will monitor the selected namespace and delete completed jobs/pods if they completed without errors/restarts.

https://github.com/lwolf/kube-cleanup-operator

lwolf
  • 13
    Please don't just post some tool or library as an answer. At least demonstrate [how it solves the problem](http://meta.stackoverflow.com/a/251605) in the answer itself. – Baum mit Augen Sep 06 '17 at 10:50
6

Using jsonpath:

kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.succeeded==1)].metadata.name}')
Rajith
5

As stated in the documentation, "It is up to the user to delete old jobs"; see http://kubernetes.io/docs/user-guide/jobs/#job-termination-and-cleanup

I would run a pod to do this cleanup based on job name and certain conditions, thus letting Kubernetes at least take care of the availability of your cleanup process. You could run a recurring job for this (assuming you run Kubernetes 1.5 or later).
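
A minimal sketch of what such a recurring cleanup command could look like, assuming your jobs carry a label like app=my-batch-job (the label is hypothetical) and you only want to remove the ones that succeeded:

# delete only succeeded jobs matching the (hypothetical) label
kubectl delete jobs -l app=my-batch-job --field-selector status.successful=1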

Norbert
  • What I don't understand about this is that the cleanup pod is now in the same namespace as the other pods; how do you configure it to connect to the cluster in the first place? – shan Mar 23 '21 at 18:05
  • The namespace is only relevant when you have your security setup in a very strict fashion (and with pods in k8s operating on pods, your security gets a bit weaker anyway). Luckily there is some progress: The number of jobs allowed to hang around has been increased (gcloud about 40k instead of previous 10k), and with cronjobs you can let k8s manage it for you by limiting the number of old jobs you keep – Norbert Mar 31 '21 at 16:02
4

A simple way to delete them is by running a cron job:

kubectl get jobs --all-namespaces | sed '1d' | awk '{ print $2, "--namespace", $1 }' | while read line; do kubectl delete jobs $line; done
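
Note that this removes every job in every namespace, finished or not. A hedged variant that only selects successful jobs, reusing the field selector shown in an earlier answer:

# same pipeline, but restricted to jobs that completed successfully
kubectl get jobs --all-namespaces --field-selector status.successful=1 | sed '1d' | awk '{ print $2, "--namespace", $1 }' | while read line; do kubectl delete jobs $line; done
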
Daishi