Jobs
On this page, we will see the following topics in detail with examples.
- Plain Job
- completions
- parallelism
- backoffLimit
- activeDeadlineSeconds
- ttlSecondsAfterFinished
- podFailurePolicy (beta as of Kubernetes 1.29)
What is a Job in Kubernetes?
From the Kubernetes.io documentation: Jobs represent one-off tasks that run to completion and then stop. A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate. As pods successfully complete, the Job tracks the successful completions.
For all the Job examples on this page, I am going to use the busybox image.
1. Plain Job
jobsdemo.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
      restartPolicy: Never
kubectl create -f jobsdemo.yaml
job.batch/jobdemo created
kubectl get jobs
NAME COMPLETIONS DURATION AGE
jobdemo 1/1 9s 16s
kubectl get pods
NAME READY STATUS RESTARTS AGE
jobdemo-fb2qs 0/1 Completed 0 33m
kubectl logs jobdemo-fb2qs
Welcome to my Blog
kubectl delete jobs jobdemo
Note: when you delete a Job, its corresponding pods are deleted automatically as well.
2. Completions
In the YAML above, the Job runs just once and then reaches the Completed state. If I need to run "echo Welcome to my Blog" more than once, I can set "completions: <n>" under the spec as below. For example, with completions: 3, the Job runs three times, using three pods.
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  completions: 3
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
      restartPolicy: Never
kubectl get pods
NAME READY STATUS RESTARTS AGE
jobdemo-jphs6 0/1 Completed 0 4m17s
jobdemo-vmnjx 0/1 Completed 0 4m23s
jobdemo-wxxz2 0/1 Completed 0 4m38s
kubectl get jobs
NAME COMPLETIONS DURATION AGE
jobdemo 3/3 26s 4m42s
3. Parallelism
In the above example, the pods run sequentially, one after another. If you want to speed up the process, you can bring in parallelism. With parallelism: 2 and completions: 4, the Job runs 4 times in total, with up to 2 pods running in parallel at any time.
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  completions: 4
  parallelism: 2
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
      restartPolicy: Never
kubectl get pods
NAME READY STATUS RESTARTS AGE
jobdemo-7qlvv 0/1 Completed 0 76s
jobdemo-g44ht 0/1 Completed 0 82s
jobdemo-r7tv9 0/1 Completed 0 82s
jobdemo-tkwx8 0/1 Completed 0 75s
kubectl describe pods jobdemo-7qlvv | grep Started
Started: Thu, 28 Dec 2023 07:26:29 -0800
Normal Started 117s kubelet Started container simplejob
kubectl describe pods jobdemo-tkwx8 | grep Started
Started: Thu, 28 Dec 2023 07:26:30 -0800
Normal Started 2m22s kubelet Started container simplejob
kubectl describe pods jobdemo-g44ht | grep Started
Started: Thu, 28 Dec 2023 07:26:18 -0800
Normal Started 2m59s kubelet Started container simplejob
kubectl describe pods jobdemo-r7tv9 | grep Started
Started: Thu, 28 Dec 2023 07:26:19 -0800
Normal Started 3m21s kubelet Started container simplejob
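The Started timestamps above fall into two groups (07:26:18 and 07:26:19, then 07:26:29 and 07:26:30), which is the two-at-a-time behavior of parallelism: 2. Because echo exits almost immediately, the overlap is hard to catch live. A minimal sketch, assuming a hypothetical Job name jobdemo-sleep and an arbitrary sleep 10 added to the command (neither is part of the original demo), keeps each pod alive long enough that kubectl get pods should show two pods Running at the same time:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo-sleep          # hypothetical name, chosen for this sketch
spec:
  completions: 4               # 4 successful pods in total
  parallelism: 2               # at most 2 pods running at once
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        # sleep 10 is an assumption, added only to make the overlap visible
        command: ["sh", "-c", "echo Welcome to my Blog && sleep 10"]
      restartPolicy: Never
```

With this variant, the Job should take roughly two rounds of about 10 seconds each, instead of four rounds if the pods ran one by one.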
4. BackoffLimit
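Let us assume that the container in your Job keeps failing, for example because the command points at a directory that does not exist. By default, the Job retries a failed pod up to 6 times (the default value of backoffLimit) before it marks the Job as failed. If you don't want to wait for 6 retries and only want to try a couple of times, you can lower the limit with backoffLimit: 2. In the output below, you can see that once the limit is reached, no further pod is created. (A wrong image tag is a different case: the pod waits in ImagePullBackOff instead of failing, so it does not count toward backoffLimit.)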
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  #parallelism: 2
  #completions: 4
  backoffLimit: 2 # Retry a failed pod at most 2 times
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["ls", "/nonexistdirectory"] # This directory does not exist, so the command fails
      restartPolicy: Never
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/jobdemo-4hfl7 0/1 Error 0 3m53s
pod/jobdemo-jkc8t 0/1 Error 0 4m29s
pod/jobdemo-wsmst 0/1 Error 0 4m15s
kubectl describe job jobdemo
Name: jobdemo
Annotations: <none>
Parallelism: 1
Completions: 1
Pods Statuses: 0 Active (0 Ready) / 0 Succeeded / 3 Failed
Pod Template:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 108s job-controller Created pod: jobdemo-jkc8t
Normal SuccessfulCreate 94s job-controller Created pod: jobdemo-wsmst
Normal SuccessfulCreate 72s job-controller Created pod: jobdemo-4hfl7
Warning BackoffLimitExceeded 67s job-controller Job has reached the specified backoff limit
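To make the retry arithmetic concrete: with backoffLimit: 2, the output above shows 3 failed pods, that is, the first attempt plus 2 retries. A minimal sketch of the other extreme, assuming a hypothetical Job name jobdemo-noretry, sets backoffLimit: 0 so that the very first failure should mark the Job as failed without any retry:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo-noretry        # hypothetical name, chosen for this sketch
spec:
  backoffLimit: 0              # no retries: one failed pod fails the Job
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["ls", "/nonexistdirectory"]   # fails on purpose
      restartPolicy: Never
```

This can be useful for tasks that are not safe to run twice, where a retry would do more harm than a failed Job.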
5. activeDeadlineSeconds
You can also limit how long a Job is allowed to run. If the Job has not finished within activeDeadlineSeconds, Kubernetes terminates its running pods and marks the Job as failed. In the example below, even though I set backoffLimit: 4, activeDeadlineSeconds takes precedence. The Job does not get to use all of its retries: as soon as it reaches 15 seconds, it is stopped.
cat jobsdemo.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  #parallelism: 2
  #completions: 4
  backoffLimit: 4
  activeDeadlineSeconds: 15
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["ls", "/nonexistdirectory"]
      restartPolicy: Never
kubectl describe job jobdemo
Parallelism: 1
Completions: 1
Completion Mode: NonIndexed
Start Time: Thu, 28 Dec 2023 08:24:36 -0800
Active Deadline Seconds: 15s
Pods Statuses: 0 Active (0 Ready) / 0 Succeeded / 2 Failed # Only 2 pods failed before the deadline; backoffLimit: 4 was never reached
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 44s job-controller Created pod: jobdemo-lct5w
Normal SuccessfulCreate 31s job-controller Created pod: jobdemo-bzbcd
Normal SuccessfulDelete 29s job-controller Deleted pod: jobdemo-bzbcd
Warning DeadlineExceeded 29s job-controller Job was active longer than specified deadline
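The deadline applies to the whole Job, not only to failing pods. A minimal sketch, assuming a hypothetical Job name jobdemo-deadline and an arbitrary sleep 60 command (both are illustrations, not from the original demo), runs a healthy pod that simply takes longer than the deadline; it should be terminated after about 15 seconds with the same DeadlineExceeded event:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo-deadline       # hypothetical name, chosen for this sketch
spec:
  activeDeadlineSeconds: 15    # the Job may be active for at most 15 seconds
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        # sleep 60 is an assumption: a command that would succeed,
        # but needs longer than the 15-second deadline
        command: ["sleep", "60"]
      restartPolicy: Never
```

So activeDeadlineSeconds works as an upper bound on run time, which is worth keeping in mind when a Job is slow rather than broken.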
6. ttlSecondsAfterFinished
With ttlSecondsAfterFinished, a finished Job is deleted automatically after the given number of seconds. In the example below, the Job and its corresponding pods are deleted 30 seconds after the Job finishes, irrespective of whether it completed or failed.
cat jobsdemo.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  ttlSecondsAfterFinished: 30 # jobdemo is deleted automatically 30 seconds after it finishes
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
        # command: ["ls", "/nonexistdirectory"]
      restartPolicy: Never
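The TTL value can also be 0. A minimal sketch, assuming a hypothetical Job name jobdemo-ttl0, makes the Job eligible for deletion immediately after it finishes:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo-ttl0           # hypothetical name, chosen for this sketch
spec:
  ttlSecondsAfterFinished: 0   # eligible for deletion as soon as the Job finishes
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
      restartPolicy: Never
```

The trade-off is that the pods disappear together with the Job, so kubectl logs has nothing left to read. A small non-zero value, such as the 30 seconds used above, leaves a window to inspect the result.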