Jobs

This page covers the following topics in detail, with examples:

Plain Job
completions
parallelism
backoffLimit
activeDeadlineSeconds
ttlSecondsAfterFinished
podFailurePolicy (beta as of Kubernetes 1.29)

What is a Job in Kubernetes?

From the Kubernetes.io documentation: Jobs represent one-off tasks that run to completion and then stop. A Job creates one or more Pods and will continue to retry execution of the Pods until a specified number of them successfully terminate. As Pods successfully complete, the Job tracks the successful completions.

Throughout these examples, I am going to use the busybox image for the demonstration.

1. Plain Job

jobsdemo.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
      restartPolicy: Never

kubectl create -f jobsdemo.yaml
job.batch/jobdemo created

kubectl get jobs
NAME      COMPLETIONS   DURATION   AGE
jobdemo   1/1           9s         16s

kubectl get pods
NAME            READY   STATUS      RESTARTS   AGE
jobdemo-fb2qs   0/1     Completed   0          33m

kubectl logs jobdemo-fb2qs
Welcome to my Blog

kubectl delete jobs jobdemo

Note: When you delete a Job, Kubernetes automatically deletes the Pods that the Job created.

2. Completions

In the YAML above, the Job runs just once and then reaches the Completed state. Let us assume that I need to run "echo Welcome to my Blog" more than once. In that case, I set "completions: <n>" under spec, as shown below. For example, with completions: 3 the Job runs three times, using three Pods.

apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  completions: 3
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
      restartPolicy: Never

kubectl get pods
NAME            READY   STATUS      RESTARTS   AGE
jobdemo-jphs6   0/1     Completed   0          4m17s
jobdemo-vmnjx   0/1     Completed   0          4m23s
jobdemo-wxxz2   0/1     Completed   0          4m38s

kubectl get jobs
NAME      COMPLETIONS   DURATION   AGE
jobdemo   3/3           26s        4m42s

3. Parallelism

In the example above, the Pods run one after another, because parallelism defaults to 1. If you want to speed things up, you can bring in parallelism. With parallelism: 2 and completions: 4, the Job needs four successful Pods and runs up to two of them at the same time.

apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  completions: 4
  parallelism: 2
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
      restartPolicy: Never

kubectl get pods
NAME            READY   STATUS      RESTARTS   AGE
jobdemo-7qlvv   0/1     Completed   0          76s
jobdemo-g44ht   0/1     Completed   0          82s
jobdemo-r7tv9   0/1     Completed   0          82s
jobdemo-tkwx8   0/1     Completed   0          75s

kubectl describe pods jobdemo-7qlvv | grep Started
      Started:      Thu, 28 Dec 2023 07:26:29 -0800
  Normal  Started  117s   kubelet  Started container simplejob

kubectl describe pods jobdemo-tkwx8 | grep Started
      Started:      Thu, 28 Dec 2023 07:26:30 -0800
  Normal  Started  2m22s  kubelet  Started container simplejob

kubectl describe pods jobdemo-g44ht | grep Started
      Started:      Thu, 28 Dec 2023 07:26:18 -0800
  Normal  Started  2m59s  kubelet  Started container simplejob

kubectl describe pods jobdemo-r7tv9 | grep Started
      Started:      Thu, 28 Dec 2023 07:26:19 -0800
  Normal  Started  3m21s  kubelet  Started container simplejob
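
To make the scheduling easier to picture, here is a minimal Python sketch of how completions and parallelism interact. It is a simplified model, not the Job controller's real code: the function name simulate_job and the pod_seconds parameter are my own, and it assumes every Pod succeeds and takes the same amount of time. With 4 completions and parallelism 2, it gives two Pods starting together and two more starting once the first pair finish, which is the same pattern as the Started timestamps above (07:26:18 and 07:26:19, then 07:26:29 and 07:26:30).

```python
import heapq


def simulate_job(completions, parallelism, pod_seconds):
    """Return a (start, end) pair for each Pod in a simplified Job model.

    Assumptions (mine, for illustration): every Pod succeeds and
    runs for exactly pod_seconds.
    """
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1 in this model")

    running = []   # min-heap holding the end time of each running Pod
    schedule = []  # (start, end) for every Pod, in creation order
    now = 0.0
    started = 0
    while started < completions:
        if len(running) < parallelism:
            # A slot is free: start another Pod right away.
            end = now + pod_seconds
            heapq.heappush(running, end)
            schedule.append((now, end))
            started += 1
        else:
            # All slots are busy: jump ahead to when the next Pod finishes.
            now = heapq.heappop(running)
    return schedule


print(simulate_job(4, 2, 10))
# [(0.0, 10.0), (0.0, 10.0), (10.0, 20.0), (10.0, 20.0)]
```

Setting parallelism to 1 in the same call gives four back-to-back runs, which is the sequential behaviour from the completions example.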

4. BackoffLimit

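Let us assume that the container keeps failing, for example because its command exits with an error. By default, the Job retries a failed Pod up to 6 times (the default backoffLimit) before it marks the Job as failed. If you don't want to wait for six retries and only want a couple of attempts, you can lower the limit with backoffLimit: 2. You can see below that, once the Job reaches the limit, it does not try again: three Pods failed (the first attempt plus two retries). Note that a wrong image tag behaves differently: the Pod sits in ImagePullBackOff instead of failing, so it does not count towards backoffLimit.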

apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  #parallelism: 2
  #completions: 4
  backoffLimit: 2 # Retry at most twice
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["ls", "/nonexistdirectory"] # This directory does not exist, so the command fails
      restartPolicy: Never

kubectl get all
NAME                READY   STATUS   RESTARTS   AGE
pod/jobdemo-4hfl7   0/1     Error    0          3m53s
pod/jobdemo-jkc8t   0/1     Error    0          4m29s
pod/jobdemo-wsmst   0/1     Error    0          4m15s

kubectl describe job jobdemo
Name:           jobdemo
Annotations:    <none>
Parallelism:    1
Completions:    1
Pods Statuses:  0 Active (0 Ready) / 0 Succeeded / 3 Failed
Pod Template:
Events:
  Type     Reason                Age   From            Message
  ----     ------                ----  ----            -------
  Normal   SuccessfulCreate      108s  job-controller  Created pod: jobdemo-jkc8t
  Normal   SuccessfulCreate      94s   job-controller  Created pod: jobdemo-wsmst
  Normal   SuccessfulCreate      72s   job-controller  Created pod: jobdemo-4hfl7
  Warning  BackoffLimitExceeded  67s   job-controller  Job has reached the specified backoff limit
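
The gaps between the SuccessfulCreate events above (roughly 14 and 22 seconds) come from the Job controller's back-off. The Kubernetes documentation describes an exponential delay of 10s, 20s, 40s and so on between retries, capped at six minutes. The small Python sketch below just reproduces that sequence, so you can estimate how long a given backoffLimit might take to exhaust. The function name and its parameters are my own, and the real controller's timing will not match this to the second.

```python
def backoff_delays(backoff_limit, base_seconds=10, cap_seconds=360):
    """Approximate wait before each retry: 10s, 20s, 40s, ... capped at 6 minutes.

    This mirrors the sequence described in the Kubernetes docs; it is an
    illustration, not the controller's actual implementation.
    """
    return [min(base_seconds * 2 ** retry, cap_seconds)
            for retry in range(backoff_limit)]


print(backoff_delays(2))       # [10, 20]
print(sum(backoff_delays(6)))  # 630
```

So with the default backoffLimit of 6, a Job whose Pods fail instantly could still spend around ten minutes just waiting between retries, which is a good reason to lower the limit for quick-failing tasks.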

5. activeDeadlineSeconds

You can also limit how long a Job is allowed to run. If the Job does not finish within activeDeadlineSeconds, Kubernetes terminates its running Pods and marks the Job as failed with the reason DeadlineExceeded. In the example below, even though I set backoffLimit: 4, activeDeadlineSeconds takes precedence. The Job does not get to use all four retries: as soon as it has been active for 15 seconds, it is stopped abruptly.

cat jobsdemo.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  #parallelism: 2
  #completions: 4
  backoffLimit: 4
  activeDeadlineSeconds: 15
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["ls", "/nonexistdirectory"]
      restartPolicy: Never

kubectl describe job jobdemo
Parallelism:              1
Completions:              1
Completion Mode:          NonIndexed
Start Time:               Thu, 28 Dec 2023 08:24:36 -0800
Active Deadline Seconds:  15s
Pods Statuses:            0 Active (0 Ready) / 0 Succeeded / 2 Failed   # Note: the deadline hit before all 4 retries were used
Events:
  Type     Reason            Age  From            Message
  ----     ------            ---- ----            -------
  Normal   SuccessfulCreate  44s  job-controller  Created pod: jobdemo-lct5w
  Normal   SuccessfulCreate  31s  job-controller  Created pod: jobdemo-bzbcd
  Normal   SuccessfulDelete  29s  job-controller  Deleted pod: jobdemo-bzbcd
  Warning  DeadlineExceeded  29s  job-controller  Job was active longer than specified deadline
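
To see why only two Pods were created here, it helps to model both limits together. The Python sketch below is a rough model under assumptions of my own: every Pod fails after pod_seconds (the 3 seconds I pass in is a guess, not a measured value), retries wait 10s, 20s, 40s and so on, and the function name job_outcome is hypothetical. It reports which limit is hit first and how many Pods had been created by then.

```python
def job_outcome(backoff_limit, active_deadline_seconds, pod_seconds,
                base_seconds=10, cap_seconds=360):
    """Model a Job whose Pods always fail; return (reason, pods_created).

    Simplified on purpose: one Pod at a time, a fixed run time per Pod,
    and an exponential wait between retries.
    """
    now = 0.0
    pods_created = 0
    failures = 0
    while True:
        pods_created += 1
        now += pod_seconds  # the Pod runs, then fails
        if now >= active_deadline_seconds:
            return "DeadlineExceeded", pods_created
        failures += 1
        if failures > backoff_limit:
            return "BackoffLimitExceeded", pods_created
        # Wait before the next retry: 10s, 20s, 40s, ... up to the cap.
        now += min(base_seconds * 2 ** (failures - 1), cap_seconds)
        if now >= active_deadline_seconds:
            return "DeadlineExceeded", pods_created


print(job_outcome(4, 15, 3))             # ('DeadlineExceeded', 2)
print(job_outcome(2, float("inf"), 3))   # ('BackoffLimitExceeded', 3)
```

The first call uses the settings from this section and stops at two Pods because of the deadline. The second call has no deadline and backoffLimit 2, and ends with three failed Pods, like the backoffLimit example earlier.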

6. ttlSecondsAfterFinished

With ttlSecondsAfterFinished, a finished Job is deleted automatically once the given number of seconds has passed since it finished. In the example below, the Job and its Pods are deleted 30 seconds after the Job finishes, irrespective of whether it completed or failed.

cat jobsdemo.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: jobdemo
spec:
  ttlSecondsAfterFinished: 30 # The Job jobdemo is deleted automatically 30 seconds after it finishes
  template:
    spec:
      containers:
      - name: simplejob
        image: busybox
        command: ["echo", "Welcome to my Blog"]
        # command: ["ls", "/nonexistdirectory"]
      restartPolicy: Never
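
As a small worked example of the timing, the Python sketch below computes when a finished Job becomes eligible for cleanup: its finish time plus ttlSecondsAfterFinished. The function name and the sample timestamp are made up for illustration, and the actual deletion can happen a little after this moment, whenever the cluster gets around to cleaning up the Job.

```python
from datetime import datetime, timedelta, timezone


def cleanup_eligible_at(finished_at, ttl_seconds_after_finished):
    """Return the earliest time at which the finished Job may be deleted."""
    return finished_at + timedelta(seconds=ttl_seconds_after_finished)


# Hypothetical finish time, only for the example.
finished = datetime(2023, 12, 28, 16, 30, 0, tzinfo=timezone.utc)
print(cleanup_eligible_at(finished, 30).isoformat())
# 2023-12-28T16:30:30+00:00
```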