Blog: Introducing Indexed Jobs

Author: Aldo Culquicondor (Google)

Once you have containerized a non-parallel Job,
it is quite easy to get it up and running on Kubernetes without modifications to
the binary. In most cases, when running parallel distributed Jobs, you had
to set a separate system to partition the work among the workers. For
example, you could set up a task queue to assign one work item to each
Pod
or multiple items
to each Pod until the queue is emptied
.

The Kubernetes 1.21 release introduces a new field to control Job completion mode,
a configuration option that allows you to control how Pod completions affect the
overall progress of a Job, with two possible options (for now):

  • NonIndexed (default): the Job is considered complete when there has been
    a number of successfully completed Pods equal to the specified number in
    .spec.completions. In other words, each Pod completion is homologous to
    each other. Any Job you might have created before the introduction of
    completion modes is implicitly NonIndexed.
  • Indexed: the Job is considered complete when there is one successfully
    completed Pod associated with each index from 0 to .spec.completions-1. The
    index is exposed to each Pod in the batch.kubernetes.io/job-completion-index
    annotation and the JOB_COMPLETION_INDEX environment variable.

You can start using Jobs with Indexed completion mode, or Indexed Jobs, for
short, to easily start parallel Jobs. Then, each worker Pod can have a statically
assigned partition of the data based on the index. This saves you from having to
set up a queuing system or even having to modify your binary!

Creating an Indexed Job

To create an Indexed Job, you just have to add completionMode: Indexed to the
Job spec and make use of the JOB_COMPLETION_INDEX environment variable.

apiVersion: batch/v1
kind: Job
metadata:
 name: 'sample-job'
spec:
 completions: 3
 parallelism: 3
 completionMode: Indexed
 template:
 spec:
 restartPolicy: Never
 containers:
 - command:
 - 'bash'
 - '-c'
 - 'echo "My partition: ${JOB_COMPLETION_INDEX}"'
 image: 'docker.io/library/bash'
 name: 'sample-load'

Note that completion mode is an alpha feature in the 1.21 release. To be able to
use it in your cluster, make sure to enable the IndexedJob feature
gate
on the
API server and
the controller manager.

When you run the example, you will see that each of the three created Pods gets a
different completion index. For the user’s convenience, the control plane sets the
JOB_COMPLETION_INDEX environment variable, but you can choose to set your
own

or expose the index as a file.

See Indexed Job for parallel processing with static work
assignment
for a
step-by-step guide, and a few more examples.

Future plans

SIG Apps envisions that there might be more completion modes that enable more
use cases for the Job API. We welcome you to open issues in
kubernetes/kubernetes with your
suggestions.

In particular, we are considering an IndexedAndUnique mode where the indexes
are not just available as annotation, but they are part of the Pod names,
similar to StatefulSet.
This should facilitate inter-Pod communication for tightly coupled Pods.
You can join the discussion in the open issue.

Wrap-up

Indexed Jobs allows you to statically partition work among the workers of your
parallel Jobs. SIG Apps hopes that this feature facilitates the migration of
more batch workloads to Kubernetes.

Originally posted on Kubernetes – Production-Grade Container Orchestration
Author:

Deja una respuesta

Tu dirección de correo electrónico no será publicada. Los campos obligatorios están marcados con *