Blog: Writing a Controller for Pod Labels

Authors: Arthur Busser (Padok)

Operators are proving to be an excellent solution for
running stateful distributed applications in Kubernetes. Open source tools like
the Operator SDK provide ways to build reliable and maintainable
operators, making it easier to extend Kubernetes and implement custom
scheduling.

Kubernetes operators run complex software inside your cluster. The open source
community has already built many operators for distributed
applications like Prometheus, Elasticsearch, or Argo CD. Even outside of
open source, operators can help to bring new functionality to your Kubernetes
cluster.

An operator is a set of custom resources and a
set of controllers. A controller watches for changes to specific
resources in the Kubernetes API and reacts by creating, updating, or deleting
resources.

The Operator SDK is best suited for building fully-featured operators.
Nonetheless, you can use it to write a single controller. This post will walk
you through writing a Kubernetes controller in Go that will add a pod-name
label to pods that have a specific annotation.

Why do we need a controller for this?

I recently worked on a project where we needed to create a Service that routed
traffic to a specific Pod in a ReplicaSet. The problem is that a Service can
only select pods by label, and all pods in a ReplicaSet have the same labels.
There are two ways to solve this problem:

  1. Create a Service without a selector and manage the Endpoints or
    EndpointSlices for that Service directly. We would need to write a custom
    controller to insert our Pod’s IP address into those resources.
  2. Add a label to the Pod with a unique value. We could then use this label in
    our Service’s selector. Again, we would need to write a custom controller to
    add this label.

A controller is a control loop that tracks one or more Kubernetes resource
types. The controller from option 2 above only needs to track pods, which
makes it simpler to implement. This is the option we are going to walk through
by writing a Kubernetes controller that adds a pod-name label to our pods.
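To make option 2 concrete, here is a sketch of the kind of Service we have in
mind. The Pod name is hypothetical, and the padok.fr/pod-name label is the one
our controller will add later in this post:

apiVersion: v1
kind: Service
metadata:
  name: my-app-primary
spec:
  selector:
    # Matches exactly one Pod, because the label's value is unique.
    padok.fr/pod-name: my-app-8d569f5f-abcde
  ports:
    - port: 80
      targetPort: 8080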

StatefulSets do this natively by adding a
statefulset.kubernetes.io/pod-name label to each Pod in the set. But what if we
don’t want to or can’t
use StatefulSets?

We rarely create pods directly; most often, we use a Deployment, ReplicaSet, or
another high-level resource. We can specify labels to add to each Pod in the
PodSpec, but not with dynamic values, so there is no way to replicate a
StatefulSet’s pod-name label.
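For instance, labels in a Deployment’s Pod template are fixed strings (all
names here are hypothetical):

# Excerpt from a Deployment manifest.
spec:
  template:
    metadata:
      labels:
        app: my-app # Static: every Pod in the ReplicaSet gets this exact value.
        # No field here can expand to the Pod's own generated name.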

We tried using a mutating admission webhook. When
anyone creates a Pod, the webhook patches the Pod with a label containing the
Pod’s name. Disappointingly, this does not work: not all pods have a name before
being created. For instance, when the ReplicaSet controller creates a Pod, it
sends a generateName (a name prefix) to the Kubernetes API server and not a name. The API
server generates a unique name before persisting the new Pod to etcd, but only
after calling our admission webhook. So in most cases, we can’t know a Pod’s
name with a mutating webhook.
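To illustrate, assuming a hypothetical ReplicaSet named my-app-8d569f5f, the
metadata of a new Pod looks roughly like this when it reaches our webhook:

metadata:
  generateName: my-app-8d569f5f-  # No name yet; the API server appends a random suffix later.
  namespace: default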

Once a Pod exists in the Kubernetes API, it is mostly immutable, but we can
still add a label. We can even do so from the command line:

kubectl label pod my-pod my-label-key=my-label-value

We need to watch for changes to any pods in the Kubernetes API and add the label
we want. Rather than do this manually, we are going to write a controller that
does it for us.

Bootstrapping a controller with the Operator SDK

A controller is a reconciliation loop that reads the desired state of a resource
from the Kubernetes API and takes action to bring the cluster’s actual state
closer to the desired state.

In order to write this controller as quickly as possible, we are going to use
the Operator SDK. If you don’t have it installed, follow the
official documentation.

$ operator-sdk version
operator-sdk version: "v1.4.2", commit: "4b083393be65589358b3e0416573df04f4ae8d9b", kubernetes version: "v1.19.4", go version: "go1.15.8", GOOS: "darwin", GOARCH: "amd64"

Let’s create a new directory to write our controller in:

mkdir label-operator && cd label-operator

Next, let’s initialize a new operator, to which we will add a single controller.
To do this, you will need to specify a domain and a repository. The domain
serves as a prefix for the group your custom Kubernetes resources will belong
to. Because we are not going to be defining custom resources, the domain does
not matter. The repository is the name of the Go module we are going to write.
By convention, this is the repository where you will store your code.

As an example, here is the command I ran:

# Feel free to change the domain and repo values.
operator-sdk init --domain=padok.fr --repo=github.com/busser/label-operator

Next, we need to create a new controller. This controller will handle pods and
not a custom resource, so no need to generate the resource code. Let’s run this
command to scaffold the code we need:

operator-sdk create api --group=core --version=v1 --kind=Pod --controller=true --resource=false

We now have a new file: controllers/pod_controller.go. This file contains a
PodReconciler type with two methods that we need to implement. The first is
Reconcile, and it looks like this for now:

func (r *PodReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    _ = r.Log.WithValues("pod", req.NamespacedName)

    // your logic here

    return ctrl.Result{}, nil
}

The Reconcile method is called whenever a Pod is created, updated, or deleted.
The name and namespace of the Pod are in the ctrl.Request the method receives
as a parameter.
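For reference, ctrl.Request is an alias for controller-runtime’s
reconcile.Request, a thin wrapper around the object’s namespaced name:

// From sigs.k8s.io/controller-runtime/pkg/reconcile:
type Request struct {
    // The name and namespace of the object to reconcile.
    types.NamespacedName
}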

The second method is SetupWithManager and for now it looks like this:

func (r *PodReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        // Uncomment the following line adding a pointer to an instance of the controlled resource as an argument
        // For().
        Complete(r)
}

The SetupWithManager method is called when the operator starts. It serves to
tell the operator framework what types our PodReconciler needs to watch. To
use the same Pod type used by Kubernetes internally, we need to import some of
its code. All of the Kubernetes source code is open source, so you can import
any part you like in your own Go code. You can find a complete list of available
packages in the Kubernetes source code or on pkg.go.dev. To
use pods, we need the k8s.io/api/core/v1 package.

package controllers

import (
    // other imports...
    corev1 "k8s.io/api/core/v1"
    // other imports...
)

Let’s use the Pod type in SetupWithManager to tell the operator framework we
want to watch pods:

func (r *PodReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&corev1.Pod{}).
        Complete(r)
}

Before moving on, we should set the RBAC permissions our controller needs. Above
the Reconcile method, we have some default permissions:

// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=pods/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=core,resources=pods/finalizers,verbs=update

We don’t need all of those. Our controller will never interact with a Pod’s
status or its finalizers. It only needs to read and update pods. Let’s remove the
unnecessary permissions and keep only what we need:

// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;update;patch

We are now ready to write our controller’s reconciliation logic.

Implementing reconciliation

Here is what we want our Reconcile method to do:

  1. Use the Pod’s name and namespace from the ctrl.Request to fetch the Pod
    from the Kubernetes API.
  2. If the Pod has an add-pod-name-label annotation, add a pod-name label to
    the Pod; if the annotation is missing, don’t add the label.
  3. Update the Pod in the Kubernetes API to persist the changes made.

Let’s define some constants for the annotation and label:

const (
    addPodNameLabelAnnotation = "padok.fr/add-pod-name-label"
    podNameLabel              = "padok.fr/pod-name"
)

The first step in our reconciliation function is to fetch the Pod we are working
on from the Kubernetes API:

// Reconcile handles a reconciliation request for a Pod.
// If the Pod has the addPodNameLabelAnnotation annotation, then Reconcile
// will make sure the podNameLabel label is present with the correct value.
// If the annotation is absent, then Reconcile will make sure the label is too.
func (r *PodReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    log := r.Log.WithValues("pod", req.NamespacedName)

    /*
        Step 0: Fetch the Pod from the Kubernetes API.
    */

    var pod corev1.Pod
    if err := r.Get(ctx, req.NamespacedName, &pod); err != nil {
        log.Error(err, "unable to fetch Pod")
        return ctrl.Result{}, err
    }

    return ctrl.Result{}, nil
}

Our Reconcile method will be called when a Pod is created, updated, or
deleted. In the deletion case, our call to r.Get will return a specific error.
Let’s import the package that defines this error:

package controllers

import (
    // other imports...
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    // other imports...
)

We can now handle this specific error and — since our controller does not care
about deleted pods — explicitly ignore it:

    /*
        Step 0: Fetch the Pod from the Kubernetes API.
    */

    var pod corev1.Pod
    if err := r.Get(ctx, req.NamespacedName, &pod); err != nil {
        if apierrors.IsNotFound(err) {
            // we'll ignore not-found errors, since we can get them on deleted requests.
            return ctrl.Result{}, nil
        }
        log.Error(err, "unable to fetch Pod")
        return ctrl.Result{}, err
    }

Next, let’s edit our Pod so that our dynamic label is present if and only if our
annotation is present:

    /*
        Step 1: Add or remove the label.
    */

    labelShouldBePresent := pod.Annotations[addPodNameLabelAnnotation] == "true"
    labelIsPresent := pod.Labels[podNameLabel] == pod.Name

    if labelShouldBePresent == labelIsPresent {
        // The desired state and actual state of the Pod are the same.
        // No further action is required by the operator at this moment.
        log.Info("no update required")
        return ctrl.Result{}, nil
    }

    if labelShouldBePresent {
        // If the label should be set but is not, set it.
        if pod.Labels == nil {
            pod.Labels = make(map[string]string)
        }
        pod.Labels[podNameLabel] = pod.Name
        log.Info("adding label")
    } else {
        // If the label should not be set but is, remove it.
        delete(pod.Labels, podNameLabel)
        log.Info("removing label")
    }

Finally, let’s push our updated Pod to the Kubernetes API:

    /*
        Step 2: Update the Pod in the Kubernetes API.
    */

    if err := r.Update(ctx, &pod); err != nil {
        log.Error(err, "unable to update Pod")
        return ctrl.Result{}, err
    }

When writing our updated Pod to the Kubernetes API, there is a risk that the Pod
has been updated or deleted since we first read it. When writing a Kubernetes
controller, we should keep in mind that we are not the only actors in the
cluster. When this happens, the best thing to do is to start the reconciliation
from scratch by requeuing the event. Let’s do exactly that:

    /*
        Step 2: Update the Pod in the Kubernetes API.
    */

    if err := r.Update(ctx, &pod); err != nil {
        if apierrors.IsConflict(err) {
            // The Pod has been updated since we read it.
            // Requeue the Pod to try to reconcile again.
            return ctrl.Result{Requeue: true}, nil
        }
        if apierrors.IsNotFound(err) {
            // The Pod has been deleted since we read it.
            // Requeue the Pod to try to reconcile again.
            return ctrl.Result{Requeue: true}, nil
        }
        log.Error(err, "unable to update Pod")
        return ctrl.Result{}, err
    }

Let’s remember to return successfully at the end of the method:

    return ctrl.Result{}, nil
}

And that’s it! We are now ready to run the controller on our cluster.

Run the controller on your cluster

To run our controller, we need to run the operator on a cluster. For that,
all you will need is kubectl. If you don’t have a Kubernetes cluster at hand,
I recommend you start one locally with KinD (Kubernetes in Docker).
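Assuming you have KinD installed, starting a local cluster takes a single
command:

kind create cluster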

All it takes to run the operator from your machine is this command:

make run

After a few seconds, you should see the operator’s logs. Notice that our
controller’s Reconcile method was called for all pods already running in the
cluster.

Let’s keep the operator running and, in another terminal, create a new Pod:

kubectl run --image=nginx my-nginx

The operator should quickly print some logs, indicating that it reacted to the
Pod’s creation and subsequent changes in status:

INFO    controllers.Pod no update required  {"pod": "default/my-nginx"}
INFO    controllers.Pod no update required  {"pod": "default/my-nginx"}
INFO    controllers.Pod no update required  {"pod": "default/my-nginx"}
INFO    controllers.Pod no update required  {"pod": "default/my-nginx"}

Let’s check the Pod’s labels:

$ kubectl get pod my-nginx --show-labels
NAME       READY   STATUS    RESTARTS   AGE   LABELS
my-nginx   1/1     Running   0          11m   run=my-nginx

Let’s add an annotation to the Pod so that our controller knows to add our
dynamic label to it:

kubectl annotate pod my-nginx padok.fr/add-pod-name-label=true

Notice that the controller immediately reacted and produced a new line in its
logs:

INFO    controllers.Pod adding label    {"pod": "default/my-nginx"}
$ kubectl get pod my-nginx --show-labels
NAME       READY   STATUS    RESTARTS   AGE   LABELS
my-nginx   1/1     Running   0          13m   padok.fr/pod-name=my-nginx,run=my-nginx

Bravo! You just successfully wrote a Kubernetes controller capable of adding
labels with dynamic values to resources in your cluster.

Controllers and operators, both big and small, can be an important part of your
Kubernetes journey. Writing operators is easier now than it has ever been. The
possibilities are endless.

What next?

If you want to go further, I recommend starting by deploying your controller or
operator inside a cluster. The Makefile generated by the Operator SDK will do
most of the work.
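As a sketch, with the default Makefile targets the Operator SDK generates,
deployment looks something like this (the image name is a placeholder for a
registry you can push to):

make docker-build docker-push IMG=<your-registry>/label-operator:latest
make deploy IMG=<your-registry>/label-operator:latest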

When deploying an operator to production, it is always a good idea to implement
robust testing. The first step in that direction is to write unit tests.
The Operator SDK documentation will guide you in writing tests for your
operator. I wrote tests for the operator we just built; you can find all of my
code in the github.com/busser/label-operator repository.
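If you would rather start from a minimal example, here is a sketch of a unit
test using controller-runtime’s fake client. The official docs lean on envtest,
which runs a real API server; the fake client is lighter but less faithful.

package controllers

import (
    "context"
    "testing"

    corev1 "k8s.io/api/core/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/types"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client/fake"
)

func TestReconcileAddsPodNameLabel(t *testing.T) {
    // A Pod that opted in to the dynamic label via the annotation.
    pod := &corev1.Pod{
        ObjectMeta: metav1.ObjectMeta{
            Name:        "my-pod",
            Namespace:   "default",
            Annotations: map[string]string{addPodNameLabelAnnotation: "true"},
        },
    }

    // The fake client serves reads and writes from memory.
    fakeClient := fake.NewClientBuilder().WithObjects(pod).Build()
    reconciler := &PodReconciler{Client: fakeClient, Log: ctrl.Log}

    req := ctrl.Request{NamespacedName: types.NamespacedName{Namespace: "default", Name: "my-pod"}}
    if _, err := reconciler.Reconcile(context.Background(), req); err != nil {
        t.Fatalf("Reconcile returned an error: %v", err)
    }

    // The label should now be present with the Pod's name as its value.
    var updated corev1.Pod
    if err := fakeClient.Get(context.Background(), req.NamespacedName, &updated); err != nil {
        t.Fatalf("fetching Pod: %v", err)
    }
    if got := updated.Labels[podNameLabel]; got != "my-pod" {
        t.Errorf("label %s = %q, want %q", podNameLabel, got, "my-pod")
    }
}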

How to learn more?

The Operator SDK documentation goes into detail on how you
can go further and implement more complex operators.

When modeling a more complex use-case, a single controller acting on built-in
Kubernetes types may not be enough. You may need to build a more complex
operator with Custom Resource Definitions (CRDs)
and multiple controllers. The Operator SDK is a great tool to help you do this.

If you want to discuss building an operator, join the #kubernetes-operator
channel in the Kubernetes Slack workspace!
