DIY: Create Your Own Cloud with Kubernetes (Part 3)

Author: Andrei Kvapil (Ænix)

Approaching the most interesting phase, this article delves into running Kubernetes within
Kubernetes. Technologies such as Kamaji and Cluster API are highlighted, along with their
integration with KubeVirt.

Previous discussions have covered
preparing Kubernetes on bare metal
and
how to turn Kubernetes into virtual machines management system.
This article concludes the series by explaining how, using all of the above, you can build a
full-fledged managed Kubernetes and run virtual Kubernetes clusters with just a click.

First up, let’s dive into the Cluster API.

Cluster API

Cluster API is an extension for Kubernetes that allows the management of Kubernetes clusters as
custom resources within another Kubernetes cluster.

The main goal of the Cluster API is to provide a unified interface for describing the basic
entities of a Kubernetes cluster and managing their lifecycle. This enables the automation of
processes for creating, updating, and deleting clusters, simplifying scaling, and infrastructure
management.

Within the context of Cluster API, there are two terms: management cluster and
tenant clusters.

Management cluster is a Kubernetes cluster used to deploy and manage other clusters.
This cluster contains all the necessary Cluster API components and is responsible for describing,
creating, and updating tenant clusters. It is often used just for this purpose.
Tenant clusters are the user clusters or clusters deployed using the Cluster API. They are
created by describing the relevant resources in the management cluster. They are then used for
deploying applications and services by end-users.

It’s important to understand that physically, tenant clusters do not necessarily have to run on
the same infrastructure with the management cluster; more often, they are running elsewhere.

A diagram showing interaction of management Kubernetes cluster and tenant Kubernetes clusters using Cluster API

For its operation, Cluster API utilizes the concept of providers which are separate controllers
responsible for specific components of the cluster being created. Within Cluster API, there are
several types of providers. The major ones are:

Infrastructure Provider, which is responsible for providing the computing infrastructure, such as virtual machines or physical servers.
Control Plane Provider, which provides the Kubernetes control plane, namely the components kube-apiserver, kube-scheduler, and kube-controller-manager.
Bootstrap Provider, which is used for generating cloud-init configuration for the virtual machines and servers being created.

To get started, you will need to install the Cluster API itself and one provider of each type.
You can find a complete list of supported providers in the project’s
documentation.

For installation, you can use the clusterctl utility, or
Cluster API Operator
as the more declarative method.

Choosing providers

Infrastructure provider

To run Kubernetes clusters using KubeVirt, the
KubeVirt Infrastructure Provider
must be installed.
It enables the deployment of virtual machines for worker nodes in the same management cluster, where
the Cluster API operates.

Control plane provider

The Kamaji project offers a ready solution for running the
Kubernetes control plane for tenant clusters as containers within the management cluster.
This approach has several significant advantages:

Cost-effectiveness: Running the control plane in containers avoids the use of separate control
plane nodes for each cluster, thereby significantly reducing infrastructure costs.
Stability: Simplifying architecture by eliminating complex multi-layered deployment schemes.
Instead of sequentially launching a virtual machine and then installing etcd and Kubernetes components
inside it, there’s a simple control plane that is deployed and run as a regular application inside
Kubernetes and managed by an operator.
Security: The cluster’s control plane is hidden from the end user, reducing the possibility
of its components being compromised, and also eliminates user access to the cluster’s certificate
store. This approach to organizing a control plane invisible to the user is often used by cloud providers.

Bootstrap provider

Kubeadm as the Bootstrap
Provider – as the standard method for preparing clusters in Cluster API. This provider is developed
as part of the Cluster API itself. It requires only a prepared system image with kubelet and kubeadm
installed and allows generating configs in the cloud-init and ignition formats.

It’s worth noting that Talos Linux also supports provisioning via the Cluster API and
has
providers for this.
Although previous articles
discussed using Talos Linux to set up a management cluster on bare-metal nodes, to provision tenant
clusters the Kamaji+Kubeadm approach has more advantages.
It facilitates the deployment of Kubernetes control planes in containers, thus removing the need for
separate virtual machines for control plane instances. This simplifies the management and reduces costs.

How it works

The primary object in Cluster API is the Cluster resource, which acts as the parent for all the others.
Typically, this resource references two others: a resource describing the control plane and a
resource describing the infrastructure, each managed by a separate provider.

Unlike the Cluster, these two resources are not standardized, and their kind depends on the specific
provider you are using:

A diagram showing the relationship of a Cluster resource and the resources it links to in Cluster API

Within Cluster API, there is also a resource named MachineDeployment, which describes a group of nodes,
whether they are physical servers or virtual machines. This resource functions similarly to standard
Kubernetes resources such as Deployment, ReplicaSet, and Pod, providing a mechanism for the
declarative description of a group of nodes and automatic scaling.

In other words, the MachineDeployment resource allows you to declaratively describe nodes for your
cluster, automating their creation, deletion, and updating according to specified parameters and
the requested number of replicas.

A diagram showing the relationship of a Cluster resource and its children in Cluster API — A diagram showing the relationship of a MachineDeployment resource and its children in Cluster API

To create machines, MachineDeployment refers to a template for generating the machine itself and a
template for generating its cloud-init config:

To deploy a new Kubernetes cluster using Cluster API, you will need to prepare the following set of resources:

A general Cluster resource
A KamajiControlPlane resource, responsible for the control plane operated by Kamaji
A KubevirtCluster resource, describing the cluster configuration in KubeVirt
A KubevirtMachineTemplate resource, responsible for the virtual machine template
A KubeadmConfigTemplate resource, responsible for generating tokens and cloud-init
At least one MachineDeployment to create some workers

Polishing the cluster

In most cases, this is sufficient, but depending on the providers used, you may need other resources
as well. You can find examples of the resources created for each type of provider in the
Kamaji project documentation.

At this stage, you already have a ready tenant Kubernetes cluster, but so far, it contains nothing
but API workers and a few core plugins that are standardly included in the installation of any
Kubernetes cluster: kube-proxy and CoreDNS. For full integration, you will need to install
several more components:

To install additional components, you can use a separate
Cluster API Add-on Provider for Helm,
or the same FluxCD discussed in
previous articles.

When creating resources in FluxCD, it’s possible to specify the target cluster by referring to the
kubeconfig generated by Cluster API. Then, the installation will be performed directly into it.
Thus, FluxCD becomes a universal tool for managing resources both in the management cluster and
in the user tenant clusters.

A diagram showing the interaction scheme of fluxcd, which can install components in both management and tenant Kubernetes clusters

What components are being discussed here? Generally, the set includes the following:

CNI Plugin

To ensure communication between pods in a tenant Kubernetes cluster, it’s necessary to deploy a
CNI plugin. This plugin creates a virtual network that allows pods to interact with each other
and is traditionally deployed as a Daemonset on the cluster’s worker nodes. You can choose and
install any CNI plugin that you find suitable.

Cloud Controller Manager

The main task of the Cloud Controller Manager (CCM) is to integrate Kubernetes with the cloud
infrastructure provider’s environment (in your case, it is the management Kubernetes cluster
in which all worksers of tenant Kubernetes are provisioned). Here are some tasks it performs:

When a service of type LoadBalancer is created, the CCM initiates the process of creating a cloud load balancer, which directs traffic to your Kubernetes cluster.
If a node is removed from the cloud infrastructure, the CCM ensures its removal from your cluster as well, maintaining the cluster’s current state.
When using the CCM, nodes are added to the cluster with a special taint, node.cloudprovider.kubernetes.io/uninitialized,
which allows for the processing of additional business logic if necessary. After successful initialization, this taint is removed from the node.

Depending on the cloud provider, the CCM can operate both inside and outside the tenant cluster.

The KubeVirt Cloud Provider is designed
to be installed in the external parent management cluster. Thus, creating services of type
LoadBalancer in the tenant cluster initiates the creation of LoadBalancer services in the parent
cluster, which direct traffic into the tenant cluster.

A diagram showing a Cloud Controller Manager installed outside of a tenant Kubernetes cluster on a scheme of nested Kubernetes clusters and the mapping of services it manages from the parent to the child Kubernetes cluster

CSI Driver

The Container Storage Interface (CSI) is divided into two main parts for interacting with storage
in Kubernetes:

csi-controller: This component is responsible for interacting with the cloud provider’s API
to create, delete, attach, detach, and resize volumes.
csi-node: This component runs on each node and facilitates the mounting of volumes to pods
as requested by kubelet.

In the context of using the KubeVirt CSI Driver, a unique
opportunity arises. Since virtual machines in KubeVirt runs within the management Kubernetes cluster,
where a full-fledged Kubernetes API is available, this opens the path for running the csi-controller
outside of the user’s tenant cluster. This approach is popular in the KubeVirt community and offers
several key advantages:

Security: This method hides the internal cloud API from the end-user, providing access to
resources exclusively through the Kubernetes interface. Thus, it reduces the risk of direct access
to the management cluster from user clusters.
Simplicity and Convenience: Users don’t need to manage additional controllers in their clusters,
simplifying the architecture and reducing the management burden.

However, the CSI-node must necessarily run inside the tenant cluster, as it directly interacts with
kubelet on each node. This component is responsible for the mounting and unmounting of volumes into pods,
requiring close integration with processes occurring directly on the cluster nodes.

The KubeVirt CSI Driver acts as a proxy for ordering volumes. When a PVC is created inside the tenant
cluster, a PVC is created in the management cluster, and then the created PV is connected to the
virtual machine.

A diagram showing a CSI plugin components installed on both inside and outside of a tenant Kubernetes cluster on a scheme of nested Kubernetes clusters and the mapping of persistent volumes it manages from the parent to the child Kubernetes cluster

Cluster Autoscaler

The Cluster Autoscaler is a versatile component that
can work with various cloud APIs, and its integration with Cluster-API is just one of the available
functions. For proper configuration, it requires access to two clusters: the tenant cluster, to
track pods and determine the need for adding new nodes, and the managing Kubernetes cluster
(management kubernetes cluster), where it interacts with the MachineDeployment resource and adjusts
the number of replicas.

Although Cluster Autoscaler usually runs inside the tenant Kubernetes cluster, in this situation,
it is suggested to install it outside for the same reasons described before. This approach is
simpler to maintain and more secure as it prevents users of tenant clusters from accessing the
management API of the management cluster.

A diagram showing a Cloud Controller Manager installed outside of a tenant Kubernetes cluster on a scheme of nested Kubernetes clusters — A diagram showing a Cluster Autoscaler installed outside of a tenant Kubernetes cluster on a scheme of nested Kubernetes clusters

Konnectivity

There’s another additional component I’d like to mention –
Konnectivity.
You will likely need it later on to get webhooks and the API aggregation layer working in your
tenant Kubernetes cluster. This topic is covered in detail in one of my
previous article.

Unlike the components presented above, Kamaji allows you to easily enable Konnectivity and manage
it as one of the core components of your tenant cluster, alongside kube-proxy and CoreDNS.

Conclusion

Now you have a fully functional Kubernetes cluster with the capability for dynamic scaling, automatic
provisioning of volumes, and load balancers.

Going forward, you might consider metrics and logs collection from your tenant clusters, but that
goes beyond the scope of this article.

Of course, all the components necessary for deploying a Kubernetes cluster can be packaged into a
single Helm chart and deployed as a unified application. This is precisely how we organize the
deployment of managed Kubernetes clusters with the click of a button on our open PaaS platform,
Cozystack, where you can try all the technologies described in the article
for free.

Originally posted on Kubernetes Blog
Author: