Kubernetes Operators with Ansible Deep Dive: Part 1


Deploying applications on Red Hat OpenShift or Kubernetes has come a long way. These days, it’s relatively easy to use OpenShift’s GUI or something like Helm to deploy applications with minimal effort. Unfortunately, these tools don’t typically address the needs of operations teams tasked with maintaining the health or scalability of the application – especially if the deployed application is something stateful like a database. This is where Operators come in.

An Operator is a method of packaging, deploying and managing a Kubernetes application.  Kubernetes Operators with Ansible exists to help you encode the operational knowledge of your application in Ansible.

What can we do with Ansible in a Kubernetes Operator? Because Ansible is now part of the Operator SDK, anything an Operator can do should also be possible with Ansible. It’s now possible to write an Operator as an Ansible Playbook or Role to manage components in Kubernetes clusters. In this blog, we’re going to dive into an example Operator.

Building a Red Hat OpenShift cluster in AWS for Galera

First, we create our cluster. If you want to get up and running fast, something like Minishift or Minikube works great. However, we’re going to be performing a load test against our Galera cluster later, so we’ll need a cluster that can scale beyond a single machine.

If you are an existing Red Hat customer, another option is spinning up an OpenShift cluster through cloud.redhat.com. This SaaS portal makes trying OpenShift a turnkey operation.

For our testing later, we’ll create an OpenShift OKD cluster on AWS with 5 compute nodes. The installation has a few steps:

  1. Deploy the environment with a CloudFormation template.
  2. Run some pre-installation tasks, which will help ensure the environment can run the OpenShift installer without any problems.
  3. Run the OpenShift installation playbooks.
  4. Run a custom post-installation playbook, which configures a few extra things for us and tweaks the running OpenShift cluster to prepare it for our deployment.
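
The four steps above can be sketched as a small dry-run wrapper. This is a hypothetical helper, not part of the repository: it only prints the commands in order, using the playbook names and flags from the commands shown later in this post. The KEY and OPENSHIFT_ANSIBLE variables are placeholders you would adjust.

```shell
#!/bin/sh
# Hypothetical dry-run wrapper: prints the installation commands in order.
# KEY and OPENSHIFT_ANSIBLE are placeholders, not values from the repository.
KEY="${KEY:-jimi.pem}"
OPENSHIFT_ANSIBLE="${OPENSHIFT_ANSIBLE:-/path/to/openshift-ansible}"

install_steps() {
  # 1. Deploy the environment with the CloudFormation template.
  echo "ansible-playbook -vv deploy_cloudformation.yml"
  # 2. Pre-installation tasks (file name as written in the command in this post).
  echo "ansible-playbook -vv -i inventory/ --private-key=$KEY prepare.yaml"
  # 3. OpenShift installation playbooks.
  echo "ansible-playbook -vv -i inventory/ --private-key=$KEY $OPENSHIFT_ANSIBLE/playbooks/prerequisites.yml"
  echo "ansible-playbook -vv -i inventory/ --private-key=$KEY $OPENSHIFT_ANSIBLE/playbooks/deploy_cluster.yml"
  # 4. Custom post-installation playbook.
  echo "ansible-playbook -vv -i inventory/ --private-key=$KEY post_install.yml"
}

install_steps
```

Printing first and running each command by hand keeps the order visible and makes it easy to stop if one playbook fails.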

If you’re not familiar with the OpenShift installation process, you can go here to learn more: learn.openshift.com

All of the custom playbooks and other files we’re going to use can be found here: github.com/water-hole/galera-ansible-operator

Deploying the environment

The files to do this are all in the deploy/aws directory of the above repository, along with a couple of helper scripts.

We start by creating an environment in AWS using the provided CloudFormation template. I chose to use CloudFormation here to simplify the process for those who may not be familiar with AWS environments. Before starting, you will need an AWS account as well as the Python boto3 library installed. Follow boto3’s instructions for installation.

Once that’s done, we can start deploying the environment with Ansible:


  $ ansible-playbook -vv deploy_cloudformation.yml 
  ...
  TASK [create a cloudformation stack] *********************************
  task path: /home/jamesc/aws_openshift/deploy_cloudformation.yml:4
  changed: [localhost] => {"changed": true, "events": ... }
  
  PLAY RECAP ***********************************************************
  localhost                  : ok=1    changed=1    unreachable=0    failed=0   

Pre-installation tasks

For our inventory, rather than use the flat format that you may normally see with OpenShift installs, we’re going to do things a bit more dynamically. In the aws directory, we have an inventory/ directory with two files in it:


  $ ls -l inventory/
  total 8
  -rw-rw-r--. 1 jamesc jamesc  430 Oct 23 13:29 origin.aws_ec2.yml
  -rw-rw-r--. 1 jamesc jamesc 1036 Oct 23 11:58 origin_inventory

The origin_inventory file is mostly what you’d see from a standard OpenShift install:


  $ cat inventory/origin_inventory 
  [OSEv3:children]
  masters
  nodes
  etcd
  glusterfs
  
  [OSEv3:vars]
  ansible_ssh_user=centos
  ansible_become=true
  
  openshift_deployment_type=origin
  openshift_disable_check=memory_availability,disk_availability
  openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
  
  openshift_storage_glusterfs_namespace=app-storage
  openshift_storage_glusterfs_storageclass=true
  openshift_storage_glusterfs_storageclass_default=true
  openshift_storage_glusterfs_block_deploy=true
  openshift_storage_glusterfs_block_host_vol_size=100
  openshift_storage_glusterfs_block_storageclass=true
  openshift_storage_glusterfs_block_storageclass_default=false

We pull the inventory from AWS and dynamically create the groups via the aws_ec2 inventory plugin:


  $ cat inventory/origin.aws_ec2.yml 
  plugin: aws_ec2
  regions:
    - us-east-2
  filters:
    tag:aws:cloudformation:stack-name: "jimi-openshift"
  strict: false
  keyed_groups:
    - key: tags.Application
      separator: ''
      values: "tags.Application|split(',')|list"
  compose:
    openshift_node_group_name: "'node-config-master-infra' if 'masters' in tags.Application else 'node-config-compute'"
    glusterfs_devices: "['/dev/xvdb'] if 'glusterfs' in tags.Application else None"

If you are new to Ansible, or if you would like to learn more about inventory plugins, you can read about them in our docs.  

The magic happens in the keyed_groups section, which uses the tag named Application to create the groups. For example, we tag the master node with the value ‘[“masters”, “etcd”, “nodes”]’, and the values line of the keyed group turns that list into one group per entry. This means we never have to touch our inventory files for this example unless we want to change how we’re deploying OpenShift.
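
To make the mapping concrete, here is a rough shell imitation of what the values expression and the first compose entry compute for one host. It is only a sketch: the tag is shown above as a list while the expression splits on commas, so this assumes the Application tag is a plain comma-separated string such as masters,etcd,nodes. The function names are made up for illustration; the inventory plugin does the real work.

```shell
#!/bin/sh
# Illustration only: mimics the Jinja2 expressions from origin.aws_ec2.yml.
# Assumption: the Application tag is a comma-separated string.

# Like tags.Application|split(','): prints one group name per line.
tag_to_groups() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Like the compose entry for openshift_node_group_name
# (a substring test, which is what "in" does on a string).
node_group_name() {
  case "$1" in
    *masters*) echo "node-config-master-infra" ;;
    *)         echo "node-config-compute" ;;
  esac
}

tag_to_groups "masters,etcd,nodes"    # masters, etcd, nodes (one per line)
node_group_name "masters,etcd,nodes"  # node-config-master-infra
node_group_name "nodes"               # node-config-compute
```

Against the real environment, ansible-inventory -i inventory/ --graph prints the groups the plugin actually built, which is the authoritative check.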

Once CloudFormation has finished provisioning, we can use the helper playbook named prepare.yml to configure the CentOS images the way OpenShift is going to expect them. This uses the combined AWS and flat inventory files found in the inventory/ directory:


  $ ansible-playbook -vv -i inventory/ --private-key=jimi.pem prepare.yaml 
  ...
  PLAY RECAP ********************************************
  ec2-xx-yy-zz-1.us-east-2.compute.amazonaws.com : ok=4    changed=4    unreachable=0    failed=0   
  ...
  ec2-xx-yy-zz-8.us-east-2.compute.amazonaws.com : ok=4    changed=4    unreachable=0    failed=0

Clone the OpenShift Ansible installer somewhere on your system and check out a 3.11 release tag:


  $ git clone https://github.com/openshift/openshift-ansible.git
  Cloning into 'openshift-ansible'...
  ...
  $ cd openshift-ansible/
  $ git checkout openshift-ansible-3.11.115-1

Running the OpenShift Installer

Now that the environment is ready, we can run the OpenShift OKD playbooks.

First, we run the prerequisites.yml playbook, which configures each system in the cluster based on the OKD inventory above:


  $ ansible-playbook -vv -i inventory/ --private-key=jimi.pem /path/to/openshift-ansible/playbooks/prerequisites.yml 
  ...
  PLAY RECAP ********************************************
  ec2-xx-yy-zz-1.us-east-2.compute.amazonaws.com : ok=68   changed=22   unreachable=0    failed=0   
  ec2-xx-yy-zz-2.us-east-2.compute.amazonaws.com : ok=49   changed=21   unreachable=0    failed=0   
  ...
  ec2-xx-yy-zz-8.us-east-2.compute.amazonaws.com : ok=49   changed=21   unreachable=0    failed=0   
  localhost                  : ok=12   changed=0    unreachable=0    failed=0   
  
  INSTALLER STATUS **************************************
  Initialization  : Complete (0:01:39)

Once the prerequisites are done, we can run the full installation:


  $ ansible-playbook -vv -i inventory/ --private-key=jimi.pem /path/to/openshift-ansible/playbooks/deploy_cluster.yml
  ...
  PLAY RECAP ********************************************
  ec2-xx-yy-zz-0.us-east-2.compute.amazonaws.com : ok=0    changed=0    unreachable=0    failed=0   
  ...
  ec2-xx-yy-zz-8.us-east-2.compute.amazonaws.com : ok=113  changed=63   unreachable=0    failed=0   
  localhost                  : ok=12   changed=0    unreachable=0    failed=0   
  
  INSTALLER STATUS **************************************
  Initialization               : Complete (0:01:12)
  Health Check                 : Complete (0:00:28)
  Node Bootstrap Preparation   : Complete (0:05:48)
  etcd Install                 : Complete (0:01:15)
  Master Install               : Complete (0:06:34)
  Master Additional Install    : Complete (0:01:02)
  Node Join                    : Complete (0:00:56)
  GlusterFS Install            : Complete (0:04:34)
  Hosted Install               : Complete (0:01:02)
  Cluster Monitoring Operator  : Complete (0:00:51)
  Web Console Install          : Complete (0:00:29)
  Console Install              : Complete (0:00:27)
  metrics-server Install       : Complete (0:00:01)
  Service Catalog Install      : Complete (0:02:12)

After about 30 minutes, you will have a fully functioning OpenShift cluster running in our provisioned AWS environment.
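
As a quick sanity check on the "about 30 minutes" figure, the per-phase times from the INSTALLER STATUS output can be added up. The snippet below is a small sketch that assumes the status lines are fed in exactly as printed above; the sum_installer_times name is made up for this example.

```shell
#!/bin/sh
# Sum the (H:MM:SS) durations from INSTALLER STATUS lines read on stdin.
sum_installer_times() {
  awk -F'[()]' '
    NF >= 2 {
      split($2, t, ":")   # t[1]=hours, t[2]=minutes, t[3]=seconds
      total += t[1] * 3600 + t[2] * 60 + t[3]
    }
    END { printf "%dm%02ds\n", int(total / 60), total % 60 }
  '
}

# Phase timings copied from the deploy_cluster.yml run above.
sum_installer_times <<'EOF'
Initialization               : Complete (0:01:12)
Health Check                 : Complete (0:00:28)
Node Bootstrap Preparation   : Complete (0:05:48)
etcd Install                 : Complete (0:01:15)
Master Install               : Complete (0:06:34)
Master Additional Install    : Complete (0:01:02)
Node Join                    : Complete (0:00:56)
GlusterFS Install            : Complete (0:04:34)
Hosted Install               : Complete (0:01:02)
Cluster Monitoring Operator  : Complete (0:00:51)
Web Console Install          : Complete (0:00:29)
Console Install              : Complete (0:00:27)
metrics-server Install       : Complete (0:00:01)
Service Catalog Install      : Complete (0:02:12)
EOF
# prints 26m51s
```

Adding the 0:01:39 from the prerequisites run brings the total to roughly 28 and a half minutes, in line with the estimate.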

Post-installation tasks

Finally, we use the post_install.yml playbook, which adds an admin user to the OpenShift cluster so we can use the CLI (or GUI) to deploy our Operators (this only affects the master node):


  $ ansible-playbook -vv -i inventory/ --private-key=jimi.pem post_install.yml
  ...
  PLAY RECAP ********************************************
  ec2-xx-yy-zz-1.us-east-2.compute.amazonaws.com : ok=3    changed=3    unreachable=0    failed=0   

Our CloudFormation template also creates a test system (openshift-tester), which we’ll use to run our MySQL/MariaDB benchmarks in the third part of this series. All we do here for now is to install the epel-release package, followed by the installation of sysbench and the mariadb package so we can use the mysql CLI utility.

Verifying the Cluster

Once the cluster is up, we can use the oc utility to view the cluster nodes.


  $ oc login https://ec2-xx-yy-zz-1.us-east-2.compute.amazonaws.com:8443
  Authentication required for https://ec2-xx-yy-zz-1.us-east-2.compute.amazonaws.com:8443 (openshift)
  Username: admin
  Password: 
  Login successful.

  You have access to the following projects and can switch between them with 'oc project <projectname>':

      app-storage
    * default
      kube-public
      kube-service-catalog
      kube-system
      management-infra
      openshift
      openshift-ansible-service-broker
      openshift-console
      openshift-infra
      openshift-logging
      openshift-monitoring
      openshift-node
      openshift-sdn
      openshift-template-service-broker
      openshift-web-console

  Using project "default".


  $ oc get nodes
  NAME                                                 STATUS    AGE
  ec2-xx-yy-zz-1.us-east-2.compute.amazonaws.com       Ready     1d
  ec2-xx-yy-zz-2.us-east-2.compute.amazonaws.com       Ready     1d
  ec2-xx-yy-zz-3.us-east-2.compute.amazonaws.com       Ready     1d
  ec2-xx-yy-zz-4.us-east-2.compute.amazonaws.com       Ready     1d
  ec2-xx-yy-zz-5.us-east-2.compute.amazonaws.com       Ready     1d
  ec2-xx-yy-zz-6.us-east-2.compute.amazonaws.com       Ready     1d
  ec2-xx-yy-zz-7.us-east-2.compute.amazonaws.com       Ready     1d
  ec2-xx-yy-zz-8.us-east-2.compute.amazonaws.com       Ready     1d

We mentioned at the outset that this would be a five-node cluster; however, you may notice that more nodes are listed here. That is because the list also includes the master (with etcd colocated) and three GlusterFS nodes for persistent storage.

Tweaking OpenShift security

OpenShift runs with a higher level of security than vanilla Kubernetes. One of the ways it does this is by using randomized UIDs and GIDs for the containers it starts. Unfortunately, this causes a problem with the stock MariaDB container image we’re using (purely for the ease of this demo). To make sure this example works on Kubernetes as well as OpenShift, we’ll loosen the restricted SecurityContextConstraint:


  $ oc export scc restricted > restricted
  $ cp restricted galera
  $ vi galera
  $ diff -bup restricted galera
  --- restricted	2018-10-23 16:47:16.796618725 -0500
  +++ galera	2018-10-23 16:49:18.892874167 -0500
  @@ -21,13 +21,9 @@ metadata:
     name: restricted
   priority: null
   readOnlyRootFilesystem: false
  -requiredDropCapabilities:
  -- KILL
  -- MKNOD
  -- SETUID
  -- SETGID
  +requiredDropCapabilities: []
   runAsUser:
  -  type: MustRunAsRange
  +  type: RunAsAny
   seLinuxContext:
     type: MustRunAs
   supplementalGroups:

NOTE: This is *NOT* recommended for a production system. The actual changes were made with oc edit scc restricted; the diff above only shows what was changed.

Because we don’t specify any domain in the OpenShift OSEv3 vars, our domain name is the default: default.svc.cluster.local. To access the web console and docker console/registry, we can simply add an entry to our /etc/hosts file as follows: 


  xx.yy.zz.1 docker-registry-default.router.default.svc.cluster.local 
             registry-console-default.router.default.svc.cluster.local 
             console.router.default.svc.cluster.local

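Where xx.yy.zz.1 is the public IP of our master server. The entry is wrapped here for readability; in /etc/hosts, the address and all three names must be on a single line.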
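
If you would rather generate that entry than type it, a helper along these lines would do. This is a sketch: hosts_entry is a made-up name, the hostnames are the three shown above, and 203.0.113.10 is a documentation address standing in for your master's public IP.

```shell
#!/bin/sh
# Print a single-line /etc/hosts entry for the router-exposed services.
# $1: public IP of the master
# $2: router domain (optional; defaults to the domain used in this post)
hosts_entry() {
  ip="$1"
  domain="${2:-router.default.svc.cluster.local}"
  printf '%s %s %s %s\n' "$ip" \
    "docker-registry-default.$domain" \
    "registry-console-default.$domain" \
    "console.$domain"
}

# Example with a placeholder address; substitute your master's public IP.
hosts_entry 203.0.113.10
```

Once the output looks right, it can be appended with something like hosts_entry <master-ip> | sudo tee -a /etc/hosts.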

Summary

This concludes setting up an OpenShift cluster in AWS for our testing purposes. The cluster installation is complete, and we’re ready to deploy our Operator! Continue on to part 2 to get an overview of Operators, as well as a deep-dive into how Kubernetes Operators with Ansible work.

Learn More

For more information on Kubernetes Operators with Ansible please refer to the following resources:

Stay tuned for part 2 “Get your hands dirty building Kubernetes Operators with Ansible”

AnsibleFest 2019

Want to learn more about Kubernetes Operators? Want to learn how Ansible can drive automation inside Kubernetes? Join us at AnsibleFest Atlanta 2019 from Sept 24-26, 2019. The Ansible Automation and OpenShift teams have worked together to bring more cloud native content to AnsibleFest than ever before! Get your tickets today!

Originally posted on Ansible Blog