Operator SDK tutorial for Ansible-based Operators

Operator developers can take advantage of Ansible support in the Operator SDK to build an example Ansible-based Operator for Memcached, a distributed key-value store, and manage its lifecycle. This tutorial walks through the following process:

Create a Memcached deployment
Ensure that the deployment size is the same as specified by the Memcached custom resource (CR) spec
Update the Memcached CR status using the status writer with the names of the memcached pods

This process is accomplished by using two centerpieces of the Operator Framework:

Operator SDK

The operator-sdk CLI tool and controller-runtime library API

Operator Lifecycle Manager (OLM)

Installation, upgrade, and role-based access control (RBAC) of Operators on a cluster

This tutorial goes into greater detail than Getting started with Operator SDK for Ansible-based Operators.

Prerequisites

Operator SDK CLI installed
OpenShift CLI (oc) v4.11+ installed
Ansible v2.9.0
Ansible Runner v2.0.2+
Ansible Runner HTTP Event Emitter plug-in v1.0.0+
Python 3.8.6+
OpenShift Python client v0.12.0+
Logged in to an OKD 4.11 cluster with oc, using an account that has cluster-admin permissions
To allow the cluster to pull the image, the repository where you push your image must be set as public, or you must configure an image pull secret

Creating a project

Use the Operator SDK CLI to create a project called memcached-operator.

Procedure

Create a directory for the project:

$ mkdir -p $HOME/projects/memcached-operator

Change to the directory:
```
$ cd $HOME/projects/memcached-operator
```
Run the operator-sdk init command with the ansible plug-in to initialize the project:
```
$ operator-sdk init \
    --plugins=ansible \
    --domain=example.com
```

PROJECT file

Among the files generated by the operator-sdk init command is a Kubebuilder PROJECT file. Subsequent operator-sdk commands that are run from the project root, as well as their help output, read this file and are aware that the project type is Ansible. For example:

domain: example.com
layout: ansible.sdk.operatorframework.io/v1
projectName: memcached-operator
version: 3

Creating an API

Use the Operator SDK CLI to create a Memcached API.

Procedure

Run the following command to create an API with group cache, version v1, and kind Memcached:

$ operator-sdk create api \
    --group cache \
    --version v1 \
    --kind Memcached \
    --generate-role (1)

1	Generates an Ansible role for the API.

After creating the API, your Operator project updates with the following structure:

Memcached CRD

Includes a sample Memcached resource

Manager

Program that reconciles the state of the cluster to the desired state by using:

A reconciler, either an Ansible role or playbook
A watches.yaml file, which connects the Memcached resource to the memcached Ansible role
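
As a rough sketch of that connection, the watches.yaml file scaffolded for this API usually contains a single entry like the following. The exact comments and any additional options depend on your Operator SDK version, so treat this as illustrative rather than as the file's guaranteed contents:

```yaml
---
# watches.yaml (illustrative): maps one group/version/kind to one Ansible role
- version: v1
  group: cache.example.com   # <group>.<domain> from the create api and init commands
  kind: Memcached
  role: memcached            # the role generated under roles/memcached
```

Each entry tells the manager which resource to watch and which role or playbook to run when that resource changes.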

Modifying the manager

Update your Operator project to provide the reconcile logic, in the form of an Ansible role, which runs every time a Memcached resource is created, updated, or deleted.

Procedure

Update the roles/memcached/tasks/main.yml file with the following structure:

---
- name: start memcached
  community.kubernetes.k8s:
    definition:
      kind: Deployment
      apiVersion: apps/v1
      metadata:
        name: '{{ ansible_operator_meta.name }}-memcached'
        namespace: '{{ ansible_operator_meta.namespace }}'
      spec:
        replicas: "{{size}}"
        selector:
          matchLabels:
            app: memcached
        template:
          metadata:
            labels:
              app: memcached
          spec:
            containers:
            - name: memcached
              command:
              - memcached
              - -m=64
              - -o
              - modern
              - -v
              image: "docker.io/memcached:1.4.36-alpine"
              ports:
                - containerPort: 11211

This memcached role ensures that a memcached deployment exists and sets the deployment size.

Set default values for variables used in your Ansible role by editing the roles/memcached/defaults/main.yml file:
```
---
# defaults file for Memcached
size: 1
```
Update the Memcached sample resource in the config/samples/cache_v1_memcached.yaml file with the following structure:
```
apiVersion: cache.example.com/v1
kind: Memcached
metadata:
  name: memcached-sample
spec:
  size: 3
```
The key-value pairs in the custom resource (CR) spec are passed to Ansible as extra variables.

The names of all variables in the spec field are converted to snake case (lowercase words separated by underscores) by the Operator before running Ansible. For example, serviceAccount in the spec becomes service_account in Ansible.
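
For illustration, the mapping works as follows. The serviceAccount field is the hypothetical example from the paragraph above and is not part of this tutorial's Memcached CR:

```yaml
# CR spec as written by the user (illustrative)
spec:
  size: 3                   # Ansible variable: size
  serviceAccount: default   # Ansible variable: service_account (hypothetical field)
```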

You can disable this case conversion by setting the snakeCaseParameters option to false in your watches.yaml file. It is recommended that you perform some type validation in Ansible on the variables to ensure that your application is receiving expected input.
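
One way to perform that validation is an assert task placed at the top of roles/memcached/tasks/main.yml, before the deployment task. This is a minimal sketch that assumes size is the only input and that any non-negative integer is acceptable; adjust the conditions to your own rules:

```yaml
# Illustrative input check; fails the reconcile run early on bad input
- name: validate the size variable from the CR spec
  assert:
    that:
      - size is defined
      - size is number
      - size | int >= 0
    fail_msg: "spec.size must be a non-negative integer"
```

Failing early surfaces a malformed CR as a failed run instead of passing the bad value on to the cluster.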

Enabling proxy support

Operator authors can develop Operators that support network proxies. Cluster administrators configure proxy support for the environment variables that are handled by Operator Lifecycle Manager (OLM). To support proxied clusters, your Operator must inspect the environment for the following standard proxy variables and pass the values to Operands:

HTTP_PROXY
HTTPS_PROXY
NO_PROXY

This tutorial uses HTTP_PROXY as an example environment variable.

Prerequisites

A cluster with cluster-wide egress proxy enabled.

Procedure

Add the environment variables to the deployment by updating the roles/memcached/tasks/main.yml file with the following:

...
env:
   - name: HTTP_PROXY
     value: '{{ lookup("env", "HTTP_PROXY") | default("", True) }}'
   - name: http_proxy
     value: '{{ lookup("env", "HTTP_PROXY") | default("", True) }}'
...
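
The same pattern can be extended to the other two standard variables. The following sketch assumes you want both uppercase and lowercase forms, mirroring the HTTP_PROXY entries above; it belongs under the memcached container in the deployment definition:

```yaml
# Illustrative additions to the container's env list
env:
  - name: HTTPS_PROXY
    value: '{{ lookup("env", "HTTPS_PROXY") | default("", True) }}'
  - name: https_proxy
    value: '{{ lookup("env", "HTTPS_PROXY") | default("", True) }}'
  - name: NO_PROXY
    value: '{{ lookup("env", "NO_PROXY") | default("", True) }}'
  - name: no_proxy
    value: '{{ lookup("env", "NO_PROXY") | default("", True) }}'
```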

Set the environment variable on the Operator deployment by adding the following to the config/manager/manager.yaml file:

containers:
 - args:
   - --leader-elect
   - --leader-election-id=ansible-proxy-demo
   image: controller:latest
   name: manager
   env:
     - name: "HTTP_PROXY"
       value: "http_proxy_test"

Running the Operator

There are three ways you can use the Operator SDK CLI to build and run your Operator:

Run locally outside the cluster as a Go program.
Run as a deployment on the cluster.
Bundle your Operator and use Operator Lifecycle Manager (OLM) to deploy on the cluster.

Running locally outside the cluster

You can run your Operator project as a Go program outside of the cluster. This is useful for development purposes to speed up deployment and testing.

Procedure

Run the following command to install the custom resource definitions (CRDs) in the cluster configured in your ~/.kube/config file and run the Operator locally:

$ make install run

Example output

...
{"level":"info","ts":1612589622.7888272,"logger":"ansible-controller","msg":"Watching resource","Options.Group":"cache.example.com","Options.Version":"v1","Options.Kind":"Memcached"}
{"level":"info","ts":1612589622.7897573,"logger":"proxy","msg":"Starting to serve","Address":"127.0.0.1:8888"}
{"level":"info","ts":1612589622.789971,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
{"level":"info","ts":1612589622.7899997,"logger":"controller-runtime.manager.controller.memcached-controller","msg":"Starting EventSource","source":"kind source: cache.example.com/v1, Kind=Memcached"}
{"level":"info","ts":1612589622.8904517,"logger":"controller-runtime.manager.controller.memcached-controller","msg":"Starting Controller"}
{"level":"info","ts":1612589622.8905244,"logger":"controller-runtime.manager.controller.memcached-controller","msg":"Starting workers","worker count":8}

Running as a deployment on the cluster

You can run your Operator project as a deployment on your cluster.

Procedure

Run the following make commands to build and push the Operator image. Modify the IMG argument in the following steps to reference a repository that you have access to. You can obtain an account for storing containers at repository sites such as Quay.io.

Build the image:

$ make docker-build IMG=<registry>/<user>/<image_name>:<tag>

The Dockerfile generated by the SDK for the Operator explicitly references GOARCH=amd64 for go build. This can be amended to GOARCH=$TARGETARCH for non-AMD64 architectures. Docker automatically sets the environment variable to the value specified by --platform. With Buildah, you must pass the value with --build-arg instead. For more information, see Multiple Architectures.

Push the image to a repository:

$ make docker-push IMG=<registry>/<user>/<image_name>:<tag>

The name and tag of the image, for example IMG=<registry>/<user>/<image_name>:<tag>, in both the commands can also be set in your Makefile. Modify the IMG ?= controller:latest value to set your default image name.

Run the following command to deploy the Operator:
```
$ make deploy IMG=<registry>/<user>/<image_name>:<tag>
```
By default, this command creates a namespace named <project_name>-system, based on the name of your Operator project, and uses it for the deployment. This command also installs the RBAC manifests from config/rbac.
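
In a typical scaffold, the namespace and the resource name prefix come from the config/default/kustomization.yaml file. The excerpt below is a sketch of the relevant fields, assuming the project name memcached-operator; check your generated file for the actual values:

```yaml
# config/default/kustomization.yaml (excerpt, illustrative)
namespace: memcached-operator-system   # namespace created and used by make deploy
namePrefix: memcached-operator-        # prepended to resource names such as controller-manager
```

Editing these values is one way to deploy into a differently named namespace.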

Run the following command to verify that the Operator is running:

$ oc get deployment -n <project_name>-system

Example output

NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
<project_name>-controller-manager       1/1     1            1           8m

Bundling an Operator and deploying with Operator Lifecycle Manager

Bundling an Operator

The Operator bundle format is the default packaging method for Operator SDK and Operator Lifecycle Manager (OLM). You can get your Operator ready for use on OLM by using the Operator SDK to build and push your Operator project as a bundle image.

Prerequisites

Operator SDK CLI installed on a development workstation
OpenShift CLI (oc) v4.11+ installed
Operator project initialized by using the Operator SDK

Procedure

Run the following make commands in your Operator project directory to build and push your Operator image. Modify the IMG argument in the following steps to reference a repository that you have access to. You can obtain an account for storing containers at repository sites such as Quay.io.

Build the image:

$ make docker-build IMG=<registry>/<user>/<operator_image_name>:<tag>

Push the image to a repository:

$ make docker-push IMG=<registry>/<user>/<operator_image_name>:<tag>

Create your Operator bundle manifest by running the make bundle command, which invokes several commands, including the Operator SDK generate bundle and bundle validate subcommands:
```
$ make bundle IMG=<registry>/<user>/<operator_image_name>:<tag>
```
Bundle manifests for an Operator describe how to display, create, and manage an application. The make bundle command creates the following files and directories in your Operator project:
- A bundle manifests directory named bundle/manifests that contains a ClusterServiceVersion object
- A bundle metadata directory named bundle/metadata
- All custom resource definitions (CRDs) in a config/crd directory
- A Dockerfile bundle.Dockerfile
These files are then automatically validated by using operator-sdk bundle validate to ensure the on-disk bundle representation is correct.
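
For orientation, the bundle/metadata directory holds an annotations.yaml file that tells OLM how to read the bundle. The following is a sketch of commonly generated annotations; the package name and the channel value are assumptions based on this tutorial's project name and may differ in your project:

```yaml
# bundle/metadata/annotations.yaml (illustrative excerpt)
annotations:
  operators.operatorframework.io.bundle.mediatype.v1: registry+v1
  operators.operatorframework.io.bundle.manifests.v1: manifests/
  operators.operatorframework.io.bundle.metadata.v1: metadata/
  operators.operatorframework.io.bundle.package.v1: memcached-operator
  operators.operatorframework.io.bundle.channels.v1: alpha
```
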
Build and push your bundle image by running the following commands. OLM consumes Operator bundles using an index image, which references one or more bundle images.
1. Build the bundle image. Set BUNDLE_IMG with the details for the registry, user namespace, and image tag where you intend to push the image:
```
$ make bundle-build BUNDLE_IMG=<registry>/<user>/<bundle_image_name>:<tag>
```
2. Push the bundle image:
```
$ docker push <registry>/<user>/<bundle_image_name>:<tag>
```

Deploying an Operator with Operator Lifecycle Manager

Operator Lifecycle Manager (OLM) helps you to install, update, and manage the lifecycle of Operators and their associated services on a Kubernetes cluster. OLM is installed by default on OKD and runs as a Kubernetes extension so that you can use the web console and the OpenShift CLI (oc) for all Operator lifecycle management functions without any additional tools.

The Operator bundle format is the default packaging method for Operator SDK and OLM. You can use the Operator SDK to quickly run a bundle image on OLM to ensure that it runs properly.

Prerequisites

Operator SDK CLI installed on a development workstation
Operator bundle image built and pushed to a registry
OLM installed on a Kubernetes-based cluster (v1.16.0 or later if you use apiextensions.k8s.io/v1 CRDs, for example OKD 4.11)
Logged in to the cluster with oc using an account with cluster-admin permissions

Procedure

Check the status of OLM on your cluster by using the following Operator SDK command:

$ operator-sdk olm status \
    --olm-namespace=openshift-operator-lifecycle-manager

Enter the following command to run the Operator on the cluster:

$ operator-sdk run bundle \(1)
    -n <namespace> \(2)
    <registry>/<user>/<bundle_image_name>:<tag> (3)

1	The `run bundle` command creates a valid file-based catalog and installs the Operator bundle on your cluster using OLM.
2	Optional: By default, the command installs the Operator in the currently active project in your `~/.kube/config` file. You can add the `-n` flag to set a different namespace scope for the installation.
3	If you do not specify an image, the command uses `quay.io/operator-framework/opm:latest` as the default index image. If you specify an image, the command uses the bundle image itself as the index image.

As of OKD 4.11, the run bundle command supports the file-based catalog format for Operator catalogs by default. The deprecated SQLite database format for Operator catalogs continues to be supported; however, it will be removed in a future release. It is recommended that Operator authors migrate their workflows to the file-based catalog format.

This command performs the following actions:

Create an index image referencing your bundle image. The index image is opaque and ephemeral, but accurately reflects how a bundle would be added to a catalog in production.
Create a catalog source that points to your new index image, which enables OperatorHub to discover your Operator.
Deploy your Operator to your cluster by creating an OperatorGroup, Subscription, InstallPlan, and all other required resources, including RBAC.
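
To make the last step more concrete, a hand-written Subscription for this Operator might look like the following sketch. The run bundle command generates its own resource names, so the metadata.name, source, and channel values here are hypothetical placeholders:

```yaml
# Illustrative Subscription; names are hypothetical, not the ones run bundle generates
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: memcached-operator-sub
  namespace: <namespace>
spec:
  name: memcached-operator             # package name from the bundle metadata
  channel: alpha                       # assumed channel
  source: memcached-operator-catalog   # hypothetical CatalogSource name
  sourceNamespace: <namespace>
```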

Creating a custom resource

After your Operator is installed, you can test it by creating a custom resource (CR) that is now provided on the cluster by the Operator.

Prerequisites

Example Memcached Operator, which provides the Memcached CR, installed on a cluster

Procedure

Change to the namespace where your Operator is installed. For example, if you deployed the Operator using the make deploy command:
```
$ oc project memcached-operator-system
```
Edit the sample Memcached CR manifest at config/samples/cache_v1_memcached.yaml to contain the following specification:
```
apiVersion: cache.example.com/v1
kind: Memcached
metadata:
  name: memcached-sample
...
spec:
...
  size: 3
```

Create the CR:

$ oc apply -f config/samples/cache_v1_memcached.yaml

Ensure that the Memcached Operator creates the deployment for the sample CR with the correct size:

$ oc get deployments

Example output

NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
memcached-operator-controller-manager   1/1     1            1           8m
memcached-sample                        3/3     3            3           1m

Check the pods and CR status to confirm the status is updated with the Memcached pod names.

Check the pods:

$ oc get pods

Example output

NAME                                  READY     STATUS    RESTARTS   AGE
memcached-sample-6fd7c98d8-7dqdr      1/1       Running   0          1m
memcached-sample-6fd7c98d8-g5k7v      1/1       Running   0          1m
memcached-sample-6fd7c98d8-m7vn7      1/1       Running   0          1m

Check the CR status:

$ oc get memcached/memcached-sample -o yaml

Example output

apiVersion: cache.example.com/v1
kind: Memcached
metadata:
...
  name: memcached-sample
...
spec:
  size: 3
status:
  nodes:
  - memcached-sample-6fd7c98d8-7dqdr
  - memcached-sample-6fd7c98d8-g5k7v
  - memcached-sample-6fd7c98d8-m7vn7
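
The role shown earlier in this tutorial only manages the deployment and does not show how the status.nodes list is populated. One possible approach, sketched here as an assumption rather than as the tutorial's own implementation, is to list the pods by label and write their names with the k8s_status module from the operator_sdk.util collection:

```yaml
# Illustrative tasks for roles/memcached/tasks/main.yml
- name: find the memcached pods in the CR namespace
  community.kubernetes.k8s_info:
    kind: Pod
    namespace: '{{ ansible_operator_meta.namespace }}'
    label_selectors:
      - app=memcached
  register: memcached_pods   # hypothetical variable name

- name: record the pod names in the CR status
  operator_sdk.util.k8s_status:
    api_version: cache.example.com/v1
    kind: Memcached
    name: '{{ ansible_operator_meta.name }}'
    namespace: '{{ ansible_operator_meta.namespace }}'
    status:
      nodes: "{{ memcached_pods.resources | map(attribute='metadata.name') | list }}"
```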

Update the deployment size.

Change the spec.size field in the Memcached CR from 3 to 5 by patching the resource directly:
```
$ oc patch memcached memcached-sample \
    -p '{"spec":{"size": 5}}' \
    --type=merge
```

Confirm that the Operator changes the deployment size:

$ oc get deployments

Example output

NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
memcached-operator-controller-manager   1/1     1            1           10m
memcached-sample                        5/5     5            5           3m

Clean up the resources that have been created as part of this tutorial.
- If you used the make deploy command to test the Operator, run the following command:
```
$ make undeploy
```
- If you used the operator-sdk run bundle command to test the Operator, run the following command:
```
$ operator-sdk cleanup <project_name>
```

Additional resources

See Project layout for Ansible-based Operators to learn about the directory structures created by the Operator SDK.
If a cluster-wide egress proxy is configured, cluster administrators can override the proxy settings or inject a custom CA certificate for specific Operators running on Operator Lifecycle Manager (OLM).
