Creating Ansible-based Operators
This guide outlines Ansible support in the Operator SDK and walks Operator authors through examples of building and running Ansible-based Operators, which use Ansible playbooks and modules, with the operator-sdk CLI tool.
Ansible support in the Operator SDK
The Operator Framework is an open source toolkit to manage Kubernetes native applications, called Operators, in an effective, automated, and scalable way. This framework includes the Operator SDK, which assists developers in bootstrapping and building an Operator based on their expertise without requiring knowledge of Kubernetes API complexities.
One of the Operator SDK options for generating an Operator project includes leveraging existing Ansible playbooks and modules to deploy Kubernetes resources as a unified application, without having to write any Go code.
Custom resource files
Operators use the Kubernetes extension mechanism, custom resource definitions (CRDs), so your custom resource (CR) looks and acts just like the built-in, native Kubernetes objects.
The CR file format is a Kubernetes resource file. The object has mandatory and optional fields:
Field | Description |
---|---|
apiVersion | Version of the CR to be created. |
kind | Kind of the CR to be created. |
metadata | Kubernetes-specific metadata to be created. |
spec | Key-value list of variables which are passed to Ansible. This field is empty by default. |
status | Summarizes the current state of the object. For Ansible-based Operators, the |
annotations | Kubernetes-specific annotations to be appended to the CR. |
The following CR annotations modify the behavior of the Operator:
Annotation | Description |
---|---|
ansible.operator-sdk/reconcile-period | Specifies the reconciliation interval for the CR. This value is parsed using the standard Golang package |
Example Ansible-based Operator annotation
apiVersion: "test1.example.com/v1alpha1"
kind: "Test1"
metadata:
  name: "example"
  annotations:
    ansible.operator-sdk/reconcile-period: "30s"
watches.yaml file
A group/version/kind (GVK) is a unique identifier for a Kubernetes API. The watches.yaml file contains a list of mappings from custom resources (CRs), each identified by its GVK, to an Ansible role or playbook. The Operator expects this mapping file in a predefined location at /opt/ansible/watches.yaml.
Field | Description |
---|---|
group | Group of CR to watch. |
version | Version of CR to watch. |
kind | Kind of CR to watch. |
role | Path to the Ansible role added to the container. For example, if your |
playbook | Path to the Ansible playbook added to the container. This playbook is expected to be a way to call roles. This field is mutually exclusive with the |
reconcilePeriod | The reconciliation interval, how often the role or playbook is run, for a given CR. |
manageStatus | When set to |
Example watches.yaml file
- version: v1alpha1 # (1)
  group: test1.example.com
  kind: Test1
  role: /opt/ansible/roles/Test1
- version: v1alpha1 # (2)
  group: test2.example.com
  kind: Test2
  playbook: /opt/ansible/playbook.yml
- version: v1alpha1 # (3)
  group: test3.example.com
  kind: Test3
  playbook: /opt/ansible/test3.yml
  reconcilePeriod: 0
  manageStatus: false
(1) Simple example mapping Test1 to the test1 role.
(2) Simple example mapping Test2 to a playbook.
(3) More complex example for the Test3 kind. Disables re-queuing and managing the CR status in the playbook.
Advanced options
Advanced features can be enabled by adding them to your watches.yaml file per GVK. They can go below the group, version, kind, and playbook or role fields.
Some features can be overridden per resource using an annotation on that CR. The options that can be overridden have the annotation specified below.
Feature | YAML key | Description | Annotation for override | Default value |
---|---|---|---|---|
Reconcile period | reconcilePeriod | Time between reconcile runs for a particular CR. | ansible.operator-sdk/reconcile-period | |
Manage status | manageStatus | Allows the Operator to manage the | | |
Watch dependent resources | watchDependentResources | Allows the Operator to dynamically watch resources that are created by Ansible. | | |
Watch cluster-scoped resources | | Allows the Operator to watch cluster-scoped resources that are created by Ansible. | | |
Max runner artifacts | maxRunnerArtifacts | Manages the number of artifact directories that Ansible Runner keeps in the Operator container for each individual resource. | | |
Example watches.yaml file with advanced options
- version: v1alpha1
  group: app.example.com
  kind: AppService
  playbook: /opt/ansible/playbook.yml
  maxRunnerArtifacts: 30
  reconcilePeriod: 5s
  manageStatus: False
  watchDependentResources: False
Extra variables sent to Ansible
Extra variables can be sent to Ansible, which are then managed by the Operator. The spec section of the custom resource (CR) passes along the key-value pairs as extra variables. This is equivalent to extra variables passed in to the ansible-playbook command.
The Operator also passes along additional variables under the meta field for the name of the CR and the namespace of the CR.
For the following CR example:
apiVersion: "app.example.com/v1alpha1"
kind: "Database"
metadata:
  name: "example"
spec:
  message: "Hello world 2"
  newParameter: "newParam"
The structure passed to Ansible as extra variables is:
{ "meta": {
    "name": "<cr_name>",
    "namespace": "<cr_namespace>",
  },
  "message": "Hello world 2",
  "new_parameter": "newParam",
  "_app_example_com_database": {
    <full_crd>
  },
}
The message and newParameter fields are set in the top level as extra variables, with newParameter converted to the snake case name new_parameter, and meta provides the relevant metadata for the CR as defined in the Operator. The meta fields can be accessed using dot notation in Ansible, for example:
- debug:
    msg: "name: {{ meta.name }}, {{ meta.namespace }}"
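As a minimal sketch of how a role might consume these variables, the tasks below print the two spec-derived values alongside the CR metadata. The task names are illustrative and not part of any scaffolded project; the variable names follow the extra-variables structure shown above.

```yaml
# Hypothetical tasks for a role mapped to the Database kind above.
- name: show the spec values passed in as extra variables
  debug:
    msg: "message={{ message }}, new_parameter={{ new_parameter }}"

- name: show which CR triggered this run
  debug:
    msg: "reconciling {{ meta.name }} in namespace {{ meta.namespace }}"
```

Because the spec keys arrive as top-level variables, a role can give each of them a default in its defaults/main.yml file so that a CR with an empty spec still runs.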
Ansible Runner directory
Ansible Runner keeps information about Ansible runs in the container. This is located at /tmp/ansible-operator/runner/<group>/<version>/<kind>/<namespace>/<name>.
Additional resources
- To learn more about the runner directory, see the Ansible Runner documentation.
Building an Ansible-based Operator using the Operator SDK
This procedure walks through an example of building a simple Memcached Operator powered by Ansible playbooks and modules using tools and libraries provided by the Operator SDK.
Prerequisites
- Operator SDK v0.19.4 CLI installed on the development workstation
- Access to a Kubernetes-based cluster v1.11.3+ (for example OKD 4.6) using an account with cluster-admin permissions
- OpenShift CLI (oc) v4.6+ installed
- ansible v2.9.0+
- ansible-runner v1.1.0+
- ansible-runner-http v1.0.0+
Procedure
Create a new Operator project. A namespace-scoped Operator watches and manages resources in a single namespace. Namespace-scoped Operators are preferred because of their flexibility. They enable decoupled upgrades, namespace isolation for failures and monitoring, and differing API definitions.
To create a new Ansible-based, namespace-scoped memcached-operator project and change to the new directory, use the following commands:
$ operator-sdk new memcached-operator \
  --api-version=cache.example.com/v1alpha1 \
  --kind=Memcached \
  --type=ansible
$ cd memcached-operator
This creates the memcached-operator project specifically for watching the Memcached resource with API version cache.example.com/v1alpha1 and kind Memcached.
Customize the Operator logic.
For this example, the memcached-operator executes the following reconciliation logic for each Memcached custom resource (CR):
Create a memcached deployment if it does not exist.
Ensure that the deployment size is the same as specified by the Memcached CR.
By default, the memcached-operator watches Memcached resource events as shown in the watches.yaml file and executes the Ansible role Memcached:
- version: v1alpha1
  group: cache.example.com
  kind: Memcached
You can optionally customize the following logic in the watches.yaml file:
Specifying a role option configures the Operator to use this specified path when launching ansible-runner with an Ansible role. By default, the operator-sdk new command fills in an absolute path to where your role should go:
- version: v1alpha1
  group: cache.example.com
  kind: Memcached
  role: /opt/ansible/roles/memcached
Specifying a playbook option in the watches.yaml file configures the Operator to use this specified path when launching ansible-runner with an Ansible playbook:
- version: v1alpha1
  group: cache.example.com
  kind: Memcached
  playbook: /opt/ansible/playbook.yaml
Build the Memcached Ansible role.
Modify the generated Ansible role under the roles/memcached/ directory. This Ansible role controls the logic that is executed when a resource is modified.
Define the Memcached spec.
Defining the spec for an Ansible-based Operator can be done entirely in Ansible. The Ansible Operator passes all key-value pairs listed in the CR spec field along to Ansible as variables. The names of all variables in the spec field are converted to snake case (lowercase with an underscore) by the Operator before running Ansible. For example, serviceAccount in the spec becomes service_account in Ansible.
You should perform some type validation in Ansible on the variables to ensure that your application is receiving expected input.
In case the user does not set the spec field, set a default by modifying the roles/memcached/defaults/main.yml file:
size: 1
Define the Memcached deployment.
With the Memcached spec now defined, you can define what Ansible is actually executed on resource changes. Because this is an Ansible role, the default behavior executes the tasks in the roles/memcached/tasks/main.yml file.
The goal is for Ansible to create a deployment if it does not exist, which runs the memcached:1.4.36-alpine image. Ansible 2.7+ supports the k8s Ansible module, which this example leverages to control the deployment definition.
Modify the roles/memcached/tasks/main.yml to match the following:
- name: start memcached
  k8s:
    definition:
      kind: Deployment
      apiVersion: apps/v1
      metadata:
        name: '{{ meta.name }}-memcached'
        namespace: '{{ meta.namespace }}'
      spec:
        replicas: "{{size}}"
        selector:
          matchLabels:
            app: memcached
        template:
          metadata:
            labels:
              app: memcached
          spec:
            containers:
            - name: memcached
              command:
              - memcached
              - -m=64
              - -o
              - modern
              - -v
              image: "docker.io/memcached:1.4.36-alpine"
              ports:
                - containerPort: 11211
This example uses the size variable to control the number of replicas of the Memcached deployment. This example sets the default to 1, but any user can create a CR that overwrites the default.
Deploy the CRD.
Before running the Operator, Kubernetes needs to know about the new custom resource definition (CRD) that the Operator will be watching. Deploy the Memcached CRD:
$ oc create -f deploy/crds/cache.example.com_memcacheds_crd.yaml
Build and run the Operator.
There are two ways to build and run the Operator:
- As a pod inside a Kubernetes cluster.
- As a Go program outside the cluster using the operator-sdk run --local command.
Choose one of the following methods:
Run as a pod inside a Kubernetes cluster. This is the preferred method for production use.
Build the memcached-operator image and push it to a registry:
$ operator-sdk build quay.io/example/memcached-operator:v0.0.1
$ podman push quay.io/example/memcached-operator:v0.0.1
Deployment manifests are generated in the deploy/operator.yaml file. The deployment image in this file needs to be modified from the placeholder REPLACE_IMAGE to the previously built image. To do this, run:
$ sed -i 's|REPLACE_IMAGE|quay.io/example/memcached-operator:v0.0.1|g' deploy/operator.yaml
Deploy the memcached-operator manifests:
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
$ oc create -f deploy/operator.yaml
Verify that the memcached-operator deployment is up and running:
$ oc get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
memcached-operator 1 1 1 1 1m
Run outside the cluster. This method is preferred during the development cycle to speed up deployment and testing.
Ensure that Ansible Runner and Ansible Runner HTTP Plug-in are installed; otherwise, you will see unexpected errors from Ansible Runner when a CR is created.
It is also important that the role path referenced in the watches.yaml file exists on your machine. Because the role is normally placed on disk inside a container, you must manually copy the role to the configured Ansible roles path (for example, /etc/ansible/roles).
To run the Operator locally with the default Kubernetes configuration file present at $HOME/.kube/config:
$ operator-sdk run --local
To run the Operator locally with a provided Kubernetes configuration file:
$ operator-sdk run --local --kubeconfig=config
Create a Memcached CR.
Modify the deploy/crds/cache_v1alpha1_memcached_cr.yaml file as shown and create a Memcached CR:
$ cat deploy/crds/cache_v1alpha1_memcached_cr.yaml
Example output
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 3
$ oc apply -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
Ensure that the memcached-operator creates the deployment for the CR:
$ oc get deployment
Example output
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
memcached-operator 1 1 1 1 2m
example-memcached 3 3 3 3 1m
Check the pods to confirm three replicas were created:
$ oc get pods
NAME READY STATUS RESTARTS AGE
example-memcached-6fd7c98d8-7dqdr 1/1 Running 0 1m
example-memcached-6fd7c98d8-g5k7v 1/1 Running 0 1m
example-memcached-6fd7c98d8-m7vn7 1/1 Running 0 1m
memcached-operator-7cc7cfdf86-vvjqk 1/1 Running 0 2m
Update the size.
Change the spec.size field in the memcached CR from 3 to 4 and apply the change:
$ cat deploy/crds/cache_v1alpha1_memcached_cr.yaml
Example output
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 4
$ oc apply -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
Confirm that the Operator changes the deployment size:
$ oc get deployment
Example output
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
example-memcached 4 4 4 4 5m
Clean up the resources:
$ oc delete -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
$ oc delete -f deploy/operator.yaml
$ oc delete -f deploy/role_binding.yaml
$ oc delete -f deploy/role.yaml
$ oc delete -f deploy/service_account.yaml
$ oc delete -f deploy/crds/cache.example.com_memcacheds_crd.yaml
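The procedure above recommends validating spec variables in Ansible before they are used. One possible way to do that for the size variable is an assert task placed ahead of the deployment task in roles/memcached/tasks/main.yml. This is a sketch under the assumption that size should be a positive whole number; the task name and failure message are illustrative.

```yaml
# Assumed placement: first task in roles/memcached/tasks/main.yml.
- name: validate the size variable
  assert:
    that:
      # round-tripping through int rejects values such as "three" or 1.5
      - (size | int | string) == (size | string)
      - (size | int) > 0
    fail_msg: "spec.size must be a positive integer, got '{{ size }}'"
```

If the assertion fails, the run stops before the deployment task executes, and the failure appears in the Ansible output for that reconciliation.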
Managing application lifecycle using the k8s Ansible module
To manage the lifecycle of your application on Kubernetes using Ansible, you can use the k8s Ansible module. This Ansible module allows a developer to either leverage their existing Kubernetes resource files (written in YAML) or express the lifecycle management in native Ansible.
One of the biggest benefits of using Ansible in conjunction with existing Kubernetes resource files is the ability to use Jinja templating so that you can customize resources with the simplicity of a few variables in Ansible.
This section goes into detail on usage of the k8s Ansible module. To get started, install the module on your local workstation and test it using a playbook before moving on to using it within an Operator.
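To illustrate the two approaches, the tasks below apply an existing resource file unchanged and render a templated one through Jinja. This is a sketch only: the file names files/configmap.yml and templates/deployment.yml.j2 are hypothetical and would have to exist in your role.

```yaml
# Apply an existing Kubernetes resource file as-is.
- name: create resources from a static manifest
  k8s:
    state: present
    src: "{{ role_path }}/files/configmap.yml"

# Render a Jinja template first, then pass the parsed result to the module.
- name: create a deployment from a template
  k8s:
    state: present
    definition: "{{ lookup('template', 'deployment.yml.j2') | from_yaml }}"
```

Inside the template, any Ansible variable, such as a value from the CR spec, can be referenced with the usual {{ variable }} syntax.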
Installing the k8s Ansible module
To install the k8s Ansible module on your local workstation:
Procedure
Install Ansible 2.9+:
$ sudo yum install ansible
Install the OpenShift Python client package using pip:
$ sudo pip install openshift
$ sudo pip install kubernetes
Testing the k8s Ansible module locally
Sometimes, it is beneficial for a developer to run the Ansible code from their local machine as opposed to running and rebuilding the Operator each time.
Procedure
Install the community.kubernetes collection:
$ ansible-galaxy collection install community.kubernetes
Initialize a new Ansible-based Operator project:
$ operator-sdk new --type ansible \
  --kind Test1 \
  --api-version test1.example.com/v1alpha1 test1-operator
Example output
Create test1-operator/tmp/init/galaxy-init.sh
Create test1-operator/tmp/build/Dockerfile
Create test1-operator/tmp/build/test-framework/Dockerfile
Create test1-operator/tmp/build/go-test.sh
Rendering Ansible Galaxy role [test1-operator/roles/test1]...
Cleaning up test1-operator/tmp/init
Create test1-operator/watches.yaml
Create test1-operator/deploy/rbac.yaml
Create test1-operator/deploy/crd.yaml
Create test1-operator/deploy/cr.yaml
Create test1-operator/deploy/operator.yaml
Run git init ...
Initialized empty Git repository in /home/user/go/src/github.com/user/opsdk/test1-operator/.git/
Run git init done
$ cd test1-operator
Modify the roles/test1/tasks/main.yml file with the Ansible logic that you want. This example creates and deletes a namespace with the switch of a variable.
- name: set test namespace to "{{ state }}"
  community.kubernetes.k8s:
    api_version: v1
    kind: Namespace
    state: "{{ state }}"
    name: test
  ignore_errors: true # (1)
(1) Setting ignore_errors: true ensures that deleting a nonexistent project does not fail.
Modify the roles/test1/defaults/main.yml file to set state to present by default:
state: present
Create an Ansible playbook playbook.yml in the top-level directory, which includes the test1 role:
- hosts: localhost
  roles:
    - test1
Run the playbook:
$ ansible-playbook playbook.yml
Example output
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
PLAY [localhost] ***************************************************************************
TASK [Gathering Facts] *********************************************************************
ok: [localhost]
TASK [test1 : set test namespace to present]
changed: [localhost]
PLAY RECAP *********************************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=0
Check that the namespace was created:
$ oc get namespace
Example output
NAME STATUS AGE
default Active 28d
kube-public Active 28d
kube-system Active 28d
test Active 3s
Rerun the playbook setting state to absent:
$ ansible-playbook playbook.yml --extra-vars state=absent
Example output
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
PLAY [localhost] ***************************************************************************
TASK [Gathering Facts] *********************************************************************
ok: [localhost]
TASK [test1 : set test namespace to absent]
changed: [localhost]
PLAY RECAP *********************************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=0
Check that the namespace was deleted:
$ oc get namespace
Example output
NAME STATUS AGE
default Active 28d
kube-public Active 28d
kube-system Active 28d
Testing the k8s Ansible module inside an Operator
After you are familiar with using the k8s Ansible module locally, you can trigger the same Ansible logic inside of an Operator when a custom resource (CR) changes. This example maps an Ansible role to a specific Kubernetes resource that the Operator watches. This mapping is done in the watches.yaml file.
Testing an Ansible-based Operator locally
After getting comfortable testing Ansible workflows locally, you can test the logic inside of an Ansible-based Operator running locally.
To do so, use the operator-sdk run --local command from the top-level directory of your Operator project. This command reads from the watches.yaml file and uses the ~/.kube/config file to communicate with a Kubernetes cluster just as the k8s Ansible module does.
Procedure
Because the run --local command reads from the watches.yaml file, there are options available to the Operator author. If role is left alone (by default, /opt/ansible/roles/<name>) you must copy the role over to the /opt/ansible/roles/ directory from the Operator directly.
This is cumbersome because changes are not reflected from the current directory. Instead, change the role field to point to the current directory and comment out the existing line:
- version: v1alpha1
  group: test1.example.com
  kind: Test1
  # role: /opt/ansible/roles/Test1
  role: /home/user/test1-operator/Test1
Create a custom resource definition (CRD) and proper role-based access control (RBAC) definitions for the custom resource (CR) Test1. The operator-sdk command autogenerates these files inside of the deploy/ directory:
$ oc create -f deploy/crds/test1_v1alpha1_test1_crd.yaml
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
Run the run --local command:
$ operator-sdk run --local
Example output
[...]
INFO[0000] Starting to serve on 127.0.0.1:8888
INFO[0000] Watching test1.example.com/v1alpha1, Test1, default
Now that the Operator is watching the resource Test1 for events, the creation of a CR triggers your Ansible role to execute. View the deploy/cr.yaml file:
apiVersion: "test1.example.com/v1alpha1"
kind: "Test1"
metadata:
  name: "example"
Because the spec field is not set, Ansible is invoked with no extra variables, which is why it is important to set reasonable defaults for the Operator. The "Extra variables sent to Ansible" section covers how extra variables are passed from a CR to Ansible.
Create a CR instance of Test1 with the default variable state set to present:
$ oc create -f deploy/cr.yaml
Check that the namespace test was created:
$ oc get namespace
Example output
NAME STATUS AGE
default Active 28d
kube-public Active 28d
kube-system Active 28d
test Active 3s
Modify the deploy/cr.yaml file to set the state field to absent:
apiVersion: "test1.example.com/v1alpha1"
kind: "Test1"
metadata:
  name: "example"
spec:
  state: "absent"
Apply the changes and confirm that the namespace is deleted:
$ oc apply -f deploy/cr.yaml
$ oc get namespace
Example output
NAME STATUS AGE
default Active 28d
kube-public Active 28d
kube-system Active 28d
Testing an Ansible-based Operator on a cluster
After getting familiar with running Ansible logic inside of an Ansible-based Operator locally, you can test the Operator inside of a pod on a Kubernetes cluster, such as OKD. Running as a pod on a cluster is preferred for production use.
Procedure
Build the test1-operator image and push it to a registry:
$ operator-sdk build quay.io/example/test1-operator:v0.0.1
$ podman push quay.io/example/test1-operator:v0.0.1
Deployment manifests are generated in the deploy/operator.yaml file. The deployment image in this file must be modified from the placeholder REPLACE_IMAGE to the previously built image. To do so, run the following command:
$ sed -i 's|REPLACE_IMAGE|quay.io/example/test1-operator:v0.0.1|g' deploy/operator.yaml
If you are performing these steps on macOS, use the following command instead:
$ sed -i "" 's|REPLACE_IMAGE|quay.io/example/test1-operator:v0.0.1|g' deploy/operator.yaml
Deploy the test1-operator:
$ oc create -f deploy/crds/test1_v1alpha1_test1_crd.yaml # (1)
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
$ oc create -f deploy/operator.yaml
(1) Only required if the CRD does not exist already.
Verify that the test1-operator is up and running:
$ oc get deployment
Example output
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
test1-operator 1 1 1 1 1m
You can now view the Ansible logs for the test1-operator:
$ oc logs deployment/test1-operator
Managing custom resource status using the operator_sdk.util Ansible collection
Ansible-based Operators automatically update custom resource (CR) status subresources with generic information about the previous Ansible run. This includes the number of successful and failed tasks and relevant error messages as shown:
status:
  conditions:
    - ansibleResult:
        changed: 3
        completion: 2018-12-03T13:45:57.13329
        failures: 1
        ok: 6
        skipped: 0
      lastTransitionTime: 2018-12-03T13:45:57Z
      message: 'Status code was -1 and not [200]: Request failed: <urlopen error [Errno
        113] No route to host>'
      reason: Failed
      status: "True"
      type: Failure
    - lastTransitionTime: 2018-12-03T13:46:13Z
      message: Running reconciliation
      reason: Running
      status: "True"
      type: Running
Ansible-based Operators also allow Operator authors to supply custom status values with the k8s_status Ansible module, which is included in the operator_sdk.util collection. This allows the author to update the status from within Ansible with any key-value pair as desired.
By default, Ansible-based Operators always include the generic Ansible run output as shown above. If you would prefer your application did not update the status with Ansible output, you can track the status manually from your application.
Procedure
To track CR status manually from your application, update the watches.yaml file with a manageStatus field set to false:
- version: v1
  group: api.example.com
  kind: Test1
  role: Test1
  manageStatus: false
Use the operator_sdk.util.k8s_status Ansible module to update the subresource. For example, to update with key test1 and value test2, operator_sdk.util can be used as shown:
- operator_sdk.util.k8s_status:
    api_version: app.example.com/v1
    kind: Test1
    name: "{{ meta.name }}"
    namespace: "{{ meta.namespace }}"
    status:
      test1: test2
Collections can also be declared in the meta/main.yml for the role, which is included for new scaffolded Ansible Operators:
collections:
  - operator_sdk.util
Declaring collections in the role meta allows you to invoke the k8s_status module directly:
k8s_status:
  <snip>
  status:
    test1: test2
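As a slightly more realistic sketch of a custom status value, the tasks below read the deployment from the earlier Memcached example and copy its ready replica count into the CR status. The status key readyReplicas on the CR and the task names are assumptions chosen for illustration, and the value may be stored as a string depending on how Ansible renders the template.

```yaml
# Hypothetical tasks appended to roles/memcached/tasks/main.yml.
- name: look up the deployment created for this CR
  k8s_info:
    api_version: apps/v1
    kind: Deployment
    name: "{{ meta.name }}-memcached"
    namespace: "{{ meta.namespace }}"
  register: memcached_deployment

- name: record the ready replica count in the CR status
  operator_sdk.util.k8s_status:
    api_version: cache.example.com/v1alpha1
    kind: Memcached
    name: "{{ meta.name }}"
    namespace: "{{ meta.namespace }}"
    status:
      readyReplicas: "{{ memcached_deployment.resources[0].status.readyReplicas | default(0) }}"
  # skip the update until the deployment actually exists
  when: memcached_deployment.resources | length > 0
```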
Additional resources
- For more details about user-driven status management from Ansible-based Operators, see the Ansible-based Operator Status Proposal for Operator SDK.
Additional resources
See Appendices to learn about the project directory structures created by the Operator SDK.
Reaching for the Stars with Ansible Operator - Red Hat OpenShift Blog