This guide outlines Ansible support in the Operator SDK and walks Operator authors through examples of building and running Ansible-based Operators, which use Ansible playbooks and modules, with the operator-sdk CLI tool.
The Operator Framework is an open source toolkit to manage Kubernetes native applications, called Operators, in an effective, automated, and scalable way. This framework includes the Operator SDK, which assists developers in bootstrapping and building an Operator based on their expertise without requiring knowledge of Kubernetes API complexities.
One of the Operator SDK options for generating an Operator project is to leverage existing Ansible playbooks and modules to deploy Kubernetes resources as a unified application, without having to write any Go code.
Operators use the Kubernetes extension mechanism, custom resource definitions (CRDs), so your custom resource (CR) looks and acts just like the built-in, native Kubernetes objects.
The CR file format is a Kubernetes resource file. The object has mandatory and optional fields:
Field | Description |
---|---|
apiVersion | Version of the CR to be created. |
kind | Kind of the CR to be created. |
metadata | Kubernetes-specific metadata to be created. |
spec | Key-value list of variables which are passed to Ansible. This field is empty by default. |
status | Summarizes the current state of the object. For Ansible-based Operators, the Operator updates this field with information about the previous Ansible run unless status management is disabled. |
annotations | Kubernetes-specific annotations to be appended to the CR. |
The following CR annotations modify the behavior of the Operator:
Annotation | Description |
---|---|
ansible.operator-sdk/reconcile-period | Specifies the reconciliation interval for the CR. This value is parsed using the standard Golang time package. |
apiVersion: "test1.example.com/v1alpha1"
kind: "Test1"
metadata:
  name: "example"
  annotations:
    ansible.operator-sdk/reconcile-period: "30s"
watches.yaml file
A group/version/kind (GVK) is a unique identifier for a Kubernetes API. The watches.yaml file contains a list of mappings from custom resources (CRs), each identified by its GVK, to an Ansible role or playbook. The Operator expects this mapping file in a predefined location at /opt/ansible/watches.yaml.
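Field | Description |
---|---|
group | Group of CR to watch. |
version | Version of CR to watch. |
kind | Kind of CR to watch. |
role | Path to the Ansible role added to the container, for example /opt/ansible/roles/Test1. This field is mutually exclusive with the playbook field. |
playbook | Path to the Ansible playbook added to the container. This playbook is expected to be a way to call roles. This field is mutually exclusive with the role field. |
reconcilePeriod | The reconciliation interval, how often the role or playbook is run, for a given CR. |
manageStatus | When set to true, which is the default behavior, the Operator manages the CR status with generic Ansible run information. When set to false, the status is managed elsewhere, for example by the role or playbook. |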
Example watches.yaml file:
- version: v1alpha1 # (1)
  group: test1.example.com
  kind: Test1
  role: /opt/ansible/roles/Test1
- version: v1alpha1 # (2)
  group: test2.example.com
  kind: Test2
  playbook: /opt/ansible/playbook.yml
- version: v1alpha1 # (3)
  group: test3.example.com
  kind: Test3
  playbook: /opt/ansible/test3.yml
  reconcilePeriod: 0
  manageStatus: false
1 | Simple example mapping Test1 to the test1 role. |
2 | Simple example mapping Test2 to a playbook. |
3 | More complex example for the Test3 kind. Disables re-queuing and managing the CR status in the playbook. |
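The mapping from a GVK to a role or playbook can be pictured as a simple lookup. The following Python sketch is illustrative only and is not how the Operator itself is implemented: WATCHES is a hand-written copy of the example entries above, and find_watch and ansible_target are hypothetical helper names.

```python
# Hypothetical in-memory copy of the example watches.yaml entries above.
WATCHES = [
    {"version": "v1alpha1", "group": "test1.example.com", "kind": "Test1",
     "role": "/opt/ansible/roles/Test1"},
    {"version": "v1alpha1", "group": "test2.example.com", "kind": "Test2",
     "playbook": "/opt/ansible/playbook.yml"},
    {"version": "v1alpha1", "group": "test3.example.com", "kind": "Test3",
     "playbook": "/opt/ansible/test3.yml",
     "reconcilePeriod": 0, "manageStatus": False},
]


def find_watch(watches, api_version, kind):
    """Return the watch entry whose group/version/kind matches, or None."""
    group, _, version = api_version.rpartition("/")
    for entry in watches:
        if (entry["group"], entry["version"], entry["kind"]) == (group, version, kind):
            return entry
    return None


def ansible_target(entry):
    """Return ("role", path) or ("playbook", path) for a watch entry."""
    if "role" in entry and "playbook" in entry:
        raise ValueError("role and playbook are mutually exclusive")
    for key in ("role", "playbook"):
        if key in entry:
            return key, entry[key]
    return None
```

For example, looking up test2.example.com/v1alpha1 with kind Test2 returns the second entry, whose target is the playbook at /opt/ansible/playbook.yml.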
Advanced features can be enabled by adding them to your watches.yaml file per GVK. They can go below the group, version, kind, and playbook or role fields.
Some features can be overridden per resource using an annotation on that CR. The options that can be overridden have the annotation specified below.
Feature | YAML key | Description | Annotation for override | Default value |
---|---|---|---|---|
Reconcile period | reconcilePeriod | Time between reconcile runs for a particular CR. | ansible.operator-sdk/reconcile-period | |
Manage status | manageStatus | Allows the Operator to manage the conditions section of each CR status. | | |
Watch dependent resources | watchDependentResources | Allows the Operator to dynamically watch resources that are created by Ansible. | | |
Watch cluster-scoped resources | watchClusterScopedResources | Allows the Operator to watch cluster-scoped resources that are created by Ansible. | | |
Max runner artifacts | maxRunnerArtifacts | Manages the number of artifact directories that Ansible Runner keeps in the Operator container for each individual resource. | | |
Example watches.yaml file with advanced options:
- version: v1alpha1
  group: app.example.com
  kind: AppService
  playbook: /opt/ansible/playbook.yml
  maxRunnerArtifacts: 30
  reconcilePeriod: 5s
  manageStatus: false
  watchDependentResources: false
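The override rule for the reconcile period can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Operator's code: parse_duration handles only a subset of the duration strings that Go accepts (the units ms, s, m, and h, plus a bare 0), and both function names are hypothetical.

```python
import re

# Seconds per unit for a subset of the units accepted by Go duration strings.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")

RECONCILE_ANNOTATION = "ansible.operator-sdk/reconcile-period"


def parse_duration(text):
    """Convert a string such as "30s" or "1h30m" to seconds (simplified)."""
    text = str(text)
    if text == "0":
        return 0.0
    pos, total = 0, 0.0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total


def effective_reconcile_period(watch_entry, cr):
    """Prefer the CR annotation; fall back to reconcilePeriod in the watch entry."""
    annotations = cr.get("metadata", {}).get("annotations") or {}
    if RECONCILE_ANNOTATION in annotations:
        return parse_duration(annotations[RECONCILE_ANNOTATION])
    if "reconcilePeriod" in watch_entry:
        return parse_duration(watch_entry["reconcilePeriod"])
    return None  # neither is set, so the Operator's own default applies
```

With the advanced watches.yaml entry above (reconcilePeriod: 5s), a CR that carries the annotation with the value 30s would be reconciled on the annotated interval, while a CR without the annotation keeps the 5 second interval.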
Extra variables can be sent to Ansible, which are then managed by the Operator. The spec section of the custom resource (CR) passes along the key-value pairs as extra variables. This is equivalent to extra variables passed in to the ansible-playbook command.
The Operator also passes along additional variables under the meta field for the name of the CR and the namespace of the CR.
For the following CR example:
apiVersion: "app.example.com/v1alpha1"
kind: "Database"
metadata:
  name: "example"
spec:
  message: "Hello world 2"
  newParameter: "newParam"
The structure passed to Ansible as extra variables is:
{ "meta": {
    "name": "<cr_name>",
    "namespace": "<cr_namespace>",
  },
  "message": "Hello world 2",
  "new_parameter": "newParam",
  "_app_example_com_database": {
    <full_crd>
  },
}
The message and newParameter fields are set in the top level as extra variables, and meta provides the relevant metadata for the CR as defined in the Operator. The meta fields can be accessed using dot notation in Ansible, for example:
- debug:
    msg: "name: {{ meta.name }}, {{ meta.namespace }}"
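To make the shape of the extra variables concrete, the following Python sketch assembles the same structure from a CR expressed as a dictionary. It is an approximation for illustration: the function names are hypothetical, the snake case conversion covers only simple camelCase keys, and the rule for the key that holds the full CR is inferred from the single example above.

```python
import re


def to_snake_case(name):
    """Convert a simple camelCase key to snake_case (acronyms are not handled specially)."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def build_extra_vars(cr):
    """Assemble a dict shaped like the extra-variables structure shown above."""
    group = cr["apiVersion"].split("/")[0]
    extra_vars = {
        "meta": {
            "name": cr["metadata"]["name"],
            "namespace": cr["metadata"].get("namespace", ""),
        }
    }
    # Each spec key becomes a top-level variable with a snake_case name.
    for key, value in cr.get("spec", {}).items():
        extra_vars[to_snake_case(key)] = value
    # Assumption: the full CR sits under "_<group with dots as underscores>_<lowercase kind>".
    full_key = "_" + group.replace(".", "_") + "_" + cr["kind"].lower()
    extra_vars[full_key] = cr
    return extra_vars


example_cr = {
    "apiVersion": "app.example.com/v1alpha1",
    "kind": "Database",
    "metadata": {"name": "example", "namespace": "default"},
    "spec": {"message": "Hello world 2", "newParameter": "newParam"},
}
extra_vars = build_extra_vars(example_cr)
```

Here extra_vars contains message, new_parameter, meta, and _app_example_com_database at the top level, which matches the structure shown earlier. The namespace value default is an assumption made for the example.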
Ansible Runner keeps information about Ansible runs in the container. This is located at /tmp/ansible-operator/runner/<group>/<version>/<kind>/<namespace>/<name>.
To learn more about the runner directory, see the Ansible Runner documentation.
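As a small illustration of the path layout, the following Python sketch builds the runner directory for one CR. The function name runner_dir is hypothetical, and the example values (a Memcached CR named example-memcached in the default namespace) are assumptions chosen to match the walkthrough in this guide.

```python
from pathlib import PurePosixPath

# Root of the per-resource runner directories, as given in the text above.
RUNNER_ROOT = PurePosixPath("/tmp/ansible-operator/runner")


def runner_dir(group, version, kind, namespace, name):
    """Return the directory that holds Ansible Runner information for one CR."""
    return RUNNER_ROOT / group / version / kind / namespace / name


path = runner_dir("cache.example.com", "v1alpha1", "Memcached",
                  "default", "example-memcached")
print(path)
```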
This procedure walks through an example of building a simple Memcached Operator powered by Ansible playbooks and modules using tools and libraries provided by the Operator SDK.
Operator SDK v0.19.4 CLI installed on the development workstation
Access to a Kubernetes-based cluster v1.11.3+ (for example, OpenShift Container Platform 4.6) using an account with cluster-admin permissions
OpenShift CLI (oc) v4.6+ installed
ansible v2.9.0+
ansible-runner v1.1.0+
ansible-runner-http v1.0.0+
Create a new Operator project. A namespace-scoped Operator watches and manages resources in a single namespace. Namespace-scoped Operators are preferred because of their flexibility. They enable decoupled upgrades, namespace isolation for failures and monitoring, and differing API definitions.
To create a new Ansible-based, namespace-scoped memcached-operator project and change to the new directory, use the following commands:
$ operator-sdk new memcached-operator \
--api-version=cache.example.com/v1alpha1 \
--kind=Memcached \
--type=ansible
$ cd memcached-operator
This creates the memcached-operator project specifically for watching the Memcached resource with API version cache.example.com/v1alpha1 and kind Memcached.
Customize the Operator logic.
For this example, the memcached-operator executes the following reconciliation logic for each Memcached custom resource (CR):
Create a memcached deployment if it does not exist.
Ensure that the deployment size is the same as specified by the Memcached CR.
By default, the memcached-operator watches Memcached resource events as shown in the watches.yaml file and executes the Ansible role Memcached:
- version: v1alpha1
  group: cache.example.com
  kind: Memcached
You can optionally customize the following logic in the watches.yaml file:
Specifying a role option configures the Operator to use this specified path when launching ansible-runner with an Ansible role. By default, the operator-sdk new command fills in an absolute path to where your role should go:
- version: v1alpha1
  group: cache.example.com
  kind: Memcached
  role: /opt/ansible/roles/memcached
Specifying a playbook option in the watches.yaml file configures the Operator to use this specified path when launching ansible-runner with an Ansible playbook:
- version: v1alpha1
  group: cache.example.com
  kind: Memcached
  playbook: /opt/ansible/playbook.yaml
Build the Memcached Ansible role.
Modify the generated Ansible role under the roles/memcached/ directory. This Ansible role controls the logic that is executed when a resource is modified.
Define the Memcached spec.
Defining the spec for an Ansible-based Operator can be done entirely in Ansible. The Ansible Operator passes all key-value pairs listed in the CR spec field along to Ansible as variables. The names of all variables in the spec field are converted to snake case (lowercase with an underscore) by the Operator before running Ansible. For example, serviceAccount in the spec becomes service_account in Ansible.
You should perform some type validation in Ansible on the variables to ensure that your application is receiving expected input.
In case the user does not set the spec field, set a default by modifying the roles/memcached/defaults/main.yml file:
size: 1
Define the Memcached deployment.
With the Memcached spec now defined, you can define what Ansible actually executes on resource changes. Because this is an Ansible role, the default behavior executes the tasks in the roles/memcached/tasks/main.yml file.
The goal is for Ansible to create a deployment if it does not exist, which runs the memcached:1.4.36-alpine image. Ansible 2.7+ supports the k8s Ansible module, which this example leverages to control the deployment definition.
Modify the roles/memcached/tasks/main.yml file to match the following:
- name: start memcached
  k8s:
    definition:
      kind: Deployment
      apiVersion: apps/v1
      metadata:
        name: '{{ meta.name }}-memcached'
        namespace: '{{ meta.namespace }}'
      spec:
        replicas: "{{size}}"
        selector:
          matchLabels:
            app: memcached
        template:
          metadata:
            labels:
              app: memcached
          spec:
            containers:
            - name: memcached
              command:
              - memcached
              - -m=64
              - -o
              - modern
              - -v
              image: "docker.io/memcached:1.4.36-alpine"
              ports:
                - containerPort: 11211
This example uses the size variable to control the number of replicas of the Memcached deployment.
Deploy the CRD.
Before running the Operator, Kubernetes needs to know about the new custom resource definition (CRD) that the Operator will be watching. Deploy the Memcached CRD:
$ oc create -f deploy/crds/cache.example.com_memcacheds_crd.yaml
Build and run the Operator.
There are two ways to build and run the Operator:
As a pod inside a Kubernetes cluster.
As a Go program outside the cluster using the operator-sdk run --local command.
Choose one of the following methods:
Run as a pod inside a Kubernetes cluster. This is the preferred method for production use.
Build the memcached-operator image and push it to a registry:
$ operator-sdk build quay.io/example/memcached-operator:v0.0.1
$ podman push quay.io/example/memcached-operator:v0.0.1
Deployment manifests are generated in the deploy/operator.yaml file. The deployment image in this file needs to be modified from the placeholder REPLACE_IMAGE to the previously built image. To do this, run:
$ sed -i 's|REPLACE_IMAGE|quay.io/example/memcached-operator:v0.0.1|g' deploy/operator.yaml
Deploy the memcached-operator manifests:
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
$ oc create -f deploy/operator.yaml
Verify that the memcached-operator deployment is up and running:
$ oc get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
memcached-operator 1 1 1 1 1m
Run outside the cluster. This method is preferred during the development cycle to speed up deployment and testing.
Ensure that Ansible Runner and the Ansible Runner HTTP plug-in are installed. Otherwise, Ansible Runner reports unexpected errors when a CR is created.
It is also important that the role path referenced in the watches.yaml file exists on your machine. Because the role is normally placed on disk inside a container, you must manually copy it to the configured Ansible roles path (for example, /etc/ansible/roles).
To run the Operator locally with the default Kubernetes configuration file present at $HOME/.kube/config:
$ operator-sdk run --local
To run the Operator locally with a provided Kubernetes configuration file:
$ operator-sdk run --local --kubeconfig=config
Create a Memcached CR.
Modify the deploy/crds/cache_v1alpha1_memcached_cr.yaml file as shown and create a Memcached CR:
$ cat deploy/crds/cache_v1alpha1_memcached_cr.yaml
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 3
$ oc apply -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
Ensure that the memcached-operator creates the deployment for the CR:
$ oc get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
memcached-operator 1 1 1 1 2m
example-memcached 3 3 3 3 1m
Check the pods to confirm three replicas were created:
$ oc get pods
NAME READY STATUS RESTARTS AGE
example-memcached-6fd7c98d8-7dqdr 1/1 Running 0 1m
example-memcached-6fd7c98d8-g5k7v 1/1 Running 0 1m
example-memcached-6fd7c98d8-m7vn7 1/1 Running 0 1m
memcached-operator-7cc7cfdf86-vvjqk 1/1 Running 0 2m
Update the size.
Change the spec.size field in the memcached CR from 3 to 4 and apply the change:
$ cat deploy/crds/cache_v1alpha1_memcached_cr.yaml
apiVersion: "cache.example.com/v1alpha1"
kind: "Memcached"
metadata:
  name: "example-memcached"
spec:
  size: 4
$ oc apply -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
Confirm that the Operator changes the deployment size:
$ oc get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
example-memcached 4 4 4 4 5m
Clean up the resources:
$ oc delete -f deploy/crds/cache_v1alpha1_memcached_cr.yaml
$ oc delete -f deploy/operator.yaml
$ oc delete -f deploy/role_binding.yaml
$ oc delete -f deploy/role.yaml
$ oc delete -f deploy/service_account.yaml
$ oc delete -f deploy/crds/cache.example.com_memcacheds_crd.yaml
To manage the lifecycle of your application on Kubernetes using Ansible, you can use the k8s Ansible module. This Ansible module allows a developer to either leverage their existing Kubernetes resource files (written in YAML) or express the lifecycle management in native Ansible.
One of the biggest benefits of using Ansible in conjunction with existing Kubernetes resource files is the ability to use Jinja templating so that you can customize resources with the simplicity of a few variables in Ansible.
This section goes into detail on usage of the k8s Ansible module. To get started, install the module on your local workstation and test it using a playbook before moving on to using it within an Operator.
To install the k8s Ansible module on your local workstation:
Install Ansible 2.9+:
$ sudo yum install ansible
Install the OpenShift Python client package using pip:
$ sudo pip install openshift
$ sudo pip install kubernetes
Sometimes, it is beneficial for a developer to run the Ansible code from their local machine as opposed to running and rebuilding the Operator each time.
Install the community.kubernetes collection:
$ ansible-galaxy collection install community.kubernetes
Initialize a new Ansible-based Operator project:
$ operator-sdk new --type ansible \
--kind Test1 \
--api-version test1.example.com/v1alpha1 test1-operator
Create test1-operator/tmp/init/galaxy-init.sh
Create test1-operator/tmp/build/Dockerfile
Create test1-operator/tmp/build/test-framework/Dockerfile
Create test1-operator/tmp/build/go-test.sh
Rendering Ansible Galaxy role [test1-operator/roles/test1]...
Cleaning up test1-operator/tmp/init
Create test1-operator/watches.yaml
Create test1-operator/deploy/rbac.yaml
Create test1-operator/deploy/crd.yaml
Create test1-operator/deploy/cr.yaml
Create test1-operator/deploy/operator.yaml
Run git init ...
Initialized empty Git repository in /home/user/go/src/github.com/user/opsdk/test1-operator/.git/
Run git init done
$ cd test1-operator
Modify the roles/test1/tasks/main.yml file with the Ansible logic that you want. This example creates and deletes a namespace with the switch of a variable.
- name: set test namespace to "{{ state }}"
  community.kubernetes.k8s:
    api_version: v1
    kind: Namespace
    state: "{{ state }}"
    name: test
  ignore_errors: true # (1)
1 | Setting ignore_errors: true ensures that deleting a nonexistent project does not fail. |
Modify the roles/test1/defaults/main.yml file to set state to present by default:
state: present
Create an Ansible playbook playbook.yml in the top-level directory, which includes the test1 role:
- hosts: localhost
  roles:
  - test1
Run the playbook:
$ ansible-playbook playbook.yml
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
PLAY [localhost] ***************************************************************************
TASK [Gathering Facts] *********************************************************************
ok: [localhost]
TASK [test1 : set test namespace to present]
changed: [localhost]
PLAY RECAP *********************************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=0
Check that the namespace was created:
$ oc get namespace
NAME STATUS AGE
default Active 28d
kube-public Active 28d
kube-system Active 28d
test Active 3s
Rerun the playbook, setting state to absent:
$ ansible-playbook playbook.yml --extra-vars state=absent
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
PLAY [localhost] ***************************************************************************
TASK [Gathering Facts] *********************************************************************
ok: [localhost]
TASK [test1 : set test namespace to absent]
changed: [localhost]
PLAY RECAP *********************************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=0
Check that the namespace was deleted:
$ oc get namespace
NAME STATUS AGE
default Active 28d
kube-public Active 28d
kube-system Active 28d
After you are familiar with using the k8s Ansible module locally, you can trigger the same Ansible logic inside of an Operator when a custom resource (CR) changes. This example maps an Ansible role to a specific Kubernetes resource that the Operator watches. This mapping is done in the watches.yaml file.
After getting comfortable testing Ansible workflows locally, you can test the logic inside of an Ansible-based Operator running locally.
To do so, use the operator-sdk run --local command from the top-level directory of your Operator project. This command reads from the watches.yaml file and uses the ~/.kube/config file to communicate with a Kubernetes cluster just as the k8s Ansible module does.
Because the run --local command reads from the watches.yaml file, there are options available to the Operator author. If role is left alone (by default, /opt/ansible/roles/<name>), you must copy the role from the Operator project to the /opt/ansible/roles/ directory.
This is cumbersome because changes made in the current directory are not reflected in the copy. Instead, change the role field to point to the current directory and comment out the existing line:
- version: v1alpha1
  group: test1.example.com
  kind: Test1
  # role: /opt/ansible/roles/Test1
  role: /home/user/test1-operator/Test1
Create a custom resource definition (CRD) and proper role-based access control (RBAC) definitions for the custom resource (CR) Test1. The operator-sdk command autogenerates these files inside of the deploy/ directory:
$ oc create -f deploy/crds/test1_v1alpha1_test1_crd.yaml
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
Run the operator-sdk run --local command:
$ operator-sdk run --local
[...]
INFO[0000] Starting to serve on 127.0.0.1:8888
INFO[0000] Watching test1.example.com/v1alpha1, Test1, default
Now that the Operator is watching the resource Test1 for events, the creation of a CR triggers your Ansible role to execute. View the deploy/cr.yaml file:
apiVersion: "test1.example.com/v1alpha1"
kind: "Test1"
metadata:
  name: "example"
Because the spec field is not set, Ansible is invoked with no extra variables, which is why it is important to set reasonable defaults for the Operator. The next section covers how extra variables are passed from a CR to Ansible.
Create a CR instance of Test1 with the default variable state set to present:
$ oc create -f deploy/cr.yaml
Check that the namespace test was created:
$ oc get namespace
NAME STATUS AGE
default Active 28d
kube-public Active 28d
kube-system Active 28d
test Active 3s
Modify the deploy/cr.yaml file to set the state field to absent:
apiVersion: "test1.example.com/v1alpha1"
kind: "Test1"
metadata:
  name: "example"
spec:
  state: "absent"
Apply the changes and confirm that the namespace is deleted:
$ oc apply -f deploy/cr.yaml
$ oc get namespace
NAME STATUS AGE
default Active 28d
kube-public Active 28d
kube-system Active 28d
After you are familiar with running Ansible logic inside of an Ansible-based Operator locally, you can test the Operator inside of a pod on a Kubernetes cluster, such as OpenShift Container Platform. Running as a pod on a cluster is preferred for production use.
Build the test1-operator image and push it to a registry:
$ operator-sdk build quay.io/example/test1-operator:v0.0.1
$ podman push quay.io/example/test1-operator:v0.0.1
Deployment manifests are generated in the deploy/operator.yaml file. The deployment image in this file must be modified from the placeholder REPLACE_IMAGE to the previously built image. To do so, run the following command:
$ sed -i 's|REPLACE_IMAGE|quay.io/example/test1-operator:v0.0.1|g' deploy/operator.yaml
If you are performing these steps on macOS, use the following command instead:
$ sed -i "" 's|REPLACE_IMAGE|quay.io/example/test1-operator:v0.0.1|g' deploy/operator.yaml
Deploy the test1-operator:
$ oc create -f deploy/crds/test1_v1alpha1_test1_crd.yaml # (1)
$ oc create -f deploy/service_account.yaml
$ oc create -f deploy/role.yaml
$ oc create -f deploy/role_binding.yaml
$ oc create -f deploy/operator.yaml
1 | Only required if the CRD does not exist already. |
Verify that the test1-operator is up and running:
$ oc get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
test1-operator 1 1 1 1 1m
You can now view the Ansible logs for the test1-operator:
$ oc logs deployment/test1-operator
operator_sdk.util Ansible collection
Ansible-based Operators automatically update custom resource (CR) status subresources with generic information about the previous Ansible run. This includes the number of successful and failed tasks and relevant error messages as shown:
status:
  conditions:
  - ansibleResult:
      changed: 3
      completion: 2018-12-03T13:45:57.13329
      failures: 1
      ok: 6
      skipped: 0
    lastTransitionTime: 2018-12-03T13:45:57Z
    message: 'Status code was -1 and not [200]: Request failed: <urlopen error [Errno
      113] No route to host>'
    reason: Failed
    status: "True"
    type: Failure
  - lastTransitionTime: 2018-12-03T13:46:13Z
    message: Running reconciliation
    reason: Running
    status: "True"
    type: Running
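A consumer of this status might want to pull out only the failures. The following Python sketch shows one way to do that, assuming the status has already been loaded into a dictionary with the same keys as the example above. The function name failed_conditions is hypothetical.

```python
def failed_conditions(status):
    """Return (message, failure_count) for each Failure condition whose status is "True"."""
    results = []
    for cond in status.get("conditions", []):
        if cond.get("type") == "Failure" and cond.get("status") == "True":
            failures = cond.get("ansibleResult", {}).get("failures", 0)
            results.append((cond.get("message", ""), failures))
    return results


# Hand-written dictionary mirroring the example status above (message shortened).
example_status = {
    "conditions": [
        {
            "ansibleResult": {"changed": 3, "failures": 1, "ok": 6, "skipped": 0},
            "message": "Status code was -1 and not [200]",
            "reason": "Failed",
            "status": "True",
            "type": "Failure",
        },
        {
            "message": "Running reconciliation",
            "reason": "Running",
            "status": "True",
            "type": "Running",
        },
    ]
}
failures = failed_conditions(example_status)
```

For the example status, failures holds a single entry that pairs the error message with a failure count of 1. The Running condition is skipped.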
Ansible-based Operators also allow Operator authors to supply custom status values with the k8s_status Ansible module, which is included in the operator_sdk.util collection. This allows the author to update the status from within Ansible with any key-value pair as desired.
By default, Ansible-based Operators always include the generic Ansible run output as shown above. If you would prefer your application did not update the status with Ansible output, you can track the status manually from your application.
To track CR status manually from your application, update the watches.yaml file with a manageStatus field set to false:
- version: v1
  group: api.example.com
  kind: Test1
  role: Test1
  manageStatus: false
Use the operator_sdk.util.k8s_status Ansible module to update the subresource. For example, to update with key test1 and value test2, operator_sdk.util can be used as shown:
- operator_sdk.util.k8s_status:
    api_version: api.example.com/v1
    kind: Test1
    name: "{{ meta.name }}"
    namespace: "{{ meta.namespace }}"
    status:
      test1: test2
Collections can also be declared in the meta/main.yml file for the role, which is included for newly scaffolded Ansible Operators:
collections:
  - operator_sdk.util
Declaring collections in the role meta allows you to invoke the k8s_status module directly:
k8s_status:
  <snip>
  status:
    test1: test2
For more details about user-driven status management from Ansible-based Operators, see the Ansible-based Operator Status Proposal for Operator SDK.
See Appendices to learn about the project directory structures created by the Operator SDK.
Reaching for the Stars with Ansible Operator - Red Hat OpenShift Blog