Common recommendations and suggestions

Common recommendations and suggestions for building solutions with Operator SDK

Overview

Recommendations and best practices from the Kubernetes community, such as how to develop Operator pattern solutions or how to use controller-runtime, also apply to operator projects built with operator-sdk. See also Operator Best Practices. The sections below collect some of the most common recommendations.

Common Recommendations

Develop idempotent reconciliation solutions

When developing operators, it is essential for the controller’s reconciliation loop to be idempotent. By following the Operator pattern you will create Controllers which provide a reconcile function responsible for synchronizing resources until the desired state is reached on the cluster. Breaking this recommendation goes against the design principles of controller-runtime and may lead to unforeseen consequences such as resources becoming stuck and requiring manual intervention.
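
To make the idea of idempotency concrete, here is a minimal sketch that uses only the Go standard library. It does not use controller-runtime: the Cluster and Desired types and the Reconcile function are hypothetical stand-ins for the API server, a custom resource spec, and a controller's reconcile function. The point it tries to show is that each run compares observed state with desired state and writes only when they differ, so repeated runs with the same input leave the cluster unchanged.

```go
package main

import "fmt"

// Cluster is a hypothetical in-memory stand-in for the API server.
// It maps a Deployment name to its replica count.
type Cluster struct {
	deployments map[string]int
	writes      int // number of create/update calls performed so far
}

// Desired is a hypothetical, simplified custom resource spec.
type Desired struct {
	Name     string
	Replicas int
}

// Reconcile drives the cluster toward the desired state. It reads the
// current state on every call and never relies on what a previous run did.
func Reconcile(c *Cluster, d Desired) {
	current, exists := c.deployments[d.Name]
	switch {
	case !exists:
		// Resource is missing: create it.
		c.deployments[d.Name] = d.Replicas
		c.writes++
	case current != d.Replicas:
		// Resource has drifted: update it.
		c.deployments[d.Name] = d.Replicas
		c.writes++
	}
	// Otherwise the desired state is already reached and nothing is written.
}

func main() {
	c := &Cluster{deployments: map[string]int{}}
	d := Desired{Name: "memcached", Replicas: 3}

	// Calling Reconcile several times has the same effect as calling it once.
	for i := 0; i < 3; i++ {
		Reconcile(c, d)
	}
	fmt.Println(c.deployments["memcached"], c.writes) // prints "3 1"
}
```

In a real controller the same shape applies: fetch the current objects, compare them with what the custom resource asks for, and create or update only what is missing or different.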

Understanding Kubernetes APIs

Building your own operator commonly involves extending the Kubernetes API itself. It is helpful to understand exactly how Custom Resource Definitions interact with the Kubernetes API. Also, the Kubebuilder documentation on Groups and Versions and Kinds may be helpful to better understand these concepts as they relate to operators.

Avoid a design solution where more than one Kind is reconciled by the same controller

Having many Kinds (for example, several CRDs) all managed by the same controller usually goes against the design proposed by controller-runtime. It can also undermine encapsulation, the Single Responsibility Principle, and cohesion, which may cause unexpected side effects and make the operator harder to extend, reuse, or maintain.

Ideally, Operators do not manage other Operators

From best practices:

  • “Operators should own a CRD and only one Operator should control a CRD on a cluster. Two Operators managing the same CRD is not a recommended best practice. In the case where an API exists but with multiple implementations, this is typically an example of a no-op Operator because it doesn’t have any deployment or reconciliation loop to define the shared API and other Operators depend on this Operator to provide one implementation of the API, e.g. similar to PVCs or Ingress.”

  • “An Operator shouldn’t deploy or manage other operators (such patterns are known as meta or super operators or include CRDs in its Operands). It’s the Operator Lifecycle Manager’s job to manage the deployment and lifecycle of operators. For further information check Dependency Resolution.”

What this means in practice:

  • If you want to declare that your Operator depends on APIs owned by another Operator, or on another Operator as a whole, you should use Operator Lifecycle Manager’s Dependency Resolution.
  • If you want to reconcile core APIs (defined by Kubernetes) or external APIs (defined by other operators), you should not redefine the API as owned by your project. In these cases, you can create only the controller by using the flag --resource=false (e.g. $ operator-sdk create api --group ship --version v1beta1 --kind External --resource=false --controller=true). Attention: If you are building a Go-based Operator, you will need to update the markers and imports manually until this is officially supported by the tool. For further information, check the issue #1999.

WARNING: If you create CRDs during reconciliation or through the Operands, OLM cannot handle their migration, update, or validation.

NOTE: Not following this guidance will likely undermine concepts such as the single responsibility principle, which can cause unexpected side effects and make the operator harder to extend, reuse, or maintain, to mention only a few.

Other common suggestions

  • Provide the images and tags used by the operator solution via environment variables in the config/manager/manager.yaml:
    ...
    spec:
      ...
        spec:
          ...
          containers:
          - command:
            - /manager
            ...
            env:
            - name: MY_IMAGE
              value: "quay.io/example.com/image:0.0.1"

Last modified May 19, 2022: :book: clarifies best practices and highlited the scenario (#5768) (2c39ee1a)