Allowing non-cluster administrators to install Operators

Operators can require wide privileges to run, and the required privileges can change between versions. Operator Lifecycle Manager (OLM) runs with cluster-admin privileges. By default, Operator authors can specify any set of permissions in the cluster service version (CSV), and OLM grants those permissions to the Operator.

Cluster administrators should take measures to ensure that an Operator cannot achieve cluster-scoped privileges and that users cannot escalate privileges using OLM. One method for locking this down is for cluster administrators to audit Operators before they are added to the cluster. Cluster administrators can also use service accounts to determine and constrain which actions are allowed during an Operator installation or upgrade.

By associating an Operator group with a service account that has a set of privileges granted to it, cluster administrators can set policy on Operators to ensure they operate only within predetermined boundaries using RBAC rules. The Operator is unable to do anything that is not explicitly permitted by those rules.

This self-sufficient, limited-scope installation of Operators by non-cluster administrators means that more of the Operator Framework tools can safely be made available to more users, providing a richer experience for building applications with Operators.

Understanding Operator installation policy

Using Operator Lifecycle Manager (OLM), cluster administrators can choose to specify a service account for an Operator group so that all Operators associated with the group are deployed and run against the privileges granted to the service account.

The APIService and CustomResourceDefinition resources are always created by OLM using the cluster-admin role. A service account associated with an Operator group should never be granted privileges to write these resources.

If the specified service account does not have adequate permissions for an Operator that is being installed or upgraded, useful and contextual information is added to the status of the respective resource(s) so that it is easy for the cluster administrator to troubleshoot and resolve the issue.

Any Operator tied to an Operator group that specifies a service account is confined to the permissions granted to that service account. If the Operator asks for permissions that are outside the scope of the service account, the installation fails with appropriate errors.

Installation scenarios

When determining whether an Operator can be installed or upgraded on a cluster, Operator Lifecycle Manager (OLM) considers the following scenarios:

  • A cluster administrator creates a new Operator group and specifies a service account. All Operators associated with this Operator group are installed and run against the privileges granted to the service account.

  • A cluster administrator creates a new Operator group and does not specify any service account. OKD maintains backward compatibility, so the default behavior remains and Operator installs and upgrades are permitted.

  • For existing Operator groups that do not specify a service account, the default behavior remains and Operator installs and upgrades are permitted.

  • A cluster administrator updates an existing Operator group and specifies a service account. OLM allows existing Operators to continue to run with their current privileges. When such an existing Operator goes through an upgrade, it is reinstalled and run against the privileges granted to the service account like any new Operator.

  • A service account specified by an Operator group changes by adding or removing permissions, or the existing service account is swapped with a new one. When an existing Operator goes through an upgrade, it is reinstalled and run against the privileges granted to the updated service account like any new Operator.

  • A cluster administrator removes the service account from an Operator group. The default behavior remains and Operator installs and upgrades are permitted.
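
To see which of these scenarios applies to a given Operator group, you can read its spec.serviceAccountName field. The following is a minimal sketch rather than anything OLM provides; the helper name describe_install_mode and the text it prints are illustrative assumptions:

```shell
# Hypothetical helper: report whether an Operator group scopes installs
# to a service account.
# Usage: describe_install_mode <namespace> <operator_group_name>
describe_install_mode() {
  og_namespace="$1"
  og_name="$2"
  # An empty result means that spec.serviceAccountName is not set.
  sa="$(oc get operatorgroup "$og_name" -n "$og_namespace" \
    -o jsonpath='{.spec.serviceAccountName}')" || return 1
  if [ -n "$sa" ]; then
    echo "scoped: system:serviceaccount:${og_namespace}:${sa}"
  else
    echo "default: no service account specified"
  fi
}
```

A "default" result corresponds to the scenarios in which the default behavior remains and installs and upgrades are permitted.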

Installation workflow

When an Operator group is tied to a service account and an Operator is installed or upgraded, Operator Lifecycle Manager (OLM) uses the following workflow:

  1. The given Subscription object is picked up by OLM.

  2. OLM fetches the Operator group tied to this subscription.

  3. OLM determines that the Operator group has a service account specified.

  4. OLM creates a client scoped to the service account and uses the scoped client to install the Operator. This ensures that any permission requested by the Operator is always confined to that of the service account in the Operator group.

  5. OLM creates a new service account with the set of permissions specified in the CSV and assigns it to the Operator. The Operator runs as the assigned service account.
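
Because the installation runs through a client scoped to the service account, a cluster administrator can preview what that client would be allowed to do by impersonating the service account with oc auth can-i. The following is a minimal sketch, assuming the caller is permitted to impersonate service accounts; sa_can_i is a hypothetical helper name, not an OLM or oc feature:

```shell
# Hypothetical helper: ask the API server whether a service account may
# perform an action, by impersonating it.
# Usage: sa_can_i <sa_namespace> <sa_name> <verb> <resource> [oc flags...]
sa_can_i() {
  sa_namespace="$1"
  sa_name="$2"
  shift 2
  # Remaining arguments (verb, resource, and flags such as -n) pass through.
  oc auth can-i "$@" --as="system:serviceaccount:${sa_namespace}:${sa_name}"
}
```

For example, `sa_can_i scoped scoped create deployments.apps -n scoped` checks a namespaced permission, while `sa_can_i scoped scoped create clusterroles.rbac.authorization.k8s.io` checks a cluster-scoped one. Leave out -n for cluster-scoped resources so that the check is evaluated at the cluster scope.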

Scoping Operator installations

To provide scoping rules to Operator installations and upgrades on Operator Lifecycle Manager (OLM), associate a service account with an Operator group.

Using this example, a cluster administrator can confine a set of Operators to a designated namespace.

Procedure

  1. Create a new namespace:

    $ cat <<EOF | oc create -f -
    apiVersion: v1
    kind: Namespace
    metadata:
      name: scoped
    EOF
  2. Allocate permissions that you want the Operator(s) to be confined to. This involves creating a new service account, relevant role(s), and role binding(s).

    $ cat <<EOF | oc create -f -
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: scoped
      namespace: scoped
    EOF

    The following example grants the service account permissions to do anything in the designated namespace for simplicity. In a production environment, you should create a more fine-grained set of permissions:

    $ cat <<EOF | oc create -f -
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: scoped
      namespace: scoped
    rules:
    - apiGroups: ["*"]
      resources: ["*"]
      verbs: ["*"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: scoped-bindings
      namespace: scoped
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: scoped
    subjects:
    - kind: ServiceAccount
      name: scoped
      namespace: scoped
    EOF
  3. Create an OperatorGroup object in the designated namespace. This Operator group targets the designated namespace to ensure that its tenancy is confined to it.

    In addition, Operator groups allow a user to specify a service account. Specify the service account created in the previous step:

    $ cat <<EOF | oc create -f -
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: scoped
      namespace: scoped
    spec:
      serviceAccountName: scoped
      targetNamespaces:
      - scoped
    EOF

    Any Operator installed in the designated namespace is tied to this Operator group and therefore to the service account specified.

  4. Create a Subscription object in the designated namespace to install an Operator:

    $ cat <<EOF | oc create -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: etcd
      namespace: scoped
    spec:
      channel: singlenamespace-alpha
      name: etcd
      source: <catalog_source_name> # (1)
      sourceNamespace: <catalog_source_namespace> # (2)
    EOF

    (1) Specify a catalog source that already exists in the designated namespace or one that is in the global catalog namespace.
    (2) Specify the namespace where the catalog source was created.

    Any Operator tied to this Operator group is confined to the permissions granted to the specified service account. If the Operator requests permissions that are outside the scope of the service account, the installation fails with relevant errors.
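
    After creating the subscription, one way to confirm the outcome is to list the cluster service versions in the namespace along with their phase. The following is a minimal sketch; the helper name csv_phases is hypothetical, and the tab-separated output is a formatting choice made here for readability:

```shell
# Hypothetical helper: print "<csv_name><TAB><phase>" for every
# ClusterServiceVersion in a namespace.
# Usage: csv_phases <namespace>
csv_phases() {
  oc get clusterserviceversion -n "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
}
```

    Running `csv_phases scoped` after the procedure should list the etcd Operator's CSV. A phase of Succeeded suggests that the installation completed; a Failed phase, or no CSV at all, is a cue to inspect the install plan for permission errors.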

Fine-grained permissions

Operator Lifecycle Manager (OLM) uses the service account specified in an Operator group to create or update the following resources related to the Operator being installed:

  • ClusterServiceVersion

  • Subscription

  • Secret

  • ServiceAccount

  • Service

  • ClusterRole and ClusterRoleBinding

  • Role and RoleBinding

In order to confine Operators to a designated namespace, cluster administrators can start by granting the following permissions to the service account:

The following role is a generic example and additional rules might be required based on the specific Operator.

  kind: Role
  rules:
  - apiGroups: ["operators.coreos.com"]
    resources: ["subscriptions", "clusterserviceversions"]
    verbs: ["get", "create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services", "serviceaccounts"]
    verbs: ["get", "create", "update", "patch"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["roles", "rolebindings"]
    verbs: ["get", "create", "update", "patch"]
  - apiGroups: ["apps"] # (1)
    resources: ["deployments"]
    verbs: ["list", "watch", "get", "create", "update", "patch", "delete"]
  - apiGroups: [""] # (1)
    resources: ["pods"]
    verbs: ["list", "watch", "get", "create", "update", "patch", "delete"]

  (1) Add permissions to create other resources, such as deployments and pods shown here.
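
Before installing an Operator, you might want to confirm that the service account actually holds this baseline. The following is a minimal sketch that impersonates the service account with oc auth can-i and reports anything missing. The helper name check_baseline_permissions is hypothetical, it assumes the caller is allowed to impersonate service accounts, and it covers only the first three rules of the generic role (not the deployment and pod rules, or anything Operator-specific):

```shell
# Hypothetical helper: verify that a service account holds the baseline
# OLM permissions in its own namespace. Prints one line per missing
# permission and returns non-zero if any are missing.
# Usage: check_baseline_permissions <namespace> <service_account>
check_baseline_permissions() {
  ns="$1"
  sa="$2"
  missing=0
  for resource in \
    subscriptions.operators.coreos.com \
    clusterserviceversions.operators.coreos.com \
    services \
    serviceaccounts \
    roles.rbac.authorization.k8s.io \
    rolebindings.rbac.authorization.k8s.io
  do
    for verb in get create update patch; do
      # oc auth can-i exits non-zero when the answer is "no".
      if ! oc auth can-i "$verb" "$resource" -n "$ns" \
        --as="system:serviceaccount:${ns}:${sa}" >/dev/null 2>&1; then
        echo "missing: ${verb} ${resource}"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

With the names used in the scoping procedure, `check_baseline_permissions scoped scoped` prints nothing and returns 0 when every baseline permission is granted.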

In addition, if any Operator specifies a pull secret, the following permissions must also be added:

  kind: ClusterRole # (1)
  rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
  ---
  kind: Role
  rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["create", "update", "patch"]

  (1) Required to get the secret from the OLM namespace.

Troubleshooting permission failures

If an Operator installation fails due to lack of permissions, identify the errors using the following procedure.

Procedure

  1. Review the Subscription object. Its status has an object reference installPlanRef that points to the InstallPlan object that attempted to create the necessary [Cluster]Role[Binding] object(s) for the Operator:

    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: etcd
      namespace: scoped
    status:
      installPlanRef:
        apiVersion: operators.coreos.com/v1alpha1
        kind: InstallPlan
        name: install-4plp8
        namespace: scoped
        resourceVersion: "117359"
        uid: 2c1df80e-afea-11e9-bce3-5254009c9c23
  2. Check the status of the InstallPlan object for any errors:

    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    status:
      conditions:
      - lastTransitionTime: "2019-07-26T21:13:10Z"
        lastUpdateTime: "2019-07-26T21:13:10Z"
        message: 'error creating clusterrole etcdoperator.v0.9.4-clusterwide-dsfx4: clusterroles.rbac.authorization.k8s.io
          is forbidden: User "system:serviceaccount:scoped:scoped" cannot create resource
          "clusterroles" in API group "rbac.authorization.k8s.io" at the cluster scope'
        reason: InstallComponentFailed
        status: "False"
        type: Installed
      phase: Failed

    The error message tells you:

    • The type of resource it failed to create, including the API group of the resource. In this case, it was clusterroles in the rbac.authorization.k8s.io group.

    • The name of the resource.

    • The type of error: is forbidden tells you that the user does not have enough permission to do the operation.

    • The name of the user who attempted to create or update the resource. In this case, it refers to the service account specified in the Operator group.

    • The scope of the operation: cluster scope or not.

    Add the missing permission to the service account and then retry. Operator Lifecycle Manager (OLM) does not currently provide the complete list of errors on the first try, so you might need to repeat this process several times before the installation succeeds.
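
The two lookups in this procedure, and the reading of the error message, can be sketched as shell helpers. This is a minimal illustration, not an OLM tool: the function names are hypothetical, and the sed pattern assumes the message follows the "User ... cannot ... resource ... in API group ..." wording shown in the example above.

```shell
# Hypothetical helper: print the name of the InstallPlan referenced by a Subscription.
# Usage: subscription_install_plan <namespace> <subscription_name>
subscription_install_plan() {
  oc get subscription "$2" -n "$1" -o jsonpath='{.status.installPlanRef.name}'
}

# Hypothetical helper: print the message of every condition on an InstallPlan.
# Usage: install_plan_messages <namespace> <install_plan_name>
install_plan_messages() {
  oc get installplan "$2" -n "$1" \
    -o jsonpath='{range .status.conditions[*]}{.message}{"\n"}{end}'
}

# Hypothetical helper: read messages on stdin and, for each "forbidden"
# message, print the user, verb, resource, and API group it names.
missing_permission() {
  sed -n 's/.*User "\([^"]*\)" cannot \([a-z]*\) resource "\([^"]*\)" in API group "\([^"]*\)".*/user=\1 verb=\2 resource=\3 apiGroup=\4/p'
}
```

For example, `install_plan_messages scoped "$(subscription_install_plan scoped etcd)" | missing_permission` would summarize the failure from the example as the service account lacking the create verb on clusterroles in the rbac.authorization.k8s.io group.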