Security Recommendations

This document describes some security recommendations when deploying Antrea in a cluster, and in particular a multi-tenancy cluster.

To report a vulnerability in Antrea, please refer to SECURITY.md.

For information about securing Antrea control-plane communications, refer to this document.

Protecting Your Cluster Against Privilege Escalations

Antrea Agent

Like all other K8s Network Plugins, Antrea runs an agent (the Antrea Agent) on every Node on the cluster, using a K8s DaemonSet. And just like for other K8s Network Plugins, this agent requires a specific set of permissions which grant it access to the K8s API using RBAC. These permissions are required to implement the different features offered by Antrea. If any Node in the cluster happens to become compromised (e.g., by an escaped container) and the token for the antrea-agent ServiceAccount is harvested by the attacker, some of these permissions can be leveraged to negatively affect other workloads running on the cluster. In particular, the Antrea Agent is granted the following permissions:

  • patch the pods/status resources: a successful attacker could abuse this permission to re-label Pods to facilitate confused deputy attacks against built-in controllers. For example, making a Pod match a Service selector in order to man-in-the-middle (MITM) the Service traffic, or making a Pod match a ReplicaSet selector so that the ReplicaSet controller deletes legitimate replicas.
  • patch the nodes/status resources: a successful attacker could abuse this permission to affect scheduling by modifying Node fields like labels, capacity, and conditions.

In both cases, the Antrea Agent only requires the ability to mutate the annotations field for all Pods and Nodes, but with K8s RBAC, the lowest permission level that we can grant the Antrea Agent to satisfy this requirement is the patch verb for the status subresource for Pods and Nodes (which also provides the ability to mutate labels).

To mitigate the risk presented by these permissions in case of a compromised token, we suggest that you use Gatekeeper, with the appropriate policy. We provide the following Gatekeeper policy, consisting of a ConstraintTemplate and the corresponding Constraint. When using this policy, it will no longer be possible for the antrea-agent ServiceAccount to mutate anything besides annotations for the Pods and Nodes resources.

  1. # ConstraintTemplate
  2. apiVersion: templates.gatekeeper.sh/v1
  3. kind: ConstraintTemplate
  4. metadata:
  5. name: antreaagentstatusupdates
  6. annotations:
  7. description: >-
  8. Disallows unauthorized updates to status subresource by Antrea Agent
  9. Only annotations can be mutated
  10. spec:
  11. crd:
  12. spec:
  13. names:
  14. kind: AntreaAgentStatusUpdates
  15. targets:
  16. - target: admission.k8s.gatekeeper.sh
  17. rego: |
  18. package antreaagentstatusupdates
  19. username := object.get(input.review.userInfo, "username", "")
  20. targetUsername := "system:serviceaccount:kube-system:antrea-agent"
  21. allowed_mutation(object, oldObject) {
  22. object.status == oldObject.status
  23. object.metadata.labels == oldObject.metadata.labels
  24. }
  25. violation[{"msg": msg}] {
  26. username == targetUsername
  27. input.review.operation == "UPDATE"
  28. input.review.requestSubResource == "status"
  29. not allowed_mutation(input.review.object, input.review.oldObject)
  30. msg := "Antrea Agent is not allowed to mutate this field"
  31. }
  1. # Constraint
  2. apiVersion: constraints.gatekeeper.sh/v1beta1
  3. kind: AntreaAgentStatusUpdates
  4. metadata:
  5. name: antrea-agent-status-updates
  6. spec:
  7. match:
  8. kinds:
  9. - apiGroups: [""]
  10. kinds: ["Pod", "Node"]

Please ensure that the ValidatingWebhookConfiguration for your Gatekeeper installation enables policies to be applied on the pods/status and nodes/status subresources, which may not be the case by default.

As a reference, the following ValidatingWebhookConfiguration rule will cause policies to be applied to all resources and their subresources:

  1. - apiGroups:
  2. - '*'
  3. apiVersions:
  4. - '*'
  5. operations:
  6. - CREATE
  7. - UPDATE
  8. resources:
  9. - '*/*'
  10. scope: '*'

while the following rule will cause policies to be applied to all resources, but not their subresources:

  1. - apiGroups:
  2. - '*'
  3. apiVersions:
  4. - '*'
  5. operations:
  6. - CREATE
  7. - UPDATE
  8. resources:
  9. - '*'
  10. scope: '*'

Antrea Controller

The Antrea Controller, which runs as a single-replica Deployment, enjoys higher level permissions than the Antrea Agent. We recommend for production clusters running Antrea to schedule the antrea-controller Pod on a “secure” Node, which could for example be the Node (or one of the Nodes) running the K8s control-plane.

Protecting Access to Antrea Configuration Files

Antrea relies on persisting files on each K8s Node’s filesystem, in order to minimize disruptions to network functions across Antrea Agent restarts, in particular during an upgrade. All these files are located under /var/run/antrea/. The most notable of these files is /var/run/antrea/openvswitch/conf.db, which stores the Open vSwitch database. Prior to Antrea v0.10, any user had read access to the file on the host (permissions were set to 0644). Starting with v0.10, this is no longer the case (permissions are now set to 0640). Starting with v0.13, we further remove access to the /var/run/antrea/ directory for non-root users (permissions are set to 0750).

If a malicious Pod can gain read access to this file, or, prior to Antrea v0.10, if an attacker can gain access to the host, they can potentially access sensitive information stored in the database, most notably the Pre-Shared Key (PSK) used to configure IPsec tunnels, which is stored in plaintext in the database. If a PSK is leaked, an attacker can mount a man-in-the-middle attack and intercept tunnel traffic.

If a malicious Pod can gain write access to this file, it can modify the contents of the database, and therefore impact network functions.

Administrators of multi-tenancy clusters running Antrea should take steps to restrict the access of Pods to /var/run/antrea/. One way to achieve this is to use a PodSecurityPolicy and restrict the set of allowed volumes to exclude hostPath. This guidance applies to all multi-tenancy clusters and is not specific to Antrea. To quote the K8s documentation:

There are many ways a container with unrestricted access to the host filesystem can escalate privileges, including reading data from other containers, and abusing the credentials of system services, such as Kubelet.

An alternative solution to K8s PodSecurityPolicies is to use Gatekeeper to constrain usage of the host filesystem by Pods.