Pod Security Standards

Security settings for Pods are typically applied by using security contexts. Security Contexts allow for the definition of privilege and access controls on a per-Pod basis.

Cluster-wide requirements for security contexts have previously been defined and enforced using Pod Security Policy. A Pod Security Policy is a cluster-level resource that controls security-sensitive aspects of the Pod specification.

However, numerous means of policy enforcement have arisen that augment or replace the use of PodSecurityPolicy. The intent of this page is to detail recommended Pod security profiles, decoupled from any specific instantiation.

Policy Types

There is an immediate need for base policy definitions to broadly cover the security spectrum. These should range from highly restricted to highly flexible:

  • Privileged - Unrestricted policy, providing the widest possible level of permissions. This policy allows for known privilege escalations.
  • Baseline - Minimally restrictive policy while preventing known privilege escalations. Allows the default (minimally specified) Pod configuration.
  • Restricted - Heavily restricted policy, following current Pod hardening best practices.

Policies

Privileged

The Privileged policy is purposely open and entirely unrestricted. This type of policy is typically aimed at system- and infrastructure-level workloads managed by privileged, trusted users.

The privileged policy is defined by an absence of restrictions. For allow-by-default enforcement mechanisms (such as gatekeeper), the privileged profile may be an absence of applied constraints rather than an instantiated policy. In contrast, for a deny-by-default mechanism (such as Pod Security Policy) the privileged policy should enable all controls (disable all restrictions).
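
The difference between the two kinds of mechanism can be sketched in a few lines of Python. This is an illustration only, not the behavior of Gatekeeper or Pod Security Policy themselves: the function names and the representation of constraints and policies as plain callables are assumptions made for the example.

```python
def admit_allow_by_default(pod, constraints):
    """Admit the Pod unless some constraint rejects it."""
    return all(constraint(pod) for constraint in constraints)


def admit_deny_by_default(pod, policies):
    """Admit the Pod only if at least one policy permits it."""
    return any(policy(pod) for policy in policies)


# A hypothetical Pod that needs host networking.
pod = {"spec": {"hostNetwork": True}}

# Allow-by-default: the privileged profile is simply "no constraints".
privileged_constraints = []

# Deny-by-default: the privileged profile must be an explicit policy
# that permits everything.
def allow_everything(pod):
    return True

print(admit_allow_by_default(pod, privileged_constraints))
print(admit_deny_by_default(pod, [allow_everything]))
```

In the deny-by-default case, an empty policy list admits nothing, which is why the privileged profile has to be instantiated explicitly there.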

Baseline

The Baseline policy is aimed at ease of adoption for common containerized workloads while preventing known privilege escalations. This policy is targeted at application operators and developers of non-critical applications. The following listed controls should be enforced/disallowed:

Baseline policy specification
Control: Policy

Host Namespaces: Sharing the host namespaces must be disallowed.

Restricted Fields:
spec.hostNetwork
spec.hostPID
spec.hostIPC

Allowed Values: false

Privileged Containers: Privileged Pods disable most security mechanisms and must be disallowed.

Restricted Fields:
spec.containers[].securityContext.privileged
spec.initContainers[].securityContext.privileged

Allowed Values: false, undefined/nil

Capabilities: Adding additional capabilities beyond the default set must be disallowed.

Restricted Fields:
spec.containers[].securityContext.capabilities.add
spec.initContainers[].securityContext.capabilities.add

Allowed Values: empty (or restricted to a known list)

HostPath Volumes: HostPath volumes must be forbidden.

Restricted Fields:
spec.volumes[].hostPath

Allowed Values: undefined/nil

Host Ports: HostPorts should be disallowed, or at minimum restricted to a known list.

Restricted Fields:
spec.containers[].ports[].hostPort
spec.initContainers[].ports[].hostPort

Allowed Values: 0, undefined (or restricted to a known list)

AppArmor (optional): On supported hosts, the ‘runtime/default’ AppArmor profile is applied by default. The baseline policy should prevent overriding or disabling the default AppArmor profile, or restrict overrides to an allowed set of profiles.

Restricted Fields:
metadata.annotations[‘container.apparmor.security.beta.kubernetes.io/‘]

Allowed Values: ‘runtime/default’, undefined

SELinux (optional): Setting custom SELinux options should be disallowed.

Restricted Fields:
spec.securityContext.seLinuxOptions
spec.containers[].securityContext.seLinuxOptions
spec.initContainers[].securityContext.seLinuxOptions

Allowed Values: undefined/nil

/proc Mount Type: The default /proc masks are set up to reduce attack surface, and should be required.

Restricted Fields:
spec.containers[].securityContext.procMount
spec.initContainers[].securityContext.procMount

Allowed Values: undefined/nil, ‘Default’

Sysctls: Sysctls can disable security mechanisms or affect all containers on a host, and should be disallowed except for an allowed “safe” subset. A sysctl is considered safe if it is namespaced in the container or the Pod, and it is isolated from other Pods or processes on the same Node.

Restricted Fields:
spec.securityContext.sysctls

Allowed Values:
kernel.shm_rmid_forced
net.ipv4.ip_local_port_range
net.ipv4.tcp_syncookies
net.ipv4.ping_group_range
undefined/empty
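
The baseline controls above can be read as a set of checks over a Pod manifest. The following Python sketch is illustrative only. It assumes the manifest has already been parsed into a plain dict, treats unset boolean fields as false, and omits the optional AppArmor and SELinux controls as well as the "restricted to a known list" variants. The names `baseline_violations`, `all_containers`, and `SAFE_SYSCTLS` are hypothetical, and this is not how any particular enforcement mechanism implements the policy.

```python
# Safe sysctls, copied from the "Allowed Values" list for the Sysctls control.
SAFE_SYSCTLS = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
    "net.ipv4.ping_group_range",
}


def all_containers(spec):
    """Return regular and init containers as one list."""
    return (spec.get("containers") or []) + (spec.get("initContainers") or [])


def baseline_violations(pod):
    """Return a list of messages, one per baseline control the Pod violates."""
    spec = pod.get("spec") or {}
    violations = []

    # Host Namespaces: hostNetwork, hostPID, and hostIPC must not be true.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(field):
            violations.append(f"spec.{field} must be false")

    for container in all_containers(spec):
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}

        # Privileged Containers: privileged must be false or unset.
        if sc.get("privileged"):
            violations.append(f"{name}: privileged must be false")

        # Capabilities: nothing may be added beyond the default set.
        if (sc.get("capabilities") or {}).get("add"):
            violations.append(f"{name}: capabilities.add must be empty")

        # /proc Mount Type: only unset or 'Default' is allowed.
        if sc.get("procMount") not in (None, "Default"):
            violations.append(f"{name}: procMount must be 'Default'")

        # Host Ports: hostPort must be 0 or unset.
        for port in container.get("ports") or []:
            if port.get("hostPort"):
                violations.append(f"{name}: hostPort {port['hostPort']} not allowed")

    # HostPath Volumes: no volume may define hostPath.
    for volume in spec.get("volumes") or []:
        if volume.get("hostPath") is not None:
            violations.append(f"volume {volume.get('name')}: hostPath not allowed")

    # Sysctls: only the safe subset is allowed.
    for sysctl in (spec.get("securityContext") or {}).get("sysctls") or []:
        if sysctl.get("name") not in SAFE_SYSCTLS:
            violations.append(f"sysctl {sysctl.get('name')} is not in the safe set")

    return violations
```

A minimally specified Pod, such as `{"spec": {"containers": [{"name": "app", "image": "nginx"}]}}`, yields an empty list, which matches the statement that the baseline policy allows the default Pod configuration.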

Restricted

The Restricted policy is aimed at enforcing current Pod hardening best practices, at the expense of some compatibility. It is targeted at operators and developers of security-critical applications, as well as lower-trust users. The following listed controls should be enforced/disallowed:

Restricted policy specification
Control: Policy

Everything from the baseline profile.

Volume Types: In addition to restricting HostPath volumes, the restricted profile limits usage of non-core volume types to those defined through PersistentVolumes.

Restricted Fields:
spec.volumes[].hostPath
spec.volumes[].gcePersistentDisk
spec.volumes[].awsElasticBlockStore
spec.volumes[].gitRepo
spec.volumes[].nfs
spec.volumes[].iscsi
spec.volumes[].glusterfs
spec.volumes[].rbd
spec.volumes[].flexVolume
spec.volumes[].cinder
spec.volumes[].cephFS
spec.volumes[].flocker
spec.volumes[].fc
spec.volumes[].azureFile
spec.volumes[].vsphereVolume
spec.volumes[].quobyte
spec.volumes[].azureDisk
spec.volumes[].portworxVolume
spec.volumes[].scaleIO
spec.volumes[].storageos
spec.volumes[].csi

Allowed Values: undefined/nil

Privilege Escalation: Privilege escalation (such as via set-user-ID or set-group-ID file mode) should not be allowed.

Restricted Fields:
spec.containers[].securityContext.allowPrivilegeEscalation
spec.initContainers[].securityContext.allowPrivilegeEscalation

Allowed Values: false

Running as Non-root: Containers must be required to run as non-root users.

Restricted Fields:
spec.securityContext.runAsNonRoot
spec.containers[].securityContext.runAsNonRoot
spec.initContainers[].securityContext.runAsNonRoot

Allowed Values: true

Non-root groups (optional): Containers should be forbidden from running with a root primary or supplementary GID.

Restricted Fields:
spec.securityContext.runAsGroup
spec.securityContext.supplementalGroups[]
spec.securityContext.fsGroup
spec.containers[].securityContext.runAsGroup
spec.initContainers[].securityContext.runAsGroup

Allowed Values:
non-zero
undefined / nil (except for *.runAsGroup)

Seccomp: The RuntimeDefault seccomp profile must be required, or only specific additional profiles allowed.

Restricted Fields:
spec.securityContext.seccompProfile.type
spec.containers[].securityContext.seccompProfile
spec.initContainers[].securityContext.seccompProfile

Allowed Values:
‘RuntimeDefault’
undefined / nil
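
The additional restricted controls can be sketched in the same style. The Python below is an illustration under stated assumptions, not an implementation used by any enforcement mechanism. It assumes the manifest is a parsed dict, that a container-level security context setting overrides the Pod-level one, and that the seccomp profile type for the default profile is the API value `RuntimeDefault`. It covers only the controls added by the restricted profile (the baseline controls and the optional non-root groups control are left out), and the names `restricted_violations` and `RESTRICTED_VOLUME_TYPES` are hypothetical.

```python
# Volume types listed under "Restricted Fields" for the Volume Types control.
# Assumption: the API field for CephFS is spelled in lowercase ("cephfs").
RESTRICTED_VOLUME_TYPES = {
    "hostPath", "gcePersistentDisk", "awsElasticBlockStore", "gitRepo",
    "nfs", "iscsi", "glusterfs", "rbd", "flexVolume", "cinder", "cephfs",
    "flocker", "fc", "azureFile", "vsphereVolume", "quobyte", "azureDisk",
    "portworxVolume", "scaleIO", "storageos", "csi",
}


def restricted_violations(pod):
    """Return messages for violations of the controls the restricted profile adds."""
    spec = pod.get("spec") or {}
    pod_sc = spec.get("securityContext") or {}
    violations = []

    # Volume Types: none of the restricted volume sources may be set.
    for volume in spec.get("volumes") or []:
        for vol_type in sorted(RESTRICTED_VOLUME_TYPES.intersection(volume)):
            violations.append(f"volume {volume.get('name')}: {vol_type} not allowed")

    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        sc = container.get("securityContext") or {}

        # Privilege Escalation: must be explicitly false.
        if sc.get("allowPrivilegeEscalation") is not False:
            violations.append(f"{name}: allowPrivilegeEscalation must be false")

        # Running as Non-root: effective value must be true
        # (container-level setting, falling back to the Pod-level one).
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            violations.append(f"{name}: runAsNonRoot must be true")

        # Seccomp: unset is allowed; otherwise the type must be RuntimeDefault.
        profile = sc.get("seccompProfile") or pod_sc.get("seccompProfile")
        if profile is not None and profile.get("type") != "RuntimeDefault":
            violations.append(f"{name}: seccomp profile must be RuntimeDefault")

    return violations
```

Unlike the baseline sketch, a minimally specified Pod fails here, because `allowPrivilegeEscalation` and `runAsNonRoot` must be set explicitly. That reflects the compatibility cost mentioned at the start of this section.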

Policy Instantiation

Decoupling policy definition from policy instantiation allows for a common understanding and consistent language of policies across clusters, independent of the underlying enforcement mechanism.

As mechanisms mature, they will be defined below on a per-policy basis. The methods of enforcement of individual policies are not defined here.

PodSecurityPolicy

FAQ

Why isn’t there a profile between privileged and baseline?

The three profiles defined here have a clear linear progression from most secure (restricted) to least secure (privileged), and cover a broad set of workloads. Privileges required above the baseline policy are typically very application specific, so we do not offer a standard profile in this niche. This is not to say that the privileged profile should always be used in this case, but that policies in this space need to be defined on a case-by-case basis.

SIG Auth may reconsider this position in the future, should a clear need for other profiles arise.

What’s the difference between a security policy and a security context?

Security Contexts configure Pods and Containers at runtime. Security contexts are defined as part of the Pod and container specifications in the Pod manifest, and represent parameters to the container runtime.
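
As an illustration of where security contexts live in a manifest, the sketch below represents a Pod as a Python dict with both a Pod-level and a container-level security context, and resolves the setting that applies to one container. The Pod name, image, user IDs, and the helper `effective_setting` are made up for the example. The lookup assumes that a container-level value takes precedence over the Pod-level value when both are set.

```python
# A hypothetical Pod manifest, shown as the dict a YAML parser would produce.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example"},
    "spec": {
        # Pod-level security context: applies to all containers in the Pod.
        "securityContext": {"runAsNonRoot": True, "runAsUser": 1000},
        "containers": [
            {
                "name": "app",
                "image": "registry.example/app:1.0",
                # Container-level security context: applies to this container only.
                "securityContext": {
                    "runAsUser": 2000,
                    "allowPrivilegeEscalation": False,
                },
            }
        ],
    },
}


def effective_setting(pod, container_name, field):
    """Return the container-level value of a field, else the Pod-level value."""
    spec = pod["spec"]
    pod_sc = spec.get("securityContext") or {}
    for container in spec.get("containers") or []:
        if container["name"] == container_name:
            sc = container.get("securityContext") or {}
            return sc.get(field, pod_sc.get(field))
    raise KeyError(container_name)


print(effective_setting(pod, "app", "runAsUser"))
print(effective_setting(pod, "app", "runAsNonRoot"))
```

A security policy, by contrast, would not appear in this manifest at all. It is evaluated against manifests like this one by the control plane.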

Security policies are control plane mechanisms to enforce specific settings in the Security Context, as well as other parameters outside the Security Context. As of February 2020, the current native solution for enforcing these security policies is Pod Security Policy - a mechanism for centrally enforcing security policy on Pods across a cluster. Other alternatives for enforcing security policy are being developed in the Kubernetes ecosystem, such as OPA Gatekeeper.

What profiles should I apply to my Windows Pods?

Windows in Kubernetes has some limitations and differentiators from standard Linux-based workloads. Specifically, the Pod SecurityContext fields have no effect on Windows. As such, no standardized Pod Security profiles currently exist.

What about sandboxed Pods?

There is not currently an API standard that controls whether a Pod is considered sandboxed or not. Sandbox Pods may be identified by the use of a sandboxed runtime (such as gVisor or Kata Containers), but there is no standard definition of what a sandboxed runtime is.

The protections necessary for sandboxed workloads can differ from others. For example, the need to restrict privileged permissions is lessened when the workload is isolated from the underlying kernel. This allows for workloads requiring heightened permissions to still be isolated.

Additionally, the protection of sandboxed workloads is highly dependent on the method of sandboxing. As such, no single policy is recommended for all sandboxed workloads.