Windows containers in Kubernetes
Windows applications constitute a large portion of the services and applications that run in many organizations. Windows containers provide a way to encapsulate processes and package dependencies, making it easier to use DevOps practices and follow cloud native patterns for Windows applications.
Organizations with investments in Windows-based applications and Linux-based applications don’t have to look for separate orchestrators to manage their workloads, leading to increased operational efficiencies across their deployments, regardless of operating system.
Windows nodes in Kubernetes
To enable the orchestration of Windows containers in Kubernetes, include Windows nodes in your existing Linux cluster. Scheduling Windows containers in Pods on Kubernetes is similar to scheduling Linux-based containers.
In order to run Windows containers, your Kubernetes cluster must include multiple operating systems. While you can only run the control plane on Linux, you can deploy worker nodes running either Windows or Linux.
Windows nodes are supported provided that the operating system is Windows Server 2019.
This document uses the term Windows containers to mean Windows containers with process isolation. Kubernetes does not support running Windows containers with Hyper-V isolation.
Compatibility and limitations
Some node features are only available if you use a specific container runtime; others are not available on Windows nodes, including:
- HugePages: not supported for Windows containers
- Privileged containers: not supported for Windows containers. HostProcess Containers offer similar functionality.
- TerminationGracePeriod: requires containerD
Not all features of shared namespaces are supported. See API compatibility for more details.
See Windows OS version compatibility for details on the Windows versions that Kubernetes is tested against.
From an API and kubectl perspective, Windows containers behave in much the same way as Linux-based containers. However, there are some notable differences in key functionality which are outlined in this section.
Comparison with Linux
Key Kubernetes elements work the same way in Windows as they do in Linux. This section refers to several key workload abstractions and how they map to Windows.
-
A Pod is the basic building block of Kubernetes–the smallest and simplest unit in the Kubernetes object model that you create or deploy. You may not deploy Windows and Linux containers in the same Pod. All containers in a Pod are scheduled onto a single Node where each Node represents a specific platform and architecture. The following Pod capabilities, properties and events are supported with Windows containers:
Single or multiple containers per Pod with process isolation and volume sharing
Pod
status
fieldsReadiness, liveness, and startup probes
postStart & preStop container lifecycle hooks
ConfigMap, Secrets: as environment variables or volumes
emptyDir
volumesNamed pipe host mounts
Resource limits
OS field:
The
.spec.os.name
field should be set towindows
to indicate that the current Pod uses Windows containers.Note: Starting from 1.25, the
IdentifyPodOS
feature gate is in GA stage and defaults to be enabled.If you set the
.spec.os.name
field towindows
, you must not set the following fields in the.spec
of that Pod:spec.hostPID
spec.hostIPC
spec.securityContext.seLinuxOptions
spec.securityContext.seccompProfile
spec.securityContext.fsGroup
spec.securityContext.fsGroupChangePolicy
spec.securityContext.sysctls
spec.shareProcessNamespace
spec.securityContext.runAsUser
spec.securityContext.runAsGroup
spec.securityContext.supplementalGroups
spec.containers[*].securityContext.seLinuxOptions
spec.containers[*].securityContext.seccompProfile
spec.containers[*].securityContext.capabilities
spec.containers[*].securityContext.readOnlyRootFilesystem
spec.containers[*].securityContext.privileged
spec.containers[*].securityContext.allowPrivilegeEscalation
spec.containers[*].securityContext.procMount
spec.containers[*].securityContext.runAsUser
spec.containers[*].securityContext.runAsGroup
In the above list, wildcards (
*
) indicate all elements in a list. For example,spec.containers[*].securityContext
refers to the SecurityContext object for all containers. If any of these fields is specified, the Pod will not be admitted by the API server.
Workload resources including:
- ReplicaSet
- Deployment
- StatefulSet
- DaemonSet
- Job
- CronJob
- ReplicationController
- Services See Load balancing and Services for more details.
Pods, workload resources, and Services are critical elements to managing Windows workloads on Kubernetes. However, on their own they are not enough to enable the proper lifecycle management of Windows workloads in a dynamic cloud native environment.
kubectl exec
- Pod and container metrics
- Horizontal pod autoscaling
- Resource quotas
- Scheduler preemption
Command line options for the kubelet
Some kubelet command line options behave differently on Windows, as described below:
- The
--windows-priorityclass
lets you set the scheduling priority of the kubelet process (see CPU resource management) - The
--kube-reserved
,--system-reserved
, and--eviction-hard
flags update NodeAllocatable - Eviction by using
--enforce-node-allocable
is not implemented - Eviction by using
--eviction-hard
and--eviction-soft
are not implemented - When running on a Windows node the kubelet does not have memory or CPU restrictions.
--kube-reserved
and--system-reserved
only subtract fromNodeAllocatable
and do not guarantee resource provided for workloads. See Resource Management for Windows nodes for more information. - The
MemoryPressure
Condition is not implemented - The kubelet does not take OOM eviction actions
API compatibility
There are subtle differences in the way the Kubernetes APIs work for Windows due to the OS and container runtime. Some workload properties were designed for Linux, and fail to run on Windows.
At a high level, these OS concepts are different:
- Identity - Linux uses userID (UID) and groupID (GID) which are represented as integer types. User and group names are not canonical - they are just an alias in
/etc/groups
or/etc/passwd
back to UID+GID. Windows uses a larger binary security identifier (SID) which is stored in the Windows Security Access Manager (SAM) database. This database is not shared between the host and containers, or between containers. - File permissions - Windows uses an access control list based on (SIDs), whereas POSIX systems such as Linux use a bitmask based on object permissions and UID+GID, plus optional access control lists.
- File paths - the convention on Windows is to use
\
instead of/
. The Go IO libraries typically accept both and just make it work, but when you’re setting a path or command line that’s interpreted inside a container,\
may be needed. - Signals - Windows interactive apps handle termination differently, and can implement one or more of these:
- A UI thread handles well-defined messages including
WM_CLOSE
. - Console apps handle Ctrl-C or Ctrl-break using a Control Handler.
- Services register a Service Control Handler function that can accept
SERVICE_CONTROL_STOP
control codes.
- A UI thread handles well-defined messages including
Container exit codes follow the same convention where 0 is success, and nonzero is failure. The specific error codes may differ across Windows and Linux. However, exit codes passed from the Kubernetes components (kubelet, kube-proxy) are unchanged.
Field compatibility for container specifications
The following list documents differences between how Pod container specifications work between Windows and Linux:
- Huge pages are not implemented in the Windows container runtime, and are not available. They require asserting a user privilege that’s not configurable for containers.
requests.cpu
andrequests.memory
- requests are subtracted from node available resources, so they can be used to avoid overprovisioning a node. However, they cannot be used to guarantee resources in an overprovisioned node. They should be applied to all containers as a best practice if the operator wants to avoid overprovisioning entirely.securityContext.allowPrivilegeEscalation
- not possible on Windows; none of the capabilities are hooked upsecurityContext.capabilities
- POSIX capabilities are not implemented on WindowssecurityContext.privileged
- Windows doesn’t support privileged containers, use HostProcess Containers insteadsecurityContext.procMount
- Windows doesn’t have a/proc
filesystemsecurityContext.readOnlyRootFilesystem
- not possible on Windows; write access is required for registry & system processes to run inside the containersecurityContext.runAsGroup
- not possible on Windows as there is no GID supportsecurityContext.runAsNonRoot
- this setting will prevent containers from running asContainerAdministrator
which is the closest equivalent to a root user on Windows.securityContext.runAsUser
- use runAsUserName insteadsecurityContext.seLinuxOptions
- not possible on Windows as SELinux is Linux-specificterminationMessagePath
- this has some limitations in that Windows doesn’t support mapping single files. The default value is/dev/termination-log
, which does work because it does not exist on Windows by default.
Field compatibility for Pod specifications
The following list documents differences between how Pod specifications work between Windows and Linux:
hostIPC
andhostpid
- host namespace sharing is not possible on WindowshostNetwork
- see belowdnsPolicy
- setting the PoddnsPolicy
toClusterFirstWithHostNet
is not supported on Windows because host networking is not provided. Pods always run with a container network.podSecurityContext
see belowshareProcessNamespace
- this is a beta feature, and depends on Linux namespaces which are not implemented on Windows. Windows cannot share process namespaces or the container’s root filesystem. Only the network can be shared.terminationGracePeriodSeconds
- this is not fully implemented in Docker on Windows, see the GitHub issue. The behavior today is that the ENTRYPOINT process is sent CTRL_SHUTDOWN_EVENT, then Windows waits 5 seconds by default, and finally shuts down all processes using the normal Windows shutdown behavior. The 5 second default is actually in the Windows registry inside the container, so it can be overridden when the container is built.volumeDevices
- this is a beta feature, and is not implemented on Windows. Windows cannot attach raw block devices to pods.volumes
- If you define an
emptyDir
volume, you cannot set its volume source tomemory
.
- If you define an
- You cannot enable
mountPropagation
for volume mounts as this is not supported on Windows.
Field compatibility for hostNetwork
FEATURE STATE: Kubernetes v1.26 [alpha]
The kubelet can now request that pods running on Windows nodes use the host’s network namespace instead of creating a new pod network namespace. To enable this functionality pass --feature-gates=WindowsHostNetwork=true
to the kubelet.
Note: This functionality requires a container runtime that supports this functionality.
Field compatibility for Pod security context
None of the Pod securityContext fields work on Windows.
Node problem detector
The node problem detector (see Monitor Node Health) has preliminary support for Windows. For more information, visit the project’s GitHub page.
Pause container
In a Kubernetes Pod, an infrastructure or “pause” container is first created to host the container. In Linux, the cgroups and namespaces that make up a pod need a process to maintain their continued existence; the pause process provides this. Containers that belong to the same pod, including infrastructure and worker containers, share a common network endpoint (same IPv4 and / or IPv6 address, same network port spaces). Kubernetes uses pause containers to allow for worker containers crashing or restarting without losing any of the networking configuration.
Kubernetes maintains a multi-architecture image that includes support for Windows. For Kubernetes v1.26 the recommended pause image is registry.k8s.io/pause:3.6
. The source code is available on GitHub.
Microsoft maintains a different multi-architecture image, with Linux and Windows amd64 support, that you can find as mcr.microsoft.com/oss/kubernetes/pause:3.6
. This image is built from the same source as the Kubernetes maintained image but all of the Windows binaries are authenticode signed by Microsoft. The Kubernetes project recommends using the Microsoft maintained image if you are deploying to a production or production-like environment that requires signed binaries.
Container runtimes
You need to install a container runtime into each node in the cluster so that Pods can run there.
The following container runtimes work with Windows:
Note: This section links to third party projects that provide functionality required by Kubernetes. The Kubernetes project authors aren’t responsible for these projects, which are listed alphabetically. To add a project to this list, read the content guide before submitting a change. More information.
ContainerD
FEATURE STATE: Kubernetes v1.20 [stable]
You can use ContainerD 1.4.0+ as the container runtime for Kubernetes nodes that run Windows.
Learn how to install ContainerD on a Windows node.
Note: There is a known limitation when using GMSA with containerd to access Windows network shares, which requires a kernel patch.
Mirantis Container Runtime
Mirantis Container Runtime (MCR) is available as a container runtime for all Windows Server 2019 and later versions.
See Install MCR on Windows Servers for more information.
Windows OS version compatibility
On Windows nodes, strict compatibility rules apply where the host OS version must match the container base image OS version. Only Windows containers with a container operating system of Windows Server 2019 are fully supported.
For Kubernetes v1.26, operating system compatibility for Windows nodes (and Pods) is as follows:
Windows Server LTSC release
Windows Server 2019
Windows Server 2022
Windows Server SAC release
Windows Server version 20H2
The Kubernetes version-skew policy also applies.
Getting help and troubleshooting
Your main source of help for troubleshooting your Kubernetes cluster should start with the Troubleshooting page.
Some additional, Windows-specific troubleshooting help is included in this section. Logs are an important element of troubleshooting issues in Kubernetes. Make sure to include them any time you seek troubleshooting assistance from other contributors. Follow the instructions in the SIG Windows contributing guide on gathering logs.
Reporting issues and feature requests
If you have what looks like a bug, or you would like to make a feature request, please follow the SIG Windows contributing guide to create a new issue. You should first search the list of issues in case it was reported previously and comment with your experience on the issue and add additional logs. SIG Windows channel on the Kubernetes Slack is also a great avenue to get some initial support and troubleshooting ideas prior to creating a ticket.
Deployment tools
The kubeadm tool helps you to deploy a Kubernetes cluster, providing the control plane to manage the cluster it, and nodes to run your workloads. Adding Windows nodes explains how to deploy Windows nodes to your cluster using kubeadm.
The Kubernetes cluster API project also provides means to automate deployment of Windows nodes.
Windows distribution channels
For a detailed explanation of Windows distribution channels see the Microsoft documentation.
Information on the different Windows Server servicing channels including their support models can be found at Windows Server servicing channels.