Scheduling Windows container workloads

You can schedule Windows workloads to Windows compute nodes.

Prerequisites

  • You installed the Windows Machine Config Operator (WMCO) using Operator Lifecycle Manager (OLM).

  • You are using a Windows container as the OS image.

  • You have created a Windows compute machine set.

Windows pod placement

Before deploying your Windows workloads to the cluster, you must configure your Windows node scheduling so that pods are assigned correctly. Because a machine hosts your Windows node, the node is managed in the same way as a Linux-based node. Likewise, you schedule a Windows pod to the appropriate Windows node by using the same mechanisms: taints, tolerations, and node selectors.
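
As a rough illustration of how these mechanisms fit together, the following sketch assumes a Windows node that carries an os=Windows:NoSchedule taint, which is the same key and value that the tolerations later in this document use. The node and pod names are hypothetical placeholders, and both objects are shown as fragments rather than complete manifests.

```yaml
# Fragment of a Windows Node object (hypothetical name).
# The taint keeps pods without a matching toleration off this node.
apiVersion: v1
kind: Node
metadata:
  name: <windows_node_name>
  labels:
    kubernetes.io/os: windows
spec:
  taints:
  - key: os
    value: Windows
    effect: NoSchedule
---
# Fragment of a pod that tolerates the taint and selects Windows nodes.
apiVersion: v1
kind: Pod
metadata:
  name: <windows_pod_name>
spec:
  tolerations:
  - key: os
    operator: Equal
    value: Windows
    effect: NoSchedule
  nodeSelector:
    kubernetes.io/os: windows
```

The toleration only permits scheduling onto the tainted node. The node selector is what steers the pod toward Windows nodes.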

With multiple operating systems, and the ability to run multiple Windows OS variants in the same cluster, you must map your Windows pods to a base Windows OS variant by using a RuntimeClass object. For example, if you have multiple Windows nodes running on different Windows Server container versions, the cluster could schedule your Windows pods to an incompatible Windows OS variant. You must have RuntimeClass objects configured for each Windows OS variant on your cluster. Using a RuntimeClass object is also recommended if you have only one Windows OS variant available in your cluster.

For more information, see Microsoft’s documentation on Host and container version compatibility.
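
For a cluster with two Windows OS variants, the mapping might look like the following sketch. The object names windows2019 and windows2022 are hypothetical, and the sketch assumes that both variants use the same handler and node taint as the RuntimeClass example in this document. The build numbers 10.0.17763 and 10.0.20348 correspond to Windows Server 2019 and Windows Server 2022.

```yaml
# One RuntimeClass per Windows OS variant (names are hypothetical).
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: windows2019
handler: 'runhcs-wcow-process'
scheduling:
  nodeSelector:
    kubernetes.io/os: 'windows'
    kubernetes.io/arch: 'amd64'
    node.kubernetes.io/windows-build: '10.0.17763'  # Windows Server 2019
  tolerations:
  - effect: NoSchedule
    key: os
    operator: Equal
    value: "Windows"
---
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: windows2022
handler: 'runhcs-wcow-process'
scheduling:
  nodeSelector:
    kubernetes.io/os: 'windows'
    kubernetes.io/arch: 'amd64'
    node.kubernetes.io/windows-build: '10.0.20348'  # Windows Server 2022
  tolerations:
  - effect: NoSchedule
    key: os
    operator: Equal
    value: "Windows"
```

A pod then selects its variant by setting runtimeClassName to one of these names.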

Also, it is recommended that you set the spec.os.name parameter to windows in your workload pods. The Windows Machine Config Operator (WMCO) uses this field to authoritatively identify the pod operating system for validation, and the field is used to enforce Windows-specific pod security context constraints (SCCs). Currently, this parameter has no effect on pod scheduling. For more information about this parameter, see the Kubernetes Pods documentation.
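
A minimal sketch of where this field sits in a pod specification follows. The pod name is a hypothetical placeholder, and the rest of the specification is omitted.

```yaml
# Fragment of a pod that declares its operating system.
apiVersion: v1
kind: Pod
metadata:
  name: <windows_pod_name>
spec:
  os:
    name: windows   # identifies the pod OS; does not influence scheduling
```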

The container base image must be the same Windows OS version and build number that is running on the node where the container is to be scheduled.

Also, if you upgrade the Windows nodes from one version to another, for example, from 20H2 to 2022, you must upgrade your container base image to match the new version. For more information, see Windows container version compatibility.
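
To make the pairing concrete, the following sketch pins a pod to nodes with build 10.0.17763 (Windows Server 2019) and uses the matching ltsc2019 base image. The pod and container names are hypothetical, and the fragment assumes that the nodes carry the node.kubernetes.io/windows-build label, as the RuntimeClass example in this document does.

```yaml
# Fragment of a pod whose base image matches the node build it selects.
apiVersion: v1
kind: Pod
metadata:
  name: <windows_pod_name>
spec:
  nodeSelector:
    kubernetes.io/os: windows
    node.kubernetes.io/windows-build: '10.0.17763'  # Windows Server 2019 nodes
  containers:
  - name: <container_name>
    image: mcr.microsoft.com/windows/servercore:ltsc2019  # 2019 base image
```

After a node upgrade, both the build label value and the image tag in such a fragment would need to change together.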

Creating a RuntimeClass object to encapsulate scheduling mechanisms

Using a RuntimeClass object simplifies the use of scheduling mechanisms like taints and tolerations; you deploy a runtime class that encapsulates your taints and tolerations and then apply it to your pods to schedule them to the appropriate node. Creating a runtime class is also necessary in clusters that support multiple operating system variants.

Procedure

  1. Create a RuntimeClass object YAML file. For example, runtime-class.yaml:

    apiVersion: node.k8s.io/v1
    kind: RuntimeClass
    metadata:
      name: <runtime_class_name> # (1)
    handler: 'runhcs-wcow-process'
    scheduling:
      nodeSelector: # (2)
        kubernetes.io/os: 'windows'
        kubernetes.io/arch: 'amd64'
        node.kubernetes.io/windows-build: '10.0.17763'
      tolerations: # (3)
      - effect: NoSchedule
        key: os
        operator: Equal
        value: "Windows"

    (1) Specify the RuntimeClass object name, which is referenced in the pods that you want this runtime class to manage.
    (2) Specify labels that must be present on nodes that support this runtime class. Pods using this runtime class can only be scheduled to a node matched by this selector. The node selector of the runtime class is merged with the existing node selector of the pod. Any conflicts prevent the pod from being scheduled to the node.
    (3) Specify tolerations that are appended, excluding duplicates, during admission to pods that run with this runtime class. This combines the set of nodes tolerated by the pod and by the runtime class.
  2. Create the RuntimeClass object:

    $ oc create -f <file-name>.yaml

    For example:

    $ oc create -f runtime-class.yaml
  3. Apply the RuntimeClass object to your pod to ensure it is scheduled to the appropriate operating system variant:

    apiVersion: v1
    kind: Pod
    metadata:
      name: my-windows-pod
    spec:
      runtimeClassName: <runtime_class_name> # (1)
    # ...

    (1) Specify the runtime class to manage the scheduling of your pod.

Sample Windows container workload deployment

You can deploy Windows container workloads to your cluster once you have a Windows compute node available.

This sample deployment is provided for reference only.

Example Service object

  apiVersion: v1
  kind: Service
  metadata:
    name: win-webserver
    labels:
      app: win-webserver
  spec:
    ports:
      # the port that this service should serve on
    - port: 80
      targetPort: 80
    selector:
      app: win-webserver
    type: LoadBalancer

Example Deployment object

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    labels:
      app: win-webserver
    name: win-webserver
  spec:
    selector:
      matchLabels:
        app: win-webserver
    replicas: 1
    template:
      metadata:
        labels:
          app: win-webserver
        name: win-webserver
      spec:
        tolerations:
        - key: "os"
          value: "Windows"
          effect: "NoSchedule"
        containers:
        - name: windowswebserver
          image: mcr.microsoft.com/windows/servercore:ltsc2019
          imagePullPolicy: IfNotPresent
          command:
          - powershell.exe
          - -command
          - $listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content='<html><body><H1>Red Hat OpenShift + Windows Container Workloads</H1></body></html>'; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
          securityContext:
            runAsNonRoot: false
            windowsOptions:
              runAsUserName: "ContainerAdministrator"
        nodeSelector:
          kubernetes.io/os: windows
        os:
          name: windows

When using the mcr.microsoft.com/powershell:<tag> container image, you must define the command as pwsh.exe. If you are using the mcr.microsoft.com/windows/servercore:<tag> container image, you must define the command as powershell.exe. For more information, see Microsoft’s documentation.
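
For comparison, a container entry that uses the PowerShell image might look like the following sketch. The <tag> placeholder is left unresolved on purpose, the container name is reused from the sample deployment, and the command body is a hypothetical stand-in rather than the web server script shown earlier.

```yaml
# Fragment of a pod template; only the containers list is shown.
containers:
- name: windowswebserver
  image: mcr.microsoft.com/powershell:<tag>   # choose a tag that matches the node build
  command:
  - pwsh.exe        # this image requires pwsh.exe, not powershell.exe
  - -command
  - "Write-Host 'Hello from a Windows container'; Start-Sleep -Seconds 3600"
```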

Scaling a compute machine set manually

To add or remove an instance of a machine in a compute machine set, you can manually scale the compute machine set.

This guidance is relevant to fully automated, installer-provisioned infrastructure installations. Customized, user-provisioned infrastructure installations do not have compute machine sets.

Prerequisites

  • You have installed an OKD cluster and the oc command-line interface (CLI).

  • You are logged in to oc as a user with cluster-admin permission.

Procedure

  1. View the compute machine sets that are in the cluster by running the following command:

    $ oc get machinesets -n openshift-machine-api

    The compute machine sets are listed in the form of <clusterid>-worker-<aws-region-az>.

  2. View the compute machines that are in the cluster by running the following command:

    $ oc get machine -n openshift-machine-api
  3. Set the annotation on the compute machine that you want to delete by running the following command:

    $ oc annotate machine/<machine_name> -n openshift-machine-api machine.openshift.io/delete-machine="true"
  4. Scale the compute machine set by running one of the following commands:

    $ oc scale --replicas=2 machineset <machineset> -n openshift-machine-api

    Or:

    $ oc edit machineset <machineset> -n openshift-machine-api

    You can alternatively apply the following YAML to scale the compute machine set:

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      name: <machineset>
      namespace: openshift-machine-api
    spec:
      replicas: 2

    You can scale the compute machine set up or down. It takes several minutes for the new machines to be available.

    By default, the machine controller tries to drain the node that is backed by the machine until it succeeds. In some situations, such as with a misconfigured pod disruption budget, the drain operation might not be able to succeed. If the drain operation fails, the machine controller cannot proceed with removing the machine.

    You can skip draining the node by adding the machine.openshift.io/exclude-node-draining annotation to a specific machine.
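
The following sketch illustrates both halves of this situation. The first object is a hypothetical pod disruption budget for the sample win-webserver deployment: with one replica and minAvailable set to 1, no eviction is ever permitted, so a drain of the node hosting that pod cannot complete. The second object is a fragment of a Machine showing the annotation in place. The empty annotation value is an assumption, made on the understanding that the presence of the annotation is what matters.

```yaml
# A pod disruption budget that can block a drain (hypothetical name).
# With replicas: 1 in the deployment, minAvailable: 1 allows zero disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: win-webserver-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: win-webserver
---
# Fragment of a Machine object annotated to skip node draining.
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  name: <machine_name>
  namespace: openshift-machine-api
  annotations:
    machine.openshift.io/exclude-node-draining: ""   # value assumed; presence is the signal
```

Skipping the drain means that pods on the node are not evicted gracefully, so it is best reserved for cases where the drain cannot otherwise succeed.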

Verification

  • Verify the deletion of the intended machine by running the following command:

    $ oc get machines -n openshift-machine-api