Running a custom scheduler

You can run multiple custom schedulers alongside the default scheduler and configure which scheduler to use for each pod.

Using a custom scheduler with OKD is supported, but Red Hat does not directly support the functionality of the custom scheduler itself.

For information on how to configure the default scheduler, see Controlling pod placement using the scheduler.

To schedule a given pod using a specific scheduler, specify the name of the scheduler in the schedulerName field of that pod's specification.

Deploying a custom scheduler

To include a custom scheduler in your cluster, include the image for a custom scheduler in a deployment.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

  • You have a scheduler binary.

    Information on how to create a scheduler binary is outside the scope of this document. For an example, see Configure Multiple Schedulers in the Kubernetes documentation. Note that the actual functionality of your custom scheduler is not supported by Red Hat.

  • You have created an image containing the scheduler binary and pushed it to a registry.

Procedure

  1. Create a file that contains the deployment resources for the custom scheduler:

    Example custom-scheduler.yaml file

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: custom-scheduler
      namespace: kube-system # (1)
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: custom-scheduler-as-kube-scheduler
    subjects:
    - kind: ServiceAccount
      name: custom-scheduler
      namespace: kube-system # (1)
    roleRef:
      kind: ClusterRole
      name: system:kube-scheduler
      apiGroup: rbac.authorization.k8s.io
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: custom-scheduler-as-volume-scheduler
    subjects:
    - kind: ServiceAccount
      name: custom-scheduler
      namespace: kube-system # (1)
    roleRef:
      kind: ClusterRole
      name: system:volume-scheduler
      apiGroup: rbac.authorization.k8s.io
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        component: scheduler
        tier: control-plane
      name: custom-scheduler
      namespace: kube-system # (1)
    spec:
      selector:
        matchLabels:
          component: scheduler
          tier: control-plane
      replicas: 1
      template:
        metadata:
          labels:
            component: scheduler
            tier: control-plane
            version: second
        spec:
          serviceAccountName: custom-scheduler
          containers:
          - command:
            - /usr/local/bin/kube-scheduler
            - --address=0.0.0.0
            - --leader-elect=false
            - --scheduler-name=custom-scheduler # (2)
            image: "<namespace>/<image_name>:<tag>" # (3)
            livenessProbe:
              httpGet:
                path: /healthz
                port: 10251
              initialDelaySeconds: 15
            name: kube-second-scheduler
            readinessProbe:
              httpGet:
                path: /healthz
                port: 10251
            resources:
              requests:
                cpu: '0.1'
            securityContext:
              privileged: false
            volumeMounts: []
          hostNetwork: false
          hostPID: false
          volumes: []
    (1) This procedure uses the kube-system namespace, but you can use the namespace of your choosing.
    (2) The command for your custom scheduler might require different arguments. For example, you can pass configuration as a mounted volume using the --config argument.
    (3) Specify the container image that you created for the custom scheduler.
  2. Create the deployment resources in the cluster:

    $ oc create -f custom-scheduler.yaml

Verification

  • Verify that the scheduler pod is running:

    $ oc get pods -n kube-system

    The custom scheduler pod is listed as Running:

    NAME                                READY   STATUS    RESTARTS   AGE
    custom-scheduler-6cd7c4b8bc-854zb   1/1     Running   0          2m
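
Callout (2) in the procedure above mentions passing configuration to the scheduler as a mounted volume with the --config argument. The following is a minimal sketch of one way to do that, not a supported or definitive configuration. The ConfigMap name, the file name, the mount path, and the volume name are hypothetical, and the kubescheduler.config.k8s.io API version that applies depends on the kube-scheduler version your binary is built from.

```yaml
# Hypothetical ConfigMap holding the scheduler configuration file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-scheduler-config        # assumed name
  namespace: kube-system
data:
  scheduler-config.yaml: |             # assumed file name
    apiVersion: kubescheduler.config.k8s.io/v1   # version depends on your scheduler build
    kind: KubeSchedulerConfiguration
    leaderElection:
      leaderElect: false
    profiles:
    - schedulerName: custom-scheduler
---
# Fragment of the Deployment pod template (not a complete resource):
# mount the ConfigMap and point the scheduler at the file.
spec:
  containers:
  - command:
    - /usr/local/bin/kube-scheduler
    - --config=/etc/custom-scheduler/scheduler-config.yaml
    volumeMounts:
    - name: config-volume              # assumed volume name
      mountPath: /etc/custom-scheduler # assumed mount path
  volumes:
  - name: config-volume
    configMap:
      name: custom-scheduler-config
```

In this sketch the scheduler name and the leader election setting move from command-line arguments into the configuration file, so they are not repeated as flags.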

Deploying pods using a custom scheduler

After the custom scheduler is deployed in your cluster, you can configure pods to use that scheduler instead of the default scheduler.

Each scheduler has a separate view of resources in a cluster. For that reason, each scheduler should operate over its own set of nodes.

If two or more schedulers operate on the same node, they might interfere with each other and schedule more pods on that node than its available resources allow. In that case, pods might be rejected due to insufficient resources.
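
This document does not prescribe how to give each scheduler its own set of nodes. One possible approach, sketched here under assumptions, is to label the nodes reserved for the custom scheduler and add a matching nodeSelector to every pod that uses it. The label key and value (scheduler: custom) and the pod name are hypothetical, and the sketch assumes that the custom scheduler honors nodeSelector the way a scheduler built from kube-scheduler does.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: custom-scheduler-dedicated-example   # hypothetical name
spec:
  schedulerName: custom-scheduler
  nodeSelector:
    scheduler: custom        # hypothetical label applied beforehand to the dedicated nodes
  containers:
  - name: hello
    image: docker.io/ocpqe/hello-pod
```

A node selector only restricts the pods that carry it. Keeping pods handled by the default scheduler off the same nodes needs a separate mechanism, such as taints on those nodes, which is outside this sketch.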

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

  • The custom scheduler has been deployed in the cluster.

Procedure

  1. If your cluster uses role-based access control (RBAC), add the custom scheduler name to the system:kube-scheduler cluster role.

    1. Edit the system:kube-scheduler cluster role:

      $ oc edit clusterrole system:kube-scheduler
    2. Add the name of the custom scheduler to the resourceNames lists for the leases and endpoints resources:

      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        annotations:
          rbac.authorization.kubernetes.io/autoupdate: "true"
        creationTimestamp: "2021-07-07T10:19:14Z"
        labels:
          kubernetes.io/bootstrapping: rbac-defaults
        name: system:kube-scheduler
        resourceVersion: "125"
        uid: 53896c70-b332-420a-b2a4-f72c822313f2
      rules:
      ...
      - apiGroups:
        - coordination.k8s.io
        resources:
        - leases
        verbs:
        - create
      - apiGroups:
        - coordination.k8s.io
        resourceNames:
        - kube-scheduler
        - custom-scheduler # (1)
        resources:
        - leases
        verbs:
        - get
        - update
      - apiGroups:
        - ""
        resources:
        - endpoints
        verbs:
        - create
      - apiGroups:
        - ""
        resourceNames:
        - kube-scheduler
        - custom-scheduler # (1)
        resources:
        - endpoints
        verbs:
        - get
        - update
      ...
      (1) This example uses custom-scheduler as the custom scheduler name.
  2. Create a Pod configuration and specify the name of the custom scheduler in the schedulerName parameter:

    Example custom-scheduler-example.yaml file

    apiVersion: v1
    kind: Pod
    metadata:
      name: custom-scheduler-example
      labels:
        name: custom-scheduler-example
    spec:
      schedulerName: custom-scheduler # (1)
      containers:
      - name: pod-with-second-annotation-container
        image: docker.io/ocpqe/hello-pod
    (1) The name of the custom scheduler to use, which is custom-scheduler in this example. When no scheduler name is supplied, the pod is automatically scheduled using the default scheduler.
  3. Create the pod:

    $ oc create -f custom-scheduler-example.yaml

Verification

  1. Enter the following command to check that the pod was created:

    $ oc get pod custom-scheduler-example

    The custom-scheduler-example pod is listed in the output:

    NAME                       READY   STATUS    RESTARTS   AGE
    custom-scheduler-example   1/1     Running   0          4m
  2. Enter the following command to check that the custom scheduler has scheduled the pod:

    $ oc describe pod custom-scheduler-example

    The scheduler, custom-scheduler, is listed as shown in the following truncated output:

    Events:
      Type    Reason     Age        From              Message
      ----    ------     ----       ----              -------
      Normal  Scheduled  <unknown>  custom-scheduler  Successfully assigned default/custom-scheduler-example to <node_name>
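
The procedure above sets schedulerName on a standalone pod. For pods created by a controller such as a Deployment, the same field belongs in the pod template, at spec.template.spec.schedulerName. The following is a minimal sketch, with a hypothetical resource name and labels, that reuses the image from the earlier example.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: custom-scheduler-deployment-example   # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: custom-scheduler-deployment-example
  template:
    metadata:
      labels:
        app: custom-scheduler-deployment-example
    spec:
      schedulerName: custom-scheduler   # every pod from this template uses the custom scheduler
      containers:
      - name: hello
        image: docker.io/ocpqe/hello-pod
```

Each pod created from this template can be checked with the same oc describe pod verification step shown above.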

Additional resources