Setting a node to maintenance mode

Place a node into maintenance from the web console, from the CLI, or by using a NodeMaintenance custom resource.

Setting a node to maintenance mode in the web console

Set a node to maintenance mode using the Options menu kebab found on each node in the Compute → Nodes list, or using the Actions control of the Node Details screen.

Procedure

  1. In the OKD Virtualization console, click Compute → Nodes.

  2. Set the node to maintenance from either of the following screens. The Nodes list makes it easier to act on multiple nodes in one place, while the Node Details screen shows comprehensive details of the selected node:

    • Click the Options menu kebab at the end of the node and select Start Maintenance.

    • Click the node name to open the Node Details screen and click Actions → Start Maintenance.

  3. Click Start Maintenance in the confirmation window.

The node live migrates virtual machine instances that have the LiveMigrate eviction strategy, and the node is no longer schedulable. All other pods and virtual machines on the node are deleted and recreated on another node.

Setting a node to maintenance mode in the CLI

Set a node to maintenance mode by marking it as unschedulable and using the oc adm drain command to evict or delete pods from the node.

Procedure

  1. Mark the node as unschedulable. The node status changes to Ready,SchedulingDisabled.

    $ oc adm cordon <node1>
  2. Drain the node in preparation for maintenance. The node live migrates virtual machine instances that have the LiveMigratable condition set to True and the spec:evictionStrategy field set to LiveMigrate. All other pods and virtual machines on the node are deleted and recreated on another node.

    $ oc adm drain <node1> --delete-local-data --ignore-daemonsets=true --force

    • The --delete-local-data flag removes any virtual machine instances on the node that use emptyDir volumes. Data in these volumes is ephemeral and can safely be deleted after termination.

    • The --ignore-daemonsets=true flag ensures that daemon sets are ignored and pod eviction can continue successfully.

    • The --force flag is required to delete pods that are not managed by a replica set or daemon set controller.
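
The two commands in this procedure can be wrapped in a small shell function so that the drain runs only if the cordon succeeds. This is a minimal sketch rather than a supported tool: the function name start_maintenance is hypothetical, and the flags are copied unchanged from the steps above.

```shell
# Hypothetical helper (not part of oc): cordon a node, then drain it
# with the same flags used in the procedure above.
start_maintenance() {
  node="$1"
  if [ -z "$node" ]; then
    echo "usage: start_maintenance <node>" >&2
    return 1
  fi
  # Step 1: mark the node as unschedulable; stop here if this fails.
  oc adm cordon "$node" || return 1
  # Step 2: evict or delete the pods on the node.
  oc adm drain "$node" --delete-local-data --ignore-daemonsets=true --force
}

# Example call (requires a logged-in oc session):
#   start_maintenance node-1.example.com
```

Checking the cordon result first avoids starting a drain on a node name that does not exist or that the current user cannot modify.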

Setting a node to maintenance mode with a NodeMaintenance custom resource

You can put a node into maintenance mode with a NodeMaintenance custom resource (CR). When you apply a NodeMaintenance CR, all allowed pods are evicted and the node is shut down. Evicted pods are queued to be moved to another node in the cluster.

Prerequisites

  • Install the OKD CLI oc.

  • Log in to the cluster as a user with cluster-admin privileges.

Procedure

  1. Create the following node maintenance CR, and save the file as nodemaintenance-cr.yaml:

    apiVersion: nodemaintenance.kubevirt.io/v1beta1
    kind: NodeMaintenance
    metadata:
      name: maintenance-example # (1)
    spec:
      nodeName: node-1.example.com # (2)
      reason: "Node maintenance" # (3)

    (1) Node maintenance CR name
    (2) The name of the node to be put into maintenance mode
    (3) Plain text description of the reason for maintenance

  2. Apply the node maintenance CR by running the following command:

    $ oc apply -f nodemaintenance-cr.yaml

  3. Check the progress of the maintenance task by running the following command, replacing <node-name> with the name of your node:

    $ oc describe node <node-name>

    Example output

    Events:
      Type    Reason              Age   From     Message
      ----    ------              ----  ----     -------
      Normal  NodeNotSchedulable  61m   kubelet  Node node-1.example.com status is now: NodeNotSchedulable
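
If you place nodes into maintenance often, the CR from step 1 can be generated from a node name instead of edited by hand. The following is a sketch under assumptions: the function render_nodemaintenance and its argument order are hypothetical, while the fields mirror the example CR in this procedure.

```shell
# Hypothetical helper: print a NodeMaintenance CR to standard output.
# Arguments: <cr-name> <node-name> [reason]
render_nodemaintenance() {
  cr_name="$1"
  node_name="$2"
  # Fall back to the reason used in the example CR when none is given.
  reason="${3:-Node maintenance}"
  cat <<EOF
apiVersion: nodemaintenance.kubevirt.io/v1beta1
kind: NodeMaintenance
metadata:
  name: ${cr_name}
spec:
  nodeName: ${node_name}
  reason: "${reason}"
EOF
}

# Example calls (the second requires a logged-in oc session):
#   render_nodemaintenance maintenance-example node-1.example.com > nodemaintenance-cr.yaml
#   render_nodemaintenance maintenance-example node-1.example.com | oc apply -f -
```

Printing the manifest rather than applying it inside the function keeps the step reviewable: the output can be inspected or saved before it is passed to oc apply, which reads from standard input when given -f -.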

Checking status of current NodeMaintenance CR tasks

You can check the status of current NodeMaintenance CR tasks.

Prerequisites

  • Install the OKD CLI oc.

  • Log in as a user with cluster-admin privileges.

Procedure

  • Check the status of current node maintenance tasks by running the following command:

    $ oc get NodeMaintenance -o yaml

    Example output

    apiVersion: v1
    items:
    - apiVersion: nodemaintenance.kubevirt.io/v1beta1
      kind: NodeMaintenance
      metadata:
    ...
      spec:
        nodeName: node-1.example.com
        reason: Node maintenance
      status:
        evictionPods: 3 # (1)
        pendingPods:
        - pod-example-workload-0
        - httpd
        - httpd-manual
        phase: Running
        lastError: "Last failure message" # (2)
        totalpods: 5
    ...

    (1) evictionPods is the number of pods scheduled for eviction.
    (2) lastError records the latest eviction error, if any.
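
To read individual status fields without scanning the full YAML, a JSONPath query can be used with oc get. The sketch below assumes a CR named maintenance-example, as in the earlier example; the function name maintenance_status is hypothetical, and the field names are taken from the example output above.

```shell
# Hypothetical helper: print the phase and pending pods of one
# NodeMaintenance CR, using field names from the example output above.
maintenance_status() {
  cr_name="$1"
  phase=$(oc get NodeMaintenance "$cr_name" -o jsonpath='{.status.phase}') || return 1
  pending=$(oc get NodeMaintenance "$cr_name" -o jsonpath='{.status.pendingPods[*]}') || return 1
  echo "phase: ${phase}"
  # Print "none" when the pendingPods list is empty or absent.
  echo "pending pods: ${pending:-none}"
}

# Example call (requires a logged-in oc session):
#   maintenance_status maintenance-example
```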
