Advanced Scheduling and Node Affinity
Overview
Node affinity is a set of rules used by the scheduler to determine where a pod can be placed. The rules are defined using custom labels on nodes and label selectors specified in pods. Node affinity allows a pod to specify an affinity (or anti-affinity) towards a group of nodes it can be placed on. The node does not have control over the placement.
For example, you could configure a pod to only run on a node with a specific CPU or in a specific availability zone.
There are two types of node affinity rules: required and preferred.
Required rules must be met before a pod can be scheduled on a node. Preferred rules specify that, if the rule is met, the scheduler tries to enforce the rules, but does not guarantee enforcement.
If labels on a node change at runtime that results in an node affinity rule on a pod no longer being met, the pod continues to run on the node. |
Configuring Node Affinity
You configure node affinity through the pod specification file. You can specify a required rule, a preferred rule, or both. If you specify both, the node must first meet the required rule, then attempts to meet the preferred rule.
The following example is a pod specification with a rule that requires the pod be placed on a node with a label whose key is e2e-az-NorthSouth
and whose value is either e2e-az-North
or e2e-az-South
:
Sample pod configuration file with a node affinity required rule
apiVersion: v1
kind: Pod
metadata:
name: with-node-affinity
spec:
affinity:
nodeAffinity: (1)
requiredDuringSchedulingIgnoredDuringExecution: (2)
nodeSelectorTerms:
- matchExpressions:
- key: e2e-az-NorthSouth (3)
operator: In (4)
values:
- e2e-az-North (3)
- e2e-az-South (3)
containers:
- name: with-node-affinity
image: docker.io/ocpqe/hello-pod
1 | The stanza to configure node affinity. |
2 | Defines a required rule. |
3 | The key/value pair (label) that must be matched to apply the rule. |
4 | The operator represents the relationship between the label on the node and the set of values in the matchExpression parameters in the pod specification. This value can be In , NotIn , Exists , or DoesNotExist , Lt , or Gt . |
The following example is a node specification with a preferred rule that a node with a label whose key is e2e-az-EastWest
and whose value is either e2e-az-East
or e2e-az-West
is preferred for the pod:
Sample pod configuration file with a node affinity preferred rule
apiVersion: v1
kind: Pod
metadata:
name: with-node-affinity
spec:
affinity:
nodeAffinity: (1)
preferredDuringSchedulingIgnoredDuringExecution: (2)
- weight: 1 (3)
preference:
matchExpressions:
- key: e2e-az-EastWest (4)
operator: In (5)
values:
- e2e-az-East (4)
- e2e-az-West (4)
containers:
- name: with-node-affinity
image: docker.io/ocpqe/hello-pod
1 | The stanza to configure node affinity. |
2 | Defines a preferred rule. |
3 | Specifies a weight for a preferred rule. The node with highest weight is preferred. |
4 | The key/value pair (label) that must be matched to apply the rule. |
5 | The operator represents the relationship between the label on the node and the set of values in the matchExpression parameters in the pod specification. This value can be In , NotIn , Exists , or DoesNotExist , Lt , or Gt . |
There is no explicit node anti-affinity concept, but using the NotIn
or DoesNotExist
operator replicates that behavior.
If you are using node affinity and node selectors in the same pod configuration, note the following:
|
Configuring a Required Node Affinity Rule
Required rules must be met before a pod can be scheduled on a node.
The following steps demonstrate a simple configuration that creates a node and a pod that the scheduler is required to place on the node.
Add a label to a node by editing the node configuration or by using the
oc label node
command:$ oc label node node1 e2e-az-name=e2e-az1
To modify a node in your cluster, update the node configuration maps as needed. Do not manually edit the
node-config.yaml
file.In the pod specification, use the
nodeAffinity
stanza to configure therequiredDuringSchedulingIgnoredDuringExecution
parameter:Specify the key and values that must be met. If you want the new pod to be scheduled on the node you edited, use the same
key
andvalue
parameters as the label in the node.Specify an
operator
. The operator can beIn
,NotIn
,Exists
,DoesNotExist
,Lt
, orGt
. For example, use the operatorIn
to require the label to be in the node:spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: e2e-az-name
operator: In
values:
- e2e-az1
- e2e-az2
Create the pod:
$ oc create -f e2e-az2.yaml
Configuring a Preferred Node Affinity Rule
Preferred rules specify that, if the rule is met, the scheduler tries to enforce the rules, but does not guarantee enforcement.
The following steps demonstrate a simple configuration that creates a node and a pod that the scheduler tries to place on the node.
Add a label to a node by editing the node configuration or by executing the
oc label node
command:$ oc label node node1 e2e-az-name=e2e-az3
To modify a node in your cluster, update the node configuration maps as needed. Do not manually edit the
node-config.yaml
file.In the pod specification, use the
nodeAffinity
stanza to configure thepreferredDuringSchedulingIgnoredDuringExecution
parameter:Specify a weight for the node, as a number 1-100. The node with highest weight is preferred.
Specify the key and values that must be met. If you want the new pod to be scheduled on the node you edited, use the same
key
andvalue
parameters as the label in the node:preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: e2e-az-name
operator: In
values:
- e2e-az3
Specify an
operator
. The operator can beIn
,NotIn
,Exists
,DoesNotExist
,Lt
, orGt
. For example, use the operatorIn
to require the label to be in the node.Create the pod.
$ oc create -f e2e-az3.yaml
Examples
The following examples demonstrate node affinity.
Node Affinity with Matching Labels
The following example demonstrates node affinity for a node and pod with matching labels:
The Node1 node has the label
zone:us
:$ oc label node node1 zone=us
The pod pod-s1 has the
zone
andus
key/value pair under a required node affinity rule:$ cat pod-s1.yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-s1
spec:
containers:
- image: "docker.io/ocpqe/hello-pod"
name: hello-pod
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: "zone"
operator: In
values:
- us
Create the pod using the standard command:
$ oc create -f pod-s1.yaml
pod "pod-s1" created
The pod pod-s1 can be scheduled on Node1:
oc get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
pod-s1 1/1 Running 0 4m IP1 node1
Node Affinity with No Matching Labels
The following example demonstrates node affinity for a node and pod without matching labels:
The Node1 node has the label
zone:emea
:$ oc label node node1 zone=emea
The pod pod-s1 has the
zone
andus
key/value pair under a required node affinity rule:$ cat pod-s1.yaml
apiVersion: v1
kind: Pod
metadata:
name: pod-s1
spec:
containers:
- image: "docker.io/ocpqe/hello-pod"
name: hello-pod
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: "zone"
operator: In
values:
- us
The pod pod-s1 cannot be scheduled on Node1:
oc describe pod pod-s1
<---snip--->
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason
--------- -------- ----- ---- ------------- -------- ------
1m 33s 8 default-scheduler Warning FailedScheduling No nodes are available that match all of the following predicates:: MatchNodeSelector (1).