Workload Rebalancer

Objectives

In general, once the replicas of a workload are scheduled, the scheduling result stays inert and the replica distribution does not change. If, in some special scenario, you want to actively trigger a fresh rescheduling, you can achieve it with the Workload Rebalancer.

This section guides you through using the Workload Rebalancer to trigger a rescheduling.

Prerequisites

Karmada has been installed with multiple member clusters.

Run the command:

  git clone https://github.com/karmada-io/karmada
  cd karmada
  hack/local-up-karmada.sh
  export KUBECONFIG=~/.kube/karmada.config:~/.kube/members.config

Note:

Before starting this guide, you need at least three Kubernetes clusters: one for the Karmada control plane and the other two as member clusters. For convenience, the hack/local-up-karmada.sh script is used to quickly prepare these clusters.

After the above commands finish, you will have a Karmada control plane installed with multiple member clusters.
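
You can optionally verify that the member clusters have registered successfully before continuing. The snippet below is a quick sanity check, assuming the karmada-apiserver context created by hack/local-up-karmada.sh:

  # List the member clusters known to the Karmada control plane;
  # each cluster should report READY=True before you proceed.
  kubectl --context karmada-apiserver get clusters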

Tutorial

Step 1: create a Deployment

First, prepare a Deployment named foo. Create a new file named deployment.yaml with the following content:

deployment.yaml

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: foo
    labels:
      app: test
  spec:
    replicas: 3
    selector:
      matchLabels:
        app: foo
    template:
      metadata:
        labels:
          app: foo
      spec:
        terminationGracePeriodSeconds: 0
        containers:
          - image: nginx
            name: foo
            resources:
              limits:
                cpu: 10m
                memory: 10Mi
  ---
  apiVersion: policy.karmada.io/v1alpha1
  kind: PropagationPolicy
  metadata:
    name: default-pp
  spec:
    placement:
      clusterTolerations:
        - effect: NoExecute
          key: workload-rebalancer-test
          operator: Exists
          tolerationSeconds: 0
      clusterAffinity:
        clusterNames:
          - member1
          - member2
      replicaScheduling:
        replicaDivisionPreference: Weighted
        replicaSchedulingType: Divided
        weightPreference:
          dynamicWeight: AvailableReplicas
    resourceSelectors:
      - apiVersion: apps/v1
        kind: Deployment
        name: foo
        namespace: default

Then run the following command to create those resources:

  kubectl --context karmada-apiserver apply -f deployment.yaml

You can check whether this step succeeded like this:

  $ karmadactl --karmada-context karmada-apiserver get deploy foo
  NAME   CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE   ADOPTION
  foo    member1   2/2     2            2           20s   Y
  foo    member2   1/1     1            1           20s   Y

Thus, 2 replicas were propagated to the member1 cluster and 1 replica to the member2 cluster.
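
If you are curious how this distribution was decided, you can also inspect the ResourceBinding that Karmada derives for the Deployment. This is optional; the default/foo-deployment name follows the <name>-<kind> naming convention and also appears in the events shown in Step 5:

  # Show which clusters (and how many replicas) the scheduler assigned to deployment/foo.
  kubectl --context karmada-apiserver get resourcebinding foo-deployment -n default \
    -o jsonpath='{.spec.clusters}'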

Step 2: add a NoExecute taint to the member1 cluster to mock a cluster failover

  • Run the following command to add a NoExecute taint to the member1 cluster:
    $ karmadactl --karmada-context=karmada-apiserver taint clusters member1 workload-rebalancer-test:NoExecute
    cluster/member1 tainted

A rescheduling is then triggered by the cluster failover, and all replicas are propagated to the member2 cluster, as you can see:

  $ karmadactl --karmada-context karmada-apiserver get deploy foo
  NAME   CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE   ADOPTION
  foo    member2   3/3     3            3           57s   Y
  • Run the following command to remove the above NoExecute taint from the member1 cluster:
    $ karmadactl --karmada-context=karmada-apiserver taint clusters member1 workload-rebalancer-test:NoExecute-
    cluster/member1 untainted

Removing the taint does not change the replica propagation, because the scheduling result stays inert: all replicas remain in the member2 cluster.
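
If you want to double-check that the taint is gone before moving on, you can read it back from the Cluster object. This is a small sketch assuming the Cluster API exposes taints under spec.taints; empty output means no taints remain:

  # Print the taints currently set on the member1 cluster; expect empty output here.
  kubectl --context karmada-apiserver get cluster member1 -o jsonpath='{.spec.taints}'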

Step 3: apply a WorkloadRebalancer to trigger rescheduling

In order to trigger rescheduling of the above resources, create a new file named workload-rebalancer.yaml with the following content:

  apiVersion: apps.karmada.io/v1alpha1
  kind: WorkloadRebalancer
  metadata:
    name: demo
  spec:
    workloads:
      - apiVersion: apps/v1
        kind: Deployment
        name: foo
        namespace: default

Then run the following command to apply it:

  kubectl --context karmada-apiserver apply -f workload-rebalancer.yaml

You will get a workloadrebalancer.apps.karmada.io/demo created result, which means the resource was created successfully.

Step 4: check the status of WorkloadRebalancer

Run the following command:

  $ kubectl --context karmada-apiserver get workloadrebalancer demo -o yaml
  apiVersion: apps.karmada.io/v1alpha1
  kind: WorkloadRebalancer
  metadata:
    creationTimestamp: "2024-05-25T09:49:51Z"
    generation: 1
    name: demo
  spec:
    workloads:
      - apiVersion: apps/v1
        kind: Deployment
        name: foo
        namespace: default
  status:
    finishTime: "2024-05-25T09:49:51Z"
    observedGeneration: 1
    observedWorkloads:
      - result: Successful
        workload:
          apiVersion: apps/v1
          kind: Deployment
          name: foo
          namespace: default
Thus, you can observe the rescheduling result in the status.observedWorkloads field of workloadrebalancer/demo. As you can see, deployment/foo was rescheduled successfully.
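
If you want to check this result from a script rather than reading the full YAML, a jsonpath query over the same status fields works; this is just an illustrative one-liner:

  # Print one line per observed workload, e.g. "Deployment/foo: Successful".
  kubectl --context karmada-apiserver get workloadrebalancer demo \
    -o jsonpath='{range .status.observedWorkloads[*]}{.workload.kind}/{.workload.name}: {.result}{"\n"}{end}'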

Step 5: Observe the real effect of WorkloadRebalancer

You can observe the actual replica propagation status of deployment/foo:

  $ karmadactl --karmada-context karmada-apiserver get deploy foo
  NAME   CLUSTER   READY   UP-TO-DATE   AVAILABLE   AGE     ADOPTION
  foo    member1   2/2     2            2           3m14s   Y
  foo    member2   1/1     1            1           4m37s   Y

As you can see, rescheduling happened: 2 replicas migrated back to the member1 cluster, while the 1 replica in the member2 cluster stayed unchanged.

Besides, you can observe a scheduling event emitted by the default-scheduler, such as:

  $ kubectl --context karmada-apiserver describe deployment foo
  ...
  Events:
    Type    Reason                  Age                     From                                Message
    ----    ------                  ----                    ----                                -------
    ...
    Normal  ScheduleBindingSucceed  3m34s (x2 over 4m57s)   default-scheduler                   Binding has been scheduled successfully. Result: {member1:2, member2:1}
    Normal  AggregateStatusSucceed  3m20s (x20 over 4m57s)  resource-binding-status-controller  Update resourceBinding(default/foo-deployment) with AggregatedStatus successfully.
    ...

Step 6: Update and Auto-clean WorkloadRebalancer

Assuming you want the WorkloadRebalancer resource to be cleaned up automatically in the future, you can edit it and set the spec.ttlSecondsAfterFinished field to 300, like this:

  apiVersion: apps.karmada.io/v1alpha1
  kind: WorkloadRebalancer
  metadata:
    name: demo
  spec:
    ttlSecondsAfterFinished: 300
    workloads:
      - apiVersion: apps/v1
        kind: Deployment
        name: foo
        namespace: default

After you apply this modification, the WorkloadRebalancer resource will be automatically deleted 300 seconds after it finishes.
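
If you prefer not to edit and re-apply the YAML file, the same change can be made in place with a merge patch; this is an equivalent alternative, not an extra step:

  # Set spec.ttlSecondsAfterFinished on the existing WorkloadRebalancer directly.
  kubectl --context karmada-apiserver patch workloadrebalancer demo \
    --type=merge -p '{"spec":{"ttlSecondsAfterFinished":300}}'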