PodUnavailableBudget

FEATURE STATE: Kruise v0.10.0

Kubernetes offers Pod Disruption Budget (PDB) to help you run highly available applications even when you introduce frequent voluntary disruptions. A PDB limits the number of Pods of a replicated application that are down simultaneously from voluntary disruptions. However, it can only constrain voluntary disruptions triggered by the Eviction API. For example, when you run kubectl drain, the tool tries to evict all of the Pods on the Node you’re taking out of service.

In the following voluntary disruption scenarios, business disruption or SLA degradation can still occur even with a PDB in place:

  1. The application owner updates a Deployment’s Pod template for a routine upgrade while the cluster administrator drains nodes to scale the cluster down (learn about Cluster Autoscaling).
  2. The middleware team uses SidecarSet to roll out an upgrade of the cluster's sidecar containers, e.g. the ServiceMesh envoy, while HPA triggers a scale-down of business applications.
  3. The application owner and the middleware team release the same Pods at the same time through OpenKruise CloneSet and SidecarSet in-place upgrades.

In these voluntary disruption scenarios, PodUnavailableBudget prevents application disruption or SLA degradation by limiting how many Pods of the application may be unavailable at the same time, which improves the availability of application services.

API Definition

    apiVersion: policy.kruise.io/v1alpha1
    kind: PodUnavailableBudget
    metadata:
      name: web-server-pub
      namespace: web
    spec:
      targetRef:
        apiVersion: apps.kruise.io/v1alpha1
        # CloneSet, Deployment, StatefulSet, etc.
        kind: CloneSet
        name: web-server
      # selector is a label query over the Pods managed by the budget.
      # selector and targetRef are mutually exclusive; targetRef takes precedence.
      # selector is commonly used when an application is deployed using multiple workloads,
      # and targetRef is used to protect a single workload.
      # selector:
      #   matchLabels:
      #     app: web-server
      # Maximum number of unavailable Pods for the current CloneSet; in this example cloneset.replicas(5) * 60% = 3.
      # maxUnavailable and minAvailable are mutually exclusive; maxUnavailable takes precedence.
      maxUnavailable: 60%
      # Minimum number of available Pods for the current CloneSet; in this example cloneset.replicas(5) * 40% = 2.
      # minAvailable: 40%
    ---
    apiVersion: apps.kruise.io/v1alpha1
    kind: CloneSet
    metadata:
      labels:
        app: web-server
      name: web-server
      namespace: web
    spec:
      replicas: 5
      selector:
        matchLabels:
          app: web-server
      template:
        metadata:
          labels:
            app: web-server
        spec:
          containers:
          - name: nginx
            image: nginx:alpine
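
The percentages in the manifest comments can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names `scaled` and `desired_available` are hypothetical, and rounding fractional results up is an assumption (the example values divide evenly, so rounding does not affect them). It also follows the stated rule that maxUnavailable takes precedence when both fields are set.

```python
from typing import Optional


def scaled(percent: int, replicas: int) -> int:
    """Return percent% of replicas, rounded up (the rounding mode is an assumption)."""
    return -(-replicas * percent // 100)


def desired_available(replicas: int,
                      max_unavailable_pct: Optional[int] = None,
                      min_available_pct: Optional[int] = None) -> int:
    """Minimum number of Pods that should stay available.

    maxUnavailable takes precedence over minAvailable when both are given,
    as described in the manifest comments.
    """
    if max_unavailable_pct is not None:
        return replicas - scaled(max_unavailable_pct, replicas)
    if min_available_pct is not None:
        return scaled(min_available_pct, replicas)
    raise ValueError("set either max_unavailable_pct or min_available_pct")


# CloneSet with 5 replicas and maxUnavailable: 60%
print(scaled(60, 5))                                 # 3 Pods may be unavailable
print(desired_available(5, max_unavailable_pct=60))  # 2 Pods must stay available
# The commented-out alternative, minAvailable: 40%, gives the same floor
print(desired_available(5, min_available_pct=40))    # 2
```

For this example the two settings are equivalent, which is why the manifest shows one of them commented out.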

Support Custom Workload

FEATURE STATE: Kruise v1.2.0

To meet more complex deployment needs, many companies manage their business Pods through custom workloads. Starting with Kruise v1.2.0, PodUnavailableBudget (PUB) can protect any custom workload that exposes the scale subresource, e.g. Argo Rollouts:

    apiVersion: policy.kruise.io/v1alpha1
    kind: PodUnavailableBudget
    metadata:
      name: rollouts-demo
    spec:
      targetRef:
        apiVersion: argoproj.io/v1alpha1
        kind: Rollout
        name: rollouts-demo
      minAvailable: 80%

Implementation

Kruise defines the PodUnavailableBudget (hereafter PUB) custom resource (CRD) to describe the desired state of the application. The working mechanism is shown below:

[Figure: PodUnavailableBudget working mechanism]

Comparison with Kubernetes native PDB

Kubernetes PodDisruptionBudget protects against Pod eviction through the EvictionREST interface, while PodUnavailableBudget intercepts all Pod modification requests through a validating admission webhook (many voluntary disruption scenarios can be summarized as modifications to Pod resources) and rejects a request if the modification does not satisfy the desired state of the PUB.
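
As a rough mental model of this check, the following sketch decides whether a Pod modification may proceed given a PUB's remaining budget. It is a simplified illustration, not Kruise's actual webhook code: the `Budget` class and the `admit` function are hypothetical names, and the idea that each admitted disruption consumes one unit of the budget is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical stand-in for the relevant part of a PUB's status."""
    unavailable_allowed: int


def admit(budget: Budget, makes_pod_unavailable: bool) -> bool:
    """Return True if the request is allowed, False if it should be rejected."""
    if not makes_pod_unavailable:
        return True                  # harmless change, nothing to protect
    if budget.unavailable_allowed <= 0:
        return False                 # budget exhausted: reject the request
    budget.unavailable_allowed -= 1  # consume one unit of the budget
    return True


budget = Budget(unavailable_allowed=1)
print(admit(budget, makes_pod_unavailable=True))   # True
print(admit(budget, makes_pod_unavailable=True))   # False
print(admit(budget, makes_pod_unavailable=False))  # True
```

Because every modification path goes through the same check, two independent actors (for example an in-place upgrade and a node drain) draw from one shared budget instead of each assuming the full allowance.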

PUB covers all the protection capabilities of Kubernetes PDB. You can use both together, or use PUB on its own to protect your application (recommended).

Feature Gates

PodUnavailableBudget protection for Pods is turned off by default. To turn it on, enable the feature gates PodUnavailableBudgetDeleteGate and PodUnavailableBudgetUpdateGate:

    $ helm install kruise https://... --set featureGates="PodUnavailableBudgetDeleteGate=true\,PodUnavailableBudgetUpdateGate=true"

PodUnavailableBudget Status

    # kubectl describe podunavailablebudgets web-server-pub
    Name:         web-server-pub
    Kind:         PodUnavailableBudget
    Status:
      unavailableAllowed: 3   # number of Pods that are currently allowed to become unavailable
      currentAvailable: 5     # current number of available Pods
      desiredAvailable: 2     # minimum desired number of available Pods
      totalReplicas: 5        # total number of Pods counted by this PUB
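
The numbers in this output are consistent with a simple relationship between the fields: the allowed unavailability is the gap between current and desired availability. The sketch below assumes that relationship, and that the value never drops below zero. The function name is hypothetical and the formula is inferred from the example values, not taken from Kruise's source.

```python
def unavailable_allowed(current_available: int, desired_available: int) -> int:
    """Pods that may still become unavailable without breaking the budget (assumed formula)."""
    return max(current_available - desired_available, 0)


# Values from the status above: currentAvailable=5, desiredAvailable=2
print(unavailable_allowed(5, 2))  # 3
# If only 2 Pods were still available, no further disruption would be allowed
print(unavailable_allowed(2, 2))  # 0
```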