CPU Burst

Introduction

CPU Burst is a service level objective (SLO)-aware resource scheduling feature provided by Koordinator. You can use CPU Burst to improve the performance of latency-sensitive applications. The kernel may throttle the CPU scheduling of a container when the container reaches its CPU limit, which degrades the performance of the application. The koordlet component automatically detects CPU throttling events and adjusts the CPU limit to a proper value, which can significantly improve the performance of latency-sensitive applications.

How CPU Burst works

Kubernetes allows you to specify CPU limits, which can be reused based on time-sharing. If you specify a CPU limit for a container, the OS limits the amount of CPU time that the container can use within a specific time period. For example, if you set the CPU limit of a container to 2, the OS kernel limits the CPU time that the container can use to 200 milliseconds within each 100-millisecond period.
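The arithmetic behind this example can be written out as a minimal sketch. It assumes the default CFS period of 100 milliseconds; the function name cfs_quota_us is only illustrative and is not part of Kubernetes or Koordinator.

```python
CFS_PERIOD_US = 100_000  # assumed default CFS period: 100 ms, in microseconds


def cfs_quota_us(cpu_limit: float, period_us: int = CFS_PERIOD_US) -> int:
    """Return the CPU time (in microseconds) a container may use per period."""
    if cpu_limit <= 0:
        raise ValueError("cpu_limit must be positive")
    return int(round(cpu_limit * period_us))


print(cfs_quota_us(2))    # 200000 us, that is, 200 ms per 100 ms period
print(cfs_quota_us(0.5))  # 50000 us, that is, 50 ms per 100 ms period
```

A limit above 1 yields a quota longer than the period because the container's threads can run on several CPUs at the same time.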

CPU utilization is a key metric that is used to evaluate the performance of a container. In most cases, the CPU limit is specified based on CPU utilization. CPU utilization on a per-millisecond basis shows more spikes than on a per-second basis. If the CPU utilization of a container reaches the limit within a 100-millisecond period, CPU throttling is enforced by the OS kernel and threads in the container are suspended for the rest of the time period, as shown in the following figure.

image

The following figure shows the thread allocation of a web application container that runs on a node with four vCPUs. The CPU limit of the container is set to 2. The overall CPU utilization within the last second is low. However, Thread 2 cannot be resumed until the third 100-millisecond period starts because CPU throttling is enforced somewhere in the second 100-millisecond period. This increases the response time (RT) and causes long-tail latency problems in containers.

image
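The situation in the figure can be approximated with a simplified model. The sketch below does not reproduce the kernel scheduler: it only tracks, for each 100-millisecond period, how much CPU time the container asks for and how much the quota allows. The demand values are made up for illustration.

```python
def simulate_throttling(demand_ms, quota_ms):
    """Return one (cpu_used_ms, throttled) tuple per period.

    Work that does not fit into a period's quota is carried over to the
    next period, which is what delays request processing.
    """
    backlog = 0.0
    result = []
    for demand in demand_ms:
        wanted = backlog + demand
        used = min(wanted, quota_ms)
        backlog = wanted - used
        result.append((used, backlog > 0))
    return result


# Ten 100 ms periods (one second). CPU limit 2 means a quota of 200 ms per period.
demand = [20, 350] + [20] * 8          # one short spike in the second period
periods = simulate_throttling(demand, quota_ms=200)

utilization = sum(demand) / (200 * len(demand))
throttled_periods = sum(1 for _, throttled in periods if throttled)
print(f"average utilization: {utilization:.1%}")   # 26.5%
print(f"throttled periods: {throttled_periods}")   # 1
```

Even though the container uses only about a quarter of its limit over the whole second, the spike in the second period exhausts the quota and the remaining work has to wait for the third period.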

The upstream Linux kernel (version 5.14 and later) and Anolis OS both provide the Burstable CFS Controller, that is, the CPU Burst feature. It allows a container to accumulate CPU time slices when the container is idle. The container can use the accumulated CPU time slices to burst above the CPU limit when CPU utilization spikes. This improves performance and reduces the RT of the container.

image
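The accumulation mechanism can be illustrated with a small model. This is a sketch under simplifying assumptions, not the kernel implementation: unused CPU time from one period is saved, up to a burst buffer, and can be spent in a later period. The parameter names and demand values are hypothetical.

```python
def simulate_burst(demand_ms, quota_ms, burst_ms):
    """Return one (cpu_used_ms, throttled) tuple per period.

    Unused quota is saved up to burst_ms and added to the budget of the
    following periods. With burst_ms=0 this behaves like a plain CPU limit.
    """
    saved = 0.0     # accumulated unused CPU time, capped at burst_ms
    backlog = 0.0   # work that could not run yet
    result = []
    for demand in demand_ms:
        wanted = backlog + demand
        budget = quota_ms + saved
        used = min(wanted, budget)
        backlog = wanted - used
        saved = min(burst_ms, budget - used)
        result.append((used, backlog > 0))
    return result


demand = [20, 350] + [20] * 8   # same made-up spike: 350 ms wanted in one period

without_burst = simulate_burst(demand, quota_ms=200, burst_ms=0)
with_burst = simulate_burst(demand, quota_ms=200, burst_ms=200)

print(sum(1 for _, t in without_burst if t))  # 1 throttled period
print(sum(1 for _, t in with_burst if t))     # 0 throttled periods
```

In this model the idle first period leaves 180 ms unused, which is enough to absorb the 350 ms spike in the second period without throttling.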

For kernel versions that do not support CPU Burst, koordlet detects CPU throttling events and dynamically adjusts the CPU limit to achieve the same effect as CPU Burst.

For more information about CPU Burst, see the KubeCon 2021 presentation "CPU Burst: Getting Rid of Unnecessary Throttling, Achieving High CPU Utilization and Application Performance at the Same Time".

Setup

Prerequisites

  • Kubernetes >= 1.18
  • Koordinator >= 0.3

Installation

Please make sure Koordinator components are correctly installed in your cluster. If not, please refer to Installation.

Configurations

By default, koordlet enables the CPU Burst feature (-feature-gates=AllAlpha=true). If the feature is not enabled, enable it manually by updating the feature gate in the koordlet DaemonSet.

NOTE: CPU Burst is not available for LSR and BE pods because it targets burstable CPU usage.

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: koordlet
    spec:
      selector:
        matchLabels:
          koord-app: koordlet
      template:
        metadata:
          labels:
            koord-app: koordlet
        spec:
          containers:
            - command:
                - /koordlet
              args:
                - -CgroupRootDir=/host-cgroup/
                - -feature-gates=XXXX,CPUBurst=true # enable CPU Burst feature
                ...

Use CPU Burst

Use an annotation to enable CPU Burst for the pod

Add the following annotation to the pod configuration to enable CPU Burst:

    apiVersion: v1
    kind: Pod
    metadata:
      name: demo-pod-xxx
      annotations:
        # Set the value to auto to enable CPU Burst for the pod.
        koordinator.sh/cpuBurst: '{"policy": "auto"}'
        # To disable CPU Burst for the pod, set the value to none.
        #koordinator.sh/cpuBurst: '{"policy": "none"}'
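Because the annotation value is a JSON string, it can be convenient to generate it rather than type it by hand. The following is a minimal sketch; the helper cpu_burst_annotation is hypothetical and not part of any Koordinator client library. The accepted policy names are taken from the policy values described in this document.

```python
import json

# Policy values described in this document.
VALID_POLICIES = {"none", "cpuBurstOnly", "cfsQuotaBurstOnly", "auto"}


def cpu_burst_annotation(policy, **overrides):
    """Build the koordinator.sh/cpuBurst annotation as a one-entry dict.

    overrides are optional extra fields such as cpuBurstPercent.
    """
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown CPU Burst policy: {policy!r}")
    value = {"policy": policy, **overrides}
    return {"koordinator.sh/cpuBurst": json.dumps(value)}


print(cpu_burst_annotation("auto"))
# {'koordinator.sh/cpuBurst': '{"policy": "auto"}'}
```

The resulting dictionary can be merged into the metadata.annotations field of a pod manifest before the manifest is serialized to YAML.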

Use a ConfigMap to enable CPU Burst for all pods in a cluster

Modify the slo-controller-config ConfigMap based on the following content to enable CPU Burst for all pods in a cluster:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: slo-controller-config
      namespace: koordinator-system
    data:
      cpu-burst-config: '{"clusterStrategy": {"policy": "auto"}}'
      #cpu-burst-config: '{"clusterStrategy": {"policy": "cpuBurstOnly"}}'
      #cpu-burst-config: '{"clusterStrategy": {"policy": "none"}}'

(Optional) Advanced Settings

The following code block shows the pod annotations and ConfigMap fields that you can use for advanced configurations:

    # Example of the slo-controller-config ConfigMap.
    data:
      cpu-burst-config: |
        {
          "clusterStrategy": {
            "policy": "auto",
            "cpuBurstPercent": 1000,
            "cfsQuotaBurstPercent": 300,
            "sharePoolThresholdPercent": 50,
            "cfsQuotaBurstPeriodSeconds": -1
          }
        }

    # Example of pod annotations.
    koordinator.sh/cpuBurst: '{"policy": "auto", "cpuBurstPercent": 1000, "cfsQuotaBurstPercent": 300, "cfsQuotaBurstPeriodSeconds": -1}'

The following table describes the ConfigMap fields that you can use for advanced configurations of CPU Burst.

| Field | Data type | Description |
|---|---|---|
| policy | string | • none: disables CPU Burst. If you set the value to none, the related fields are reset to their original values. This is the default value. • cpuBurstOnly: enables only the CPU Burst feature provided by the kernel of Anolis OS or the upstream Linux kernel >= 5.14. • cfsQuotaBurstOnly: enables automatic adjustment of CFS quotas on general kernel versions. • auto: enables CPU Burst and all the related features. |
| cpuBurstPercent | int | Default value: 1000. Unit: %. This field specifies the percentage to which the CPU limit can be increased by CPU Burst. If the CPU limit is set to 1, CPU Burst can increase the limit to 10 by default. |
| cfsQuotaBurstPercent | int | Default value: 300. Unit: %. This field specifies the maximum percentage to which the value of cfs_quota in the cgroup parameters can be increased. By default, the value of cfs_quota can be increased to at most three times the original value. |
| cfsQuotaBurstPeriodSeconds | int | Default value: -1. Unit: seconds. This field specifies the time period in which the container can run with an increased CFS quota, which cannot exceed the upper limit specified by cfsQuotaBurstPercent. The default value -1 indicates that the time period is unlimited. |
| sharePoolThresholdPercent | int | Default value: 50. Unit: %. This field specifies the CPU utilization threshold of the node. If the CPU utilization of the node exceeds the threshold, the value of cfs_quota in the cgroup parameters is reset to the original value. |
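To make the percentages concrete, the following sketch applies the default values to a given CPU limit. It only mirrors the field descriptions above and assumes a 100-millisecond CFS period; it is not koordlet's actual computation, and the function and key names are hypothetical.

```python
def burst_ceilings(cpu_limit, cpu_burst_percent=1000,
                   cfs_quota_burst_percent=300, period_us=100_000):
    """Return the upper bounds implied by the percentages described above."""
    base_quota_us = int(cpu_limit * period_us)
    return {
        # cfs_quota that corresponds to the plain CPU limit
        "base_cfs_quota_us": base_quota_us,
        # CPU limit that CPU Burst can reach (cpuBurstPercent)
        "max_cpu_with_burst": cpu_limit * cpu_burst_percent / 100,
        # highest cfs_quota that automatic adjustment may set (cfsQuotaBurstPercent)
        "max_cfs_quota_us": base_quota_us * cfs_quota_burst_percent // 100,
    }


print(burst_ceilings(1))
# {'base_cfs_quota_us': 100000, 'max_cpu_with_burst': 10.0, 'max_cfs_quota_us': 300000}
```

With the defaults, a container whose CPU limit is 1 can burst up to 10 CPUs, and its cfs_quota can be raised to at most three times the original 100 milliseconds per period.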

Verify CPU Burst

  1. Use the following YAML template to create an apache-demo.yaml file.

To enable CPU Burst for a pod, specify an annotation in the annotations parameter of the metadata section of the pod configuration.

    apiVersion: v1
    kind: Pod
    metadata:
      name: apache-demo
      annotations:
        koordinator.sh/cpuBurst: '{"policy": "auto"}' # Use this annotation to enable or disable CPU Burst.
    spec:
      containers:
      - command:
        - httpd
        - -D
        - FOREGROUND
        image: koordinatorsh/apache-2-4-51-for-slo-test:v0.1
        imagePullPolicy: Always
        name: apache
        resources:
          limits:
            cpu: "4"
            memory: 10Gi
          requests:
            cpu: "4"
            memory: 10Gi
      nodeName: # $nodeName Set the value to the name of the node that you use.
      hostNetwork: False
      restartPolicy: Never
      schedulerName: default-scheduler

  2. Run the following command to create an application by using Apache HTTP Server.

    kubectl apply -f apache-demo.yaml

  3. Use the wrk2 tool to perform stress tests.

    # Download, decompress, and then install the wrk2 package.
    # The Gzip module is enabled in the configuration of the Apache application. The Gzip module is used to simulate the logic of processing requests on the server.
    # Run the following command to send requests. Replace the IP address in the command with the IP address of the application.
    ./wrk -H "Accept-Encoding: deflate, gzip" -t 2 -c 12 -d 120 --latency --timeout 2s -R 24 http://$target_ip_address:8010/static/file.1m.test

  4. Compare the results with CPU Burst enabled and disabled.

For example, you may get the following results:

| CentOS 7 | Disabled | Enabled |
|---|---|---|
| apache RT-p99 | 111.69 ms | 71.30 ms (-36.2%) |
| CPU Throttled Ratio | 33% | 0% |
| Average pod CPU utilization | 32.5% | 33.8% |

The preceding metrics indicate the following information:

  • After CPU Burst is enabled, the P99 latency of apache drops by about 36%, from 111.69 ms to 71.30 ms.
  • After CPU Burst is enabled, CPU throttling no longer occurs (the throttled ratio drops from 33% to 0%), and the average pod CPU utilization remains approximately the same.
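One common way to obtain a throttled ratio is to divide nr_throttled by nr_periods from the cpu.stat file of the container's cgroup. The sketch below assumes that definition, which may differ from how the figures in this document were collected. The sample cpu.stat content is made up for illustration, and the helper names are hypothetical.

```python
def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Fraction of enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


def relative_change_percent(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Made-up sample, not measured data.
sample = "nr_periods 1200\nnr_throttled 396\nthrottled_time 5123456789\n"
print(f"{throttled_ratio(parse_cpu_stat(sample)):.0%}")    # 33%

# P99 latency change reported in the results above.
print(f"{relative_change_percent(111.69, 71.30):.1f}%")    # -36.2%
```

Reading cpu.stat before and after a test run and taking the difference of the counters gives the ratio for that run only, rather than for the whole lifetime of the container.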