Intelligent Autoscaling Practices Based on Effective HPA for Custom Metrics

Best Practices for Effective HPA

The Kubernetes HPA supports rich autoscaling capabilities: Kubernetes platform developers deploy services that implement the custom metrics API, and Kubernetes users configure multiple built-in resource metrics or custom metrics to achieve horizontal autoscaling. Effective HPA is compatible with the community Kubernetes HPA capabilities and provides smarter autoscaling policies, such as prediction-based autoscaling and Cron-based autoscaling. Prometheus is a popular open-source monitoring system through which user-defined metrics can be collected.

In this article, we present an example of how to implement intelligent autoscaling on custom metrics based on Effective HPA. Some configurations are taken from the official documentation.

Environment Requirements

  • Kubernetes 1.18+
  • Helm 3.1.0
  • Crane v0.6.0+
  • Prometheus

Refer to the installation documentation to install Crane in the cluster. Prometheus can be either the one from the installation documentation or an already-deployed Prometheus.

Environment Setup

Installing PrometheusAdapter

The Crane component Metric-Adapter and PrometheusAdapter are both implemented on top of custom-metrics-apiserver. When installing Crane, the corresponding ApiService is installed for Crane's Metric-Adapter, so you need to remove that ApiService before installing PrometheusAdapter to ensure that the Helm installation succeeds.

```shell
# View the current ApiService
kubectl get apiservice
```

Since Crane is installed, the result is as follows.

```
NAME                                    SERVICE                       AVAILABLE   AGE
v1beta1.batch                           Local                         True        35d
v1beta1.custom.metrics.k8s.io           crane-system/metric-adapter   True        18d
v1beta1.discovery.k8s.io                Local                         True        35d
v1beta1.events.k8s.io                   Local                         True        35d
v1beta1.external.metrics.k8s.io         crane-system/metric-adapter   True        18d
v1beta1.flowcontrol.apiserver.k8s.io    Local                         True        35d
v1beta1.metrics.k8s.io                  kube-system/metrics-service   True        35d
```

Remove the ApiService installed by Crane:

```shell
kubectl delete apiservice v1beta1.custom.metrics.k8s.io
kubectl delete apiservice v1beta1.external.metrics.k8s.io
```
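Before proceeding, you can confirm the removal by listing the remaining ApiServices (a quick sanity check; the grep pattern is just illustrative):

```shell
# Both metric-adapter entries should no longer appear in this list
kubectl get apiservice | grep -E 'custom.metrics|external.metrics'
```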

Install PrometheusAdapter via Helm

```shell
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-adapter -n crane-system prometheus-community/prometheus-adapter
```

Then change the ApiService back to Crane’s Metric-Adapter

```shell
kubectl apply -f https://raw.githubusercontent.com/gocrane/crane/main/deploy/metric-adapter/apiservice.yaml
```

Configure Metric-Adapter to enable RemoteAdapter functionality

The PrometheusAdapter installation above did not point the ApiService at PrometheusAdapter, so in order to let PrometheusAdapter also serve custom metrics, the RemoteAdapter feature of Crane's Metric-Adapter is used to forward requests to PrometheusAdapter.

Modify the Metric-Adapter deployment to configure PrometheusAdapter's Service as the RemoteAdapter of Crane's Metric-Adapter:

```shell
# Edit the metric-adapter deployment
kubectl edit deploy metric-adapter -n crane-system
```

Make the following changes based on the PrometheusAdapter configuration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metric-adapter
  namespace: crane-system
spec:
  template:
    spec:
      containers:
      - args:
        # Add external Adapter configuration
        - --remote-adapter=true
        - --remote-adapter-service-namespace=crane-system
        - --remote-adapter-service-name=prometheus-adapter
        - --remote-adapter-service-port=443
```

RemoteAdapter Capabilities


Kubernetes restricts an ApiService to configuring only one backend service, so in order to use both the metrics provided by Crane and those provided by PrometheusAdapter within one cluster, Crane supports a RemoteAdapter to solve this problem:

  • Crane's Metric-Adapter supports configuring a Kubernetes Service as a Remote Adapter
  • Crane's Metric-Adapter first checks whether a request is for a Crane-provided local metric and, if not, forwards it to the Remote Adapter
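With the RemoteAdapter in place, locally served and forwarded metrics appear through the same aggregated API. A quick way to see this (a sketch, assuming jq is installed):

```shell
# List all custom metrics served through the single ApiService:
# Crane's local metrics and PrometheusAdapter's forwarded metrics
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq '.resources[].name'
```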

Run the example

Preparing the application

Deploy the following application to the cluster; it exposes a metric showing the number of HTTP requests received per second.

sample-app.deploy.yaml

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sample-app
  labels:
    app: sample-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sample-app
  template:
    metadata:
      labels:
        app: sample-app
    spec:
      containers:
      - image: luxas/autoscale-demo:v0.1.2
        name: metrics-provider
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
        ports:
        - name: http
          containerPort: 8080
```

sample-app.service.yaml

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: sample-app
  name: sample-app
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: sample-app
  type: ClusterIP
```

```shell
kubectl create -f sample-app.deploy.yaml
kubectl create -f sample-app.service.yaml
```

When the application is deployed, you can check the http_requests_total metric with the following command:

```shell
curl http://$(kubectl get service sample-app -o jsonpath='{ .spec.clusterIP }')/metrics
```
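If you want the request-rate metric to move later in the walkthrough, you can generate some traffic against the Service (a hypothetical load loop, reusing the same clusterIP lookup as the curl above):

```shell
# Hypothetical load generator: issue roughly 10 requests per second to sample-app
SVC_IP=$(kubectl get service sample-app -o jsonpath='{ .spec.clusterIP }')
while true; do
  curl -s "http://${SVC_IP}/metrics" > /dev/null
  sleep 0.1
done
```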

Configure collection rules

Configure Prometheus' scrape config to collect the application's metric http_requests_total:

```shell
kubectl edit configmap -n crane-system prometheus-server
```

Add the following configuration

```yaml
- job_name: sample-app
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    regex: default;sample-app-(.+)
    source_labels:
    - __meta_kubernetes_namespace
    - __meta_kubernetes_pod_name
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod
```

At this point, you can query the metric in Prometheus with PromQL: sum(rate(http_requests_total[5m])) by (pod)
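The same query can also be issued against the Prometheus HTTP API directly (a sketch; the Service name and port follow the common Helm-chart defaults and may differ in your cluster):

```shell
# Port-forward Prometheus locally, then run the PromQL query via its HTTP API
kubectl port-forward -n crane-system svc/prometheus-server 9090:80 &
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(http_requests_total[5m])) by (pod)'
```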

Verify PrometheusAdapter

The default rule configuration of PrometheusAdapter supports converting http_requests_total into a custom metric of type Pods, which can be verified with the command:

```shell
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq .
```

The result should include pods/http_requests:

```json
{
  "name": "pods/http_requests",
  "singularName": "",
  "namespaced": true,
  "kind": "MetricValueList",
  "verbs": [
    "get"
  ]
}
```

This indicates that the HPA can now be configured with this Pod metric.
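You can also fetch the current values of the metric for the sample pods through the aggregated API (the URL pattern is the standard custom-metrics one, assuming the app runs in the default namespace):

```shell
# Query http_requests for all pods in the default namespace
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests" | jq .
```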

Configuring autoscaling

We can now create the Effective HPA. At this point, the Effective HPA can scale based on the Pod metric http_requests:

How to define a custom metric to enable prediction

Add the configuration to the Effective HPA's annotations according to the following rules:

```yaml
annotations:
  # metric-query.autoscaling.crane.io is a fixed prefix followed by the metric name, which must match Metric.name in spec.metrics; Pods and External metric types are supported
  metric-query.autoscaling.crane.io/http_requests: "sum(rate(http_requests_total[5m])) by (pod)"
```

sample-app-hpa.yaml

```yaml
apiVersion: autoscaling.crane.io/v1alpha1
kind: EffectiveHorizontalPodAutoscaler
metadata:
  name: php-apache
  annotations:
    # metric-query.autoscaling.crane.io is a fixed prefix followed by the metric name, which must match Metric.name in spec.metrics; Pods and External metric types are supported
    metric-query.autoscaling.crane.io/http_requests: "sum(rate(http_requests_total[5m])) by (pod)"
spec:
  # ScaleTargetRef is the reference to the workload that should be scaled.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
  minReplicas: 1        # MinReplicas is the lower limit replicas to the scale target which the autoscaler can scale down to.
  maxReplicas: 10       # MaxReplicas is the upper limit replicas to the scale target which the autoscaler can scale up to.
  scaleStrategy: Auto   # ScaleStrategy indicates the strategy for scaling the target; value can be "Auto" or "Manual".
  # Metrics contains the specifications used to calculate the desired replica count.
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: http_requests
      target:
        type: AverageValue
        averageValue: 500m
  # Prediction defines configurations for predicting resources.
  # If unspecified, defaults don't enable prediction.
  prediction:
    predictionWindowSeconds: 3600   # PredictionWindowSeconds is the time window to predict metrics in the future.
    predictionAlgorithm:
      algorithmType: dsp
      dsp:
        sampleInterval: "60s"
        historyLength: "7d"
```

```shell
kubectl create -f sample-app-hpa.yaml
```

Check the TimeSeriesPrediction status; the prediction may be unavailable if the application has only been running for a short time:
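The TimeSeriesPrediction object can be retrieved with the full CRD resource name (the object name matches the effective-hpa-controller labels in the output):

```shell
kubectl get timeseriesprediction ehpa-php-apache -n default -o yaml
```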

```yaml
apiVersion: prediction.crane.io/v1alpha1
kind: TimeSeriesPrediction
metadata:
  creationTimestamp: "2022-07-11T16:10:09Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: effective-hpa-controller
    app.kubernetes.io/name: ehpa-php-apache
    app.kubernetes.io/part-of: php-apache
    autoscaling.crane.io/effective-hpa-uid: 1322c5ac-a1c6-4c71-98d6-e85d07b22da0
  name: ehpa-php-apache
  namespace: default
spec:
  predictionMetrics:
  - algorithm:
      algorithmType: dsp
      dsp:
        estimators: {}
        historyLength: 7d
        sampleInterval: 60s
    resourceIdentifier: crane_pod_cpu_usage
    resourceQuery: cpu
    type: ResourceQuery
  - algorithm:
      algorithmType: dsp
      dsp:
        estimators: {}
        historyLength: 7d
        sampleInterval: 60s
    expressionQuery:
      expression: sum(rate(http_requests_total[5m])) by (pod)
    resourceIdentifier: crane_custom.pods_http_requests
    type: ExpressionQuery
  predictionWindowSeconds: 3600
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
    namespace: default
status:
  conditions:
  - lastTransitionTime: "2022-07-12T06:54:42Z"
    message: not all metric predicted
    reason: PredictPartial
    status: "False"
    type: Ready
  predictionMetrics:
  - ready: false
    resourceIdentifier: crane_pod_cpu_usage
  - prediction:
    - labels:
      - name: pod
        value: sample-app-7cfb596f98-8h5vv
      samples:
      - timestamp: 1657608900
        value: "0.01683"
      - timestamp: 1657608960
        value: "0.01683"
      ......
    ready: true
    resourceIdentifier: crane_custom.pods_http_requests
```

Looking at the HPA object created by the Effective HPA, you can see that a metric based on the custom-metric prediction has been added: crane_custom.pods_http_requests.
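The underlying HPA can be dumped with the command below (its name is derived from the Effective HPA, as in the labels above):

```shell
kubectl get hpa ehpa-php-apache -n default -o yaml
```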

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  creationTimestamp: "2022-07-11T16:10:10Z"
  labels:
    app.kubernetes.io/managed-by: effective-hpa-controller
    app.kubernetes.io/name: ehpa-php-apache
    app.kubernetes.io/part-of: php-apache
    autoscaling.crane.io/effective-hpa-uid: 1322c5ac-a1c6-4c71-98d6-e85d07b22da0
  name: ehpa-php-apache
  namespace: default
spec:
  maxReplicas: 10
  metrics:
  - pods:
      metric:
        name: http_requests
      target:
        averageValue: 500m
        type: AverageValue
    type: Pods
  - pods:
      metric:
        name: crane_custom.pods_http_requests
        selector:
          matchLabels:
            autoscaling.crane.io/effective-hpa-uid: 1322c5ac-a1c6-4c71-98d6-e85d07b22da0
      target:
        averageValue: 500m
        type: AverageValue
    type: Pods
  - resource:
      name: cpu
      target:
        averageUtilization: 50
        type: Utilization
    type: Resource
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sample-app
```

Summary

Due to the complexity of production environments, autoscaling on multiple metrics (CPU, memory, and custom metrics) is a common choice for production applications. By combining multi-metric autoscaling with predictive algorithms, Effective HPA helps more businesses land horizontal autoscaling in production environments.