Testing HPAs with kubectl

This document describes how to check the status of your HPAs after scaling them up or down with your load testing tool. For information on how to check the status from the Rancher UI (at least version 2.3.x), refer to Managing HPAs with the Rancher UI.

For HPA to work correctly, the containers in your service deployments must define resource requests. Follow this hello-world example to test whether HPA is working correctly.
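
The requests matter because a CPU utilization target is expressed as a percentage of the container's CPU request, not as an absolute amount. As a rough illustration, the helper below reproduces the percentages that kubectl describe hpa reports later in this document for the 500m request in the hello-world manifest. The function name cpu_utilization_pct is made up for this sketch, and truncating to a whole number is an assumption that happens to match the sample output.

```shell
# cpu_utilization_pct USAGE_MILLICORES REQUEST_MILLICORES
# Prints average CPU usage as a whole-number percentage of the CPU request.
# Hypothetical helper for illustration only; it is not part of kubectl.
cpu_utilization_pct() {
  echo $(( $1 * 100 / $2 ))
}

# Figures taken from the sample output in this document (request: 500m).
cpu_utilization_pct 280 500   # prints 56
cpu_utilization_pct 333 500   # prints 66
```

Without a CPU request there is no denominator for this calculation, so a utilization-based target cannot be evaluated.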

  1. Configure kubectl to connect to your Kubernetes cluster.

  2. Copy the hello-world deployment manifest below.

    Hello World Manifest

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: hello-world
      name: hello-world
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: hello-world
      strategy:
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 0
        type: RollingUpdate
      template:
        metadata:
          labels:
            app: hello-world
        spec:
          containers:
          - image: rancher/hello-world
            imagePullPolicy: Always
            name: hello-world
            resources:
              requests:
                cpu: 500m
                memory: 64Mi
            ports:
            - containerPort: 80
              protocol: TCP
          restartPolicy: Always
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: hello-world
      namespace: default
    spec:
      ports:
      - port: 80
        protocol: TCP
        targetPort: 80
      selector:
        app: hello-world
  3. Deploy it to your cluster.

    # kubectl create -f <HELLO_WORLD_MANIFEST>
  4. Copy one of the HPAs below, based on the metric type you’re using, and create it in your cluster the same way.

    Hello World HPA: Resource Metrics

    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: hello-world
      namespace: default
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hello-world
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          targetAverageUtilization: 50
      - type: Resource
        resource:
          name: memory
          targetAverageValue: 100Mi

    Hello World HPA: Custom Metrics

    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: hello-world
      namespace: default
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hello-world
      minReplicas: 1
      maxReplicas: 10
      metrics:
      - type: Resource
        resource:
          name: cpu
          targetAverageUtilization: 50
      - type: Resource
        resource:
          name: memory
          targetAverageValue: 100Mi
      - type: Pods
        pods:
          metricName: cpu_system
          targetAverageValue: 20m
  5. View the HPA info and description. Confirm that metric data is shown.

    Resource Metrics

    1. Enter the following commands.

      # kubectl get hpa
      NAME          REFERENCE                TARGETS                     MINPODS   MAXPODS   REPLICAS   AGE
      hello-world   Deployment/hello-world   1253376 / 100Mi, 0% / 50%   1         10        1          6m
      # kubectl describe hpa
      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Mon, 23 Jul 2018 20:21:16 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            1253376 / 100Mi
        resource cpu on pods (as a percentage of request):  0% (0) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    ReadyForNewScale    the last scale time was sufficiently old as to warrant a new scale
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:                                               <none>

    Custom Metrics

    1. Enter the following command.

      # kubectl describe hpa

      You should receive the output that follows.

      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Tue, 24 Jul 2018 18:36:28 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            3514368 / 100Mi
        "cpu_system" on pods:                               0 / 20m
        resource cpu on pods (as a percentage of request):  0% (0) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    ReadyForNewScale    the last scale time was sufficiently old as to warrant a new scale
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:                                               <none>
  6. Generate load against the service so you can test that your pods autoscale as intended. You can use any load-testing tool (Hey, Gatling, etc.); the examples here use Hey.
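
    As one possible way to drive the load with Hey, the sketch below wraps a single hey invocation in a small function. It relies only on Hey's -z (run duration) and -c (concurrent workers) flags. The function name generate_load and its default values are illustrative, and the default URL assumes that the command runs somewhere the hello-world Service is reachable by its in-cluster DNS name with the default cluster.local domain; pass a different URL if you expose the service another way.

    ```shell
    # generate_load [URL] [DURATION] [WORKERS]
    # Sends HTTP requests to the hello-world service for a fixed duration.
    # The defaults below are assumptions made for this sketch, not recommendations.
    generate_load() {
      url="${1:-http://hello-world.default.svc.cluster.local/}"
      duration="${2:-5m}"
      workers="${3:-50}"
      # -z: how long to keep sending requests; -c: number of concurrent workers
      hey -z "$duration" -c "$workers" "$url"
    }

    # Example: ten minutes of load from 100 concurrent workers.
    # generate_load http://hello-world.default.svc.cluster.local/ 10m 100
    ```

    Raise the number of workers if CPU usage stays below the target, and stop the run when you want to observe the downscale behavior.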

  7. Test that pod autoscaling works as intended.

    To Test Autoscaling Using Resource Metrics:

    Upscale to 2 Pods: CPU Usage Up to Target

    Use your load testing tool to scale up to two pods based on CPU usage.

    1. View your HPA.

      # kubectl describe hpa

      You should receive output similar to what follows.

      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Mon, 23 Jul 2018 22:22:04 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            10928128 / 100Mi
        resource cpu on pods (as a percentage of request):  56% (280m) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  13s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
    2. Enter the following command to confirm you’ve scaled to two pods.

      # kubectl get pods

      You should receive output similar to what follows.

      NAME                           READY   STATUS    RESTARTS   AGE
      hello-world-54764dfbf8-k8ph2   1/1     Running   0          1m
      hello-world-54764dfbf8-q6l4v   1/1     Running   0          3h

    Upscale to 3 Pods: CPU Usage Up to Target

    Use your load testing tool to scale up to three pods based on CPU usage, with horizontal-pod-autoscaler-upscale-delay set to 3 minutes.

    1. Enter the following command.

      # kubectl describe hpa

      You should receive output similar to what follows.

      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Mon, 23 Jul 2018 22:22:04 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            9424896 / 100Mi
        resource cpu on pods (as a percentage of request):  66% (333m) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  4m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  16s   horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
    2. Enter the following command to confirm three pods are running.

      # kubectl get pods

      You should receive output similar to what follows.

      NAME                           READY   STATUS    RESTARTS   AGE
      hello-world-54764dfbf8-f46kh   0/1     Running   0          1m
      hello-world-54764dfbf8-k8ph2   1/1     Running   0          5m
      hello-world-54764dfbf8-q6l4v   1/1     Running   0          3h

    Downscale to 1 Pod: All Metrics Below Target

    Stop or reduce the load from your load testing tool so that the deployment scales down to one pod once all metrics have stayed below target for horizontal-pod-autoscaler-downscale-delay (5 minutes by default).

    1. Enter the following command.

      # kubectl describe hpa

      You should receive output similar to what follows.

      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Mon, 23 Jul 2018 22:22:04 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            10070016 / 100Mi
        resource cpu on pods (as a percentage of request):  0% (0) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 1
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  10m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  6m    horizontal-pod-autoscaler  New size: 3; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  1s    horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
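
    The replica counts in the events above can be sanity-checked against the scaling rule described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas x current metric value / target value), bounded by the minimum and maximum replica counts. The sketch below is a simplified calculator for a single metric. The function name desired_replicas is hypothetical, and the sketch deliberately leaves out the controller's tolerance band, pod readiness handling, and the upscale and downscale delays. When several metrics are configured, the HPA uses the largest result.

    ```shell
    # desired_replicas CURRENT_REPLICAS CURRENT_VALUE TARGET_VALUE [MIN] [MAX]
    # Simplified, single-metric version of the HPA scaling rule (illustration only).
    # MIN and MAX default to 1 and 10, matching the hello-world HPA in this document.
    desired_replicas() {
      awk -v r="$1" -v c="$2" -v t="$3" -v lo="${4:-1}" -v hi="${5:-10}" 'BEGIN {
        raw = r * c / t
        d = int(raw)
        if (d < raw) d++        # round up to the next whole replica
        if (d < lo) d = lo      # never go below minReplicas
        if (d > hi) d = hi      # never exceed maxReplicas
        print d
      }'
    }

    desired_replicas 1 56 50   # 1 pod at 56% CPU, target 50%  -> prints 2
    desired_replicas 2 66 50   # 2 pods at 66% CPU, target 50% -> prints 3
    desired_replicas 3 0 50    # 3 idle pods, target 50%       -> prints 1
    ```

    The three calls use the CPU figures from the sample output above and give the same sizes that the events report.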

    To Test Autoscaling Using Custom Metrics:

    Upscale to 2 Pods: CPU Usage Up to Target

    Use your load testing tool to scale up to two pods based on CPU usage.

    1. Enter the following command.

      # kubectl describe hpa

      You should receive output similar to what follows.

      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Tue, 24 Jul 2018 18:01:11 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            8159232 / 100Mi
        "cpu_system" on pods:                               7m / 20m
        resource cpu on pods (as a percentage of request):  64% (321m) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  16s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
    2. Enter the following command to confirm two pods are running.

      # kubectl get pods

      You should receive output similar to what follows.

      NAME                           READY   STATUS    RESTARTS   AGE
      hello-world-54764dfbf8-5pfdr   1/1     Running   0          3s
      hello-world-54764dfbf8-q6l82   1/1     Running   0          6h

    Upscale to 3 Pods: cpu_system Usage Up to Target

    Use your load testing tool to scale up to three pods once cpu_system usage reaches its target.

    1. Enter the following command.

      # kubectl describe hpa

      You should receive output similar to what follows.

      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Tue, 24 Jul 2018 18:01:11 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            8374272 / 100Mi
        "cpu_system" on pods:                               27m / 20m
        resource cpu on pods (as a percentage of request):  71% (357m) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  3m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  3s    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
    2. Enter the following command to confirm three pods are running.

      # kubectl get pods

      You should receive output similar to what follows.

      NAME                           READY   STATUS    RESTARTS   AGE
      hello-world-54764dfbf8-5pfdr   1/1     Running   0          3m
      hello-world-54764dfbf8-m2hrl   1/1     Running   0          1s
      hello-world-54764dfbf8-q6l82   1/1     Running   0          6h

    Upscale to 4 Pods: CPU Usage Up to Target

    Use your load testing tool to scale up to four pods based on CPU usage. By default, horizontal-pod-autoscaler-upscale-delay is set to three minutes.

    1. Enter the following command.

      # kubectl describe hpa

      You should receive output similar to what follows.

      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Tue, 24 Jul 2018 18:01:11 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            8374272 / 100Mi
        "cpu_system" on pods:                               27m / 20m
        resource cpu on pods (as a percentage of request):  71% (357m) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 3
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  5m    horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  3m    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
        Normal  SuccessfulRescale  4s    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
    2. Enter the following command to confirm four pods are running.

      # kubectl get pods

      You should receive output similar to what follows.

      NAME                           READY   STATUS    RESTARTS   AGE
      hello-world-54764dfbf8-2p9xb   1/1     Running   0          5m
      hello-world-54764dfbf8-5pfdr   1/1     Running   0          2m
      hello-world-54764dfbf8-m2hrl   1/1     Running   0          1s
      hello-world-54764dfbf8-q6l82   1/1     Running   0          6h

    Downscale to 1 Pod: All Metrics Below Target

    Stop or reduce the load from your load testing tool so that the deployment scales down to one pod once all metrics have stayed below target for horizontal-pod-autoscaler-downscale-delay.

    1. Enter the following command.

      # kubectl describe hpa

      You should receive output similar to what follows.

      Name:                                                 hello-world
      Namespace:                                            default
      Labels:                                               <none>
      Annotations:                                          <none>
      CreationTimestamp:                                    Tue, 24 Jul 2018 18:01:11 +0200
      Reference:                                            Deployment/hello-world
      Metrics:                                              ( current / target )
        resource memory on pods:                            8101888 / 100Mi
        "cpu_system" on pods:                               8m / 20m
        resource cpu on pods (as a percentage of request):  0% (0) / 50%
      Min replicas:                                         1
      Max replicas:                                         10
      Conditions:
        Type            Status  Reason              Message
        ----            ------  ------              -------
        AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 1
        ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
        ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
      Events:
        Type    Reason             Age   From                       Message
        ----    ------             ----  ----                       -------
        Normal  SuccessfulRescale  10m   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  8m    horizontal-pod-autoscaler  New size: 3; reason: pods metric cpu_system above target
        Normal  SuccessfulRescale  5m    horizontal-pod-autoscaler  New size: 4; reason: cpu resource utilization (percentage of request) above target
        Normal  SuccessfulRescale  13s   horizontal-pod-autoscaler  New size: 1; reason: All metrics below target
    2. Enter the following command to confirm a single pod is running.

      # kubectl get pods

      You should receive output similar to what follows.

      NAME                           READY   STATUS    RESTARTS   AGE
      hello-world-54764dfbf8-q6l82   1/1     Running   0          6h
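
Because scaling in either direction is subject to the delays mentioned above, it can be convenient to poll rather than rerun kubectl get pods by hand. The following is a minimal sketch, assuming the hello-world Deployment in the default namespace from this document. The function name wait_for_replicas and its timeout and interval defaults are illustrative; the only kubectl feature it relies on is the jsonpath output format.

```shell
# wait_for_replicas EXPECTED [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls the hello-world Deployment until EXPECTED pods are ready or the
# timeout expires. Hypothetical helper; the defaults are arbitrary choices.
wait_for_replicas() {
  expected="$1"
  timeout="${2:-600}"
  interval="${3:-10}"
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    # readyReplicas is omitted from the status when no pod is ready,
    # so an empty result is treated as 0.
    ready="$(kubectl get deployment hello-world -n default \
      -o jsonpath='{.status.readyReplicas}')"
    if [ "${ready:-0}" -eq "$expected" ]; then
      echo "hello-world has $expected ready replica(s)"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for $expected ready replica(s)" >&2
  return 1
}

# Example: wait up to ten minutes for the deployment to return to one pod.
# wait_for_replicas 1 600 10
```

A nonzero exit status means the expected replica count was not reached in time, which makes the helper usable in a scripted test run.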