Understanding custom metrics autoscaler triggers

Triggers, also known as scalers, provide the metrics that the Custom Metrics Autoscaler Operator uses to scale your pods.

The custom metrics autoscaler currently supports only the Prometheus, CPU, memory, and Apache Kafka triggers.

You use a ScaledObject or ScaledJob custom resource to configure triggers for specific objects, as described in the sections that follow.
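
As a rough sketch of where triggers sit in a scaled object, the following skeleton shows the general layout. The names example-scaledobject, my-namespace, and example-app, and the replica counts, are placeholders chosen for illustration and are not taken from a real deployment.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-scaledobject   # hypothetical name
  namespace: my-namespace      # hypothetical namespace
spec:
  scaleTargetRef:
    kind: Deployment           # assumes the workload to scale is a deployment
    name: example-app          # hypothetical deployment name
  minReplicaCount: 1           # illustrative lower bound
  maxReplicaCount: 10          # illustrative upper bound
  triggers:                    # one list entry per trigger, or scaler
  - type: cpu
    metricType: Utilization
    metadata:
      value: '60'
```

Each entry under triggers takes a type and a metadata map whose keys depend on the trigger, as the following sections describe.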

Understanding the Prometheus trigger

You can scale pods based on Prometheus metrics, which can use the installed OKD monitoring or an external Prometheus server as the metrics source. See “Additional resources” for information on the configurations required to use the OKD monitoring as a source for metrics.

If Prometheus is collecting metrics from the application that the custom metrics autoscaler is scaling, do not set the minimum replicas to 0 in the custom resource. If there are no application pods, the custom metrics autoscaler does not have any metrics to scale on.
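
For instance, a scaled object that relies on metrics emitted by the application itself might keep at least one replica running. The fragment below is a minimal sketch; the replica counts are illustrative assumptions, not recommended values.

```yaml
spec:
  minReplicaCount: 1    # keep at least one pod so the application continues to emit metrics
  maxReplicaCount: 10   # illustrative upper bound
```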

Example scaled object with a Prometheus target

  apiVersion: keda.sh/v1alpha1
  kind: ScaledObject
  metadata:
    name: prom-scaledobject
    namespace: my-namespace
  spec:
    # ...
    triggers:
    - type: prometheus # (1)
      metadata:
        serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9092 # (2)
        namespace: kedatest # (3)
        metricName: http_requests_total # (4)
        threshold: '5' # (5)
        query: sum(rate(http_requests_total{job="test-app"}[1m])) # (6)
        authModes: basic # (7)
        cortexOrgID: my-org # (8)
        ignoreNullValues: false # (9)
        unsafeSsl: false # (10)
(1) Specifies Prometheus as the trigger type.
(2) Specifies the address of the Prometheus server. This example uses OKD monitoring.
(3) Optional: Specifies the namespace of the object you want to scale. This parameter is mandatory if using OKD monitoring as a source for the metrics.
(4) Specifies the name to identify the metric in the external.metrics.k8s.io API. If you are using more than one trigger, all metric names must be unique.
(5) Specifies the value that triggers scaling. Must be specified as a quoted string value.
(6) Specifies the Prometheus query to use.
(7) Specifies the authentication method to use. Prometheus scalers support bearer authentication (bearer), basic authentication (basic), or TLS authentication (tls). You configure the specific authentication parameters in a trigger authentication, as discussed in a following section. As needed, you can also use a secret.
(8) Optional: Passes the X-Scope-OrgID header to multi-tenant Cortex or Mimir storage for Prometheus. This parameter is required only with multi-tenant Prometheus storage, to indicate which data Prometheus should return.
(9) Optional: Specifies how the trigger should proceed if the Prometheus target is lost.
  • If true, the trigger continues to operate if the Prometheus target is lost. This is the default behavior.

  • If false, the trigger returns an error if the Prometheus target is lost.

(10) Optional: Specifies whether the certificate check should be skipped. For example, you might skip the check if you use self-signed certificates at the Prometheus endpoint.
  • If true, the certificate check is skipped.

  • If false, the certificate check is performed. This is the default behavior.

Configuring the custom metrics autoscaler to use OKD monitoring

You can use the installed OKD Prometheus monitoring as a source for the metrics used by the custom metrics autoscaler. However, there are some additional configurations you must perform.

These steps are not required for an external Prometheus source.

You must perform the following tasks, as described in this section:

  • Create a service account to get a token.

  • Create a role.

  • Add that role to the service account.

  • Reference the token in the trigger authentication object used by Prometheus.

Prerequisites

  • OKD monitoring must be installed.

  • Monitoring of user-defined workloads must be enabled in OKD monitoring, as described in the Creating a user-defined workload monitoring config map section.

  • The Custom Metrics Autoscaler Operator must be installed.

Procedure

  1. Change to the project with the object you want to scale:

    $ oc project my-project
  2. Use the following command to create a service account, if your cluster does not have one:

    $ oc create serviceaccount <service_account>

    where:

    <service_account>

    Specifies the name of the service account.

  3. Use the following command to locate the token assigned to the service account:

    $ oc describe serviceaccount <service_account>

    where:

    <service_account>

    Specifies the name of the service account.

    Example output

    Name:                thanos
    Namespace:           my-project
    Labels:              <none>
    Annotations:         <none>
    Image pull secrets:  thanos-dockercfg-nnwgj
    Mountable secrets:   thanos-dockercfg-nnwgj
    Tokens:              thanos-token-9g4n5 (1)
    Events:              <none>
    (1) Use this token in the trigger authentication.
  4. Create a trigger authentication with the service account token:

    1. Create a YAML file similar to the following:

      apiVersion: keda.sh/v1alpha1
      kind: TriggerAuthentication
      metadata:
        name: keda-trigger-auth-prometheus
      spec:
        secretTargetRef: # (1)
        - parameter: bearerToken # (2)
          name: thanos-token-9g4n5 # (3)
          key: token # (4)
        - parameter: ca
          name: thanos-token-9g4n5
          key: ca.crt
      (1) Specifies that this object uses a secret for authorization.
      (2) Specifies the authentication parameter to supply by using the token.
      (3) Specifies the name of the token to use.
      (4) Specifies the key in the token to use with the specified parameter.
    2. Create the CR object:

      $ oc create -f <file-name>.yaml
  5. Create a role for reading Thanos metrics:

    1. Create a YAML file with the following parameters:

      apiVersion: rbac.authorization.k8s.io/v1
      kind: Role
      metadata:
        name: thanos-metrics-reader
      rules:
      - apiGroups:
        - ""
        resources:
        - pods
        verbs:
        - get
      - apiGroups:
        - metrics.k8s.io
        resources:
        - pods
        - nodes
        verbs:
        - get
        - list
        - watch
    2. Create the CR object:

      $ oc create -f <file-name>.yaml
  6. Create a role binding for reading Thanos metrics:

    1. Create a YAML file similar to the following:

      apiVersion: rbac.authorization.k8s.io/v1
      kind: RoleBinding
      metadata:
        name: thanos-metrics-reader # (1)
        namespace: my-project # (2)
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: Role
        name: thanos-metrics-reader
      subjects:
      - kind: ServiceAccount
        name: thanos # (3)
        namespace: my-project # (4)
      (1) Specifies the name of the role you created.
      (2) Specifies the namespace of the object you want to scale.
      (3) Specifies the name of the service account to bind to the role.
      (4) Specifies the namespace of the object you want to scale.
    2. Create the CR object:

      $ oc create -f <file-name>.yaml

You can now deploy a scaled object or scaled job to enable autoscaling for your application, as described in “Understanding how to add custom metrics autoscalers”. To use OKD monitoring as the source, in the trigger, or scaler, you must include the following parameters:

  • triggers.type must be prometheus

  • triggers.metadata.serverAddress must be https://thanos-querier.openshift-monitoring.svc.cluster.local:9092

  • triggers.metadata.authModes must be bearer

  • triggers.metadata.namespace must be set to the namespace of the object to scale

  • triggers.authenticationRef must point to the trigger authentication resource that you created in this procedure
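
Put together, a scaled object that uses OKD monitoring might look like the following sketch. It reuses the trigger authentication name, query, and threshold from the earlier examples; the deployment name example-app and the replica counts are hypothetical values added for illustration.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: prom-scaledobject
  namespace: my-project
spec:
  scaleTargetRef:
    name: example-app              # hypothetical deployment to scale
  minReplicaCount: 1               # illustrative value
  maxReplicaCount: 10              # illustrative value
  triggers:
  - type: prometheus
    metadata:
      serverAddress: https://thanos-querier.openshift-monitoring.svc.cluster.local:9092
      namespace: my-project        # namespace of the object to scale
      metricName: http_requests_total
      threshold: '5'
      query: sum(rate(http_requests_total{job="test-app"}[1m]))
      authModes: bearer
    authenticationRef:
      name: keda-trigger-auth-prometheus   # trigger authentication created in the procedure
```

Note that authenticationRef is a sibling of metadata within the trigger entry, not a key inside metadata.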

Understanding the CPU trigger

You can scale pods based on CPU metrics. This trigger uses cluster metrics as the source for metrics.

The custom metrics autoscaler scales the pods associated with an object to maintain the CPU usage that you specify. The autoscaler increases or decreases the number of replicas between the minimum and maximum numbers to maintain the specified CPU utilization across all pods. The CPU trigger considers the CPU utilization of the entire pod. If the pod has multiple containers, the CPU trigger considers the total CPU utilization of all containers in the pod.

  • This trigger cannot be used with the ScaledJob custom resource.

  • When using a CPU trigger to scale an object, the object does not scale to 0, even if you are using multiple triggers.

Example scaled object with a CPU target

  apiVersion: keda.sh/v1alpha1
  kind: ScaledObject
  metadata:
    name: cpu-scaledobject
    namespace: my-namespace
  spec:
    # ...
    triggers:
    - type: cpu # (1)
      metricType: Utilization # (2)
      metadata:
        value: '60' # (3)
        containerName: api # (4)
(1) Specifies CPU as the trigger type.
(2) Specifies the type of metric to use, either Utilization or AverageValue.
(3) Specifies the value that triggers scaling. Must be specified as a quoted string value.
  • When using Utilization, the target value is the average of the resource metrics across all relevant pods, represented as a percentage of the requested value of the resource for the pods.

  • When using AverageValue, the target value is the average of the metrics across all relevant pods.

(4) Optional: Specifies an individual container to scale, based on the CPU utilization of only that container, rather than the entire pod. In this example, only the container named api is to be scaled.
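
To contrast the two metric types, the following fragment sketches a CPU trigger that uses AverageValue instead of Utilization. The target of 500m (half a CPU core, averaged across pods) is an assumed value for illustration only.

```yaml
triggers:
- type: cpu
  metricType: AverageValue
  metadata:
    value: '500m'   # assumed target: an average of 0.5 CPU cores per pod
```

With AverageValue, the target is an absolute quantity rather than a percentage of the CPU requests of the pods.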

Understanding the memory trigger

You can scale pods based on memory metrics. This trigger uses cluster metrics as the source for metrics.

The custom metrics autoscaler scales the pods associated with an object to maintain the average memory usage that you specify. The autoscaler increases or decreases the number of replicas between the minimum and maximum numbers to maintain the specified memory utilization across all pods. The memory trigger considers the memory utilization of the entire pod. If the pod has multiple containers, the memory utilization is the sum of all of the containers.

  • This trigger cannot be used with the ScaledJob custom resource.

  • When using a memory trigger to scale an object, the object does not scale to 0, even if you are using multiple triggers.

Example scaled object with a memory target

  apiVersion: keda.sh/v1alpha1
  kind: ScaledObject
  metadata:
    name: memory-scaledobject
    namespace: my-namespace
  spec:
    # ...
    triggers:
    - type: memory # (1)
      metricType: Utilization # (2)
      metadata:
        value: '60' # (3)
        containerName: api # (4)
(1) Specifies memory as the trigger type.
(2) Specifies the type of metric to use, either Utilization or AverageValue.
(3) Specifies the value that triggers scaling. Must be specified as a quoted string value.
  • When using Utilization, the target value is the average of the resource metrics across all relevant pods, represented as a percentage of the requested value of the resource for the pods.

  • When using AverageValue, the target value is the average of the metrics across all relevant pods.

(4) Optional: Specifies an individual container to scale, based on the memory utilization of only that container, rather than the entire pod. In this example, only the container named api is to be scaled.
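
A scaled object can list more than one trigger. The following sketch combines a memory trigger with a CPU trigger; the deployment name example-app, the replica counts, and the target values are hypothetical. Because the memory trigger does not scale an object to 0, as noted above, minReplicaCount is set to 1 here.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cpu-memory-scaledobject   # hypothetical name
  namespace: my-namespace
spec:
  scaleTargetRef:
    name: example-app             # hypothetical deployment to scale
  minReplicaCount: 1              # the object does not scale to 0 with this trigger
  maxReplicaCount: 10             # illustrative value
  triggers:
  - type: memory
    metricType: Utilization
    metadata:
      value: '60'                 # illustrative target, percent of requested memory
  - type: cpu
    metricType: Utilization
    metadata:
      value: '75'                 # illustrative target, percent of requested CPU
```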

Understanding the Kafka trigger

You can scale pods based on an Apache Kafka topic or other services that support the Kafka protocol. The custom metrics autoscaler does not scale higher than the number of Kafka partitions, unless you set the allowIdleConsumers parameter to true in the scaled object or scaled job.

If the number of consumers in a consumer group exceeds the number of partitions in a topic, the extra consumers remain idle. To avoid this, by default the number of replicas does not exceed:

  • The number of partitions on a topic, if a topic is specified

  • The number of partitions of all topics in the consumer group, if no topic is specified

  • The maxReplicaCount specified in the scaled object or scaled job CR

You can use the allowIdleConsumers parameter to disable these default behaviors.
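
To make the default cap concrete, consider the following sketch. It assumes, purely for illustration, that my-topic has 10 partitions; the maxReplicaCount value is likewise hypothetical.

```yaml
spec:
  maxReplicaCount: 20     # illustrative value, higher than the partition count
  triggers:
  - type: kafka
    metadata:
      topic: my-topic     # assumed to have 10 partitions for this example
      bootstrapServers: my-cluster-kafka-bootstrap.openshift-operators.svc:9092
      consumerGroup: my-group
      lagThreshold: '10'
      # allowIdleConsumers is omitted, so the default of false applies
```

Under these assumptions, the autoscaler would stop at 10 replicas even though maxReplicaCount allows 20, because an eleventh consumer would have no partition to read from.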

Example scaled object with a Kafka target

  apiVersion: keda.sh/v1alpha1
  kind: ScaledObject
  metadata:
    name: kafka-scaledobject
    namespace: my-namespace
  spec:
    # ...
    triggers:
    - type: kafka # (1)
      metadata:
        topic: my-topic # (2)
        bootstrapServers: my-cluster-kafka-bootstrap.openshift-operators.svc:9092 # (3)
        consumerGroup: my-group # (4)
        lagThreshold: '10' # (5)
        activationLagThreshold: '5' # (6)
        offsetResetPolicy: latest # (7)
        allowIdleConsumers: true # (8)
        scaleToZeroOnInvalidOffset: false # (9)
        excludePersistentLag: false # (10)
        version: '1.0.0' # (11)
        partitionLimitation: '1,2,10-20,31' # (12)
(1) Specifies Kafka as the trigger type.
(2) Specifies the name of the Kafka topic on which Kafka is processing the offset lag.
(3) Specifies a comma-separated list of Kafka brokers to connect to.
(4) Specifies the name of the Kafka consumer group used for checking the offset on the topic and processing the related lag.
(5) Optional: Specifies the average target value that triggers scaling. Must be specified as a quoted string value. The default is 5.
(6) Optional: Specifies the target value for the activation phase. Must be specified as a quoted string value.
(7) Optional: Specifies the Kafka offset reset policy for the Kafka consumer. The available values are latest and earliest. The default is latest.
(8) Optional: Specifies whether the number of Kafka replicas can exceed the number of partitions on a topic.
  • If true, the number of Kafka replicas can exceed the number of partitions on a topic. This allows for idle Kafka consumers.

  • If false, the number of Kafka replicas cannot exceed the number of partitions on a topic. This is the default.

(9) Optional: Specifies how the trigger behaves when a Kafka partition does not have a valid offset.
  • If true, the consumers are scaled to zero for that partition.

  • If false, the scaler keeps a single consumer for that partition. This is the default.

(10) Optional: Specifies whether the trigger includes or excludes partition lag for partitions whose current offset is the same as the current offset of the previous polling cycle.
  • If true, the scaler excludes partition lag in these partitions.

  • If false, the trigger includes all consumer lag in all partitions. This is the default.

(11) Optional: Specifies the version of your Kafka brokers. Must be specified as a quoted string value. The default is 1.0.0.
(12) Optional: Specifies a comma-separated list of partition IDs to scope the scaling on. If set, only the listed IDs are considered when calculating lag. Must be specified as a quoted string value. The default is to consider all partitions.