After you’ve enabled cluster-level monitoring, you can view the metrics data in Rancher. You can also deploy the Prometheus custom metrics adapter and then use the HPA with metrics stored in cluster monitoring.

Deploy Prometheus Custom Metrics Adapter

We are going to use the Prometheus custom metrics adapter, version v0.5.0, which is a good example of a custom metrics server. You must be the cluster owner to execute the following steps.

  • Get the service account that cluster monitoring is using. It is configured in the workload with ID statefulset:cattle-prometheus:prometheus-cluster-monitoring. If you didn’t customize anything, the service account name should be cluster-monitoring.
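If you want to confirm it, a quick check (a sketch; the names assume the default cluster monitoring deployment) is:

    kubectl -n cattle-prometheus get statefulset prometheus-cluster-monitoring \
      -o jsonpath='{.spec.template.spec.serviceAccountName}'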

  • Grant permissions to that service account. You will need two kinds of permission. The first is the extension-apiserver-authentication-reader role in kube-system, so you will need to create a RoleBinding in kube-system. This permission allows the adapter to read the API aggregation configuration from a ConfigMap in kube-system.

    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: custom-metrics-auth-reader
      namespace: kube-system
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: extension-apiserver-authentication-reader
    subjects:
    - kind: ServiceAccount
      name: cluster-monitoring
      namespace: cattle-prometheus

The other is the cluster role system:auth-delegator, so you will need to create a ClusterRoleBinding. This grants the adapter permission to perform subject access reviews.

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: custom-metrics:system:auth-delegator
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: system:auth-delegator
    subjects:
    - kind: ServiceAccount
      name: cluster-monitoring
      namespace: cattle-prometheus
  • Create the configuration for the custom metrics adapter. The following is an example configuration; configuration details are covered in the next section.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: adapter-config
      namespace: cattle-prometheus
    data:
      config.yaml: |
        rules:
        - seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
          seriesFilters: []
          resources:
            overrides:
              namespace:
                resource: namespace
              pod_name:
                resource: pod
          name:
            matches: ^container_(.*)_seconds_total$
            as: ""
          metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
        - seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
          seriesFilters:
          - isNot: ^container_.*_seconds_total$
          resources:
            overrides:
              namespace:
                resource: namespace
              pod_name:
                resource: pod
          name:
            matches: ^container_(.*)_total$
            as: ""
          metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[1m])) by (<<.GroupBy>>)
        - seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
          seriesFilters:
          - isNot: ^container_.*_total$
          resources:
            overrides:
              namespace:
                resource: namespace
              pod_name:
                resource: pod
          name:
            matches: ^container_(.*)$
            as: ""
          metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}) by (<<.GroupBy>>)
        - seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
          seriesFilters:
          - isNot: .*_total$
          resources:
            template: <<.Resource>>
          name:
            matches: ""
            as: ""
          metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
        - seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
          seriesFilters:
          - isNot: .*_seconds_total
          resources:
            template: <<.Resource>>
          name:
            matches: ^(.*)_total$
            as: ""
          metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
        - seriesQuery: '{namespace!="",__name__!~"^container_.*"}'
          seriesFilters: []
          resources:
            template: <<.Resource>>
          name:
            matches: ^(.*)_seconds_total$
            as: ""
          metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
        resourceRules:
          cpu:
            containerQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)
            nodeQuery: sum(rate(container_cpu_usage_seconds_total{<<.LabelMatchers>>, id='/'}[1m])) by (<<.GroupBy>>)
            resources:
              overrides:
                instance:
                  resource: node
                namespace:
                  resource: namespace
                pod_name:
                  resource: pod
            containerLabel: container_name
          memory:
            containerQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>}) by (<<.GroupBy>>)
            nodeQuery: sum(container_memory_working_set_bytes{<<.LabelMatchers>>,id='/'}) by (<<.GroupBy>>)
            resources:
              overrides:
                instance:
                  resource: node
                namespace:
                  resource: namespace
                pod_name:
                  resource: pod
            containerLabel: container_name
          window: 1m
  • Create HTTPS TLS certs for your API server. You can use the following command to create a self-signed cert.
    openssl req -new -newkey rsa:4096 -x509 -sha256 -days 365 -nodes -out serving.crt -keyout serving.key -subj "/C=CN/CN=custom-metrics-apiserver.cattle-prometheus.svc.cluster.local"
    # You will find serving.crt and serving.key in your current path.
    # Then create a secret from them in the cattle-prometheus namespace.
    kubectl create secret generic -n cattle-prometheus cm-adapter-serving-certs --from-file=serving.key=./serving.key --from-file=serving.crt=./serving.crt
  • Then you can create the Prometheus custom metrics adapter, along with a Service for the deployment. You can create both via Import YAML in Rancher. Please create these resources in the cattle-prometheus namespace.

Here is the Prometheus custom metrics adapter deployment.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: custom-metrics-apiserver
      name: custom-metrics-apiserver
      namespace: cattle-prometheus
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: custom-metrics-apiserver
      template:
        metadata:
          labels:
            app: custom-metrics-apiserver
          name: custom-metrics-apiserver
        spec:
          serviceAccountName: cluster-monitoring
          containers:
          - name: custom-metrics-apiserver
            image: directxman12/k8s-prometheus-adapter-amd64:v0.5.0
            args:
            - --secure-port=6443
            - --tls-cert-file=/var/run/serving-cert/serving.crt
            - --tls-private-key-file=/var/run/serving-cert/serving.key
            - --logtostderr=true
            - --prometheus-url=http://prometheus-operated/
            - --metrics-relist-interval=1m
            - --v=10
            - --config=/etc/adapter/config.yaml
            ports:
            - containerPort: 6443
            volumeMounts:
            - mountPath: /var/run/serving-cert
              name: volume-serving-cert
              readOnly: true
            - mountPath: /etc/adapter/
              name: config
              readOnly: true
            - mountPath: /tmp
              name: tmp-vol
          volumes:
          - name: volume-serving-cert
            secret:
              secretName: cm-adapter-serving-certs
          - name: config
            configMap:
              name: adapter-config
          - name: tmp-vol
            emptyDir: {}

Here is the Service for the deployment.

    apiVersion: v1
    kind: Service
    metadata:
      name: custom-metrics-apiserver
      namespace: cattle-prometheus
    spec:
      ports:
      - port: 443
        targetPort: 6443
      selector:
        app: custom-metrics-apiserver
  • Create an APIService for your custom metrics server.
    apiVersion: apiregistration.k8s.io/v1beta1
    kind: APIService
    metadata:
      name: v1beta1.custom.metrics.k8s.io
    spec:
      service:
        name: custom-metrics-apiserver
        namespace: cattle-prometheus
      group: custom.metrics.k8s.io
      version: v1beta1
      insecureSkipTLSVerify: true
      groupPriorityMinimum: 100
      versionPriority: 100
  • Then you can verify your custom metrics server with kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1. If the API returns data, the metrics server has been set up successfully.
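As a quick check, the following commands list the available custom metrics and fetch one of them for every pod in a namespace (the namespace and metric name here are only examples):

    kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
    kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/memory_usage_bytes"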

  • You can create an HPA with custom metrics now. Here is an example HPA. You will need to create an nginx deployment in your namespace first (a minimal example Deployment is sketched after the HPA manifest below).

    kind: HorizontalPodAutoscaler
    apiVersion: autoscaling/v2beta1
    metadata:
      name: nginx
    spec:
      scaleTargetRef:
        # point the HPA at the nginx deployment you just created
        apiVersion: apps/v1
        kind: Deployment
        name: nginx
      # autoscale between 1 and 10 replicas
      minReplicas: 1
      maxReplicas: 10
      metrics:
      # use a "Pods" metric, which takes the average of the
      # given metric across all pods controlled by the autoscaling target
      - type: Pods
        pods:
          metricName: memory_usage_bytes
          targetAverageValue: 5000000
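
If you don’t already have an nginx Deployment to target, a minimal sketch like the following would do (the image tag and labels are only illustrative):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.17
            ports:
            - containerPort: 80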

You should then see your nginx deployment scaling up, which means HPA with custom metrics is working.
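To watch the autoscaler react, you can inspect its status with standard kubectl commands (the namespace is just an example):

    kubectl -n default get hpa nginx
    kubectl -n default describe hpa nginx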

Configuration of the Prometheus Custom Metrics Adapter

Refer to https://github.com/DirectXMan12/k8s-prometheus-adapter/blob/master/docs/config.md

The adapter determines which metrics to expose, and how to expose them, through a set of “discovery” rules. Each rule is executed independently (so make sure that your rules are mutually exclusive), and specifies each of the steps the adapter needs to take to expose a metric in the API.

Each rule can be broken down into roughly four parts:

  • Discovery, which specifies how the adapter should find all Prometheus metrics for this rule.

  • Association, which specifies how the adapter should determine which Kubernetes resources a particular metric is associated with.

  • Naming, which specifies how the adapter should expose the metric in the custom metrics API.

  • Querying, which specifies how a request for a particular metric on one or more Kubernetes objects should be turned into a query to Prometheus.

A basic config with one rule might look like:

    rules:
    # this rule matches cumulative cAdvisor metrics measured in seconds
    - seriesQuery: '{__name__=~"^container_.*",container_name!="POD",namespace!="",pod_name!=""}'
      resources:
        # skip specifying generic resource<->label mappings, and just
        # attach only pod and namespace resources by mapping label names to group-resources
        overrides:
          namespace: {resource: "namespace"}
          pod_name: {resource: "pod"}
      # specify that the `container_` and `_seconds_total` suffixes should be removed.
      # this also introduces an implicit filter on metric family names
      name:
        # we use the value of the capture group implicitly as the API name
        # we could also explicitly write `as: "$1"`
        matches: "^container_(.*)_seconds_total$"
      # specify how to construct a query to fetch samples for a given series
      # This is a Go template where the `.Series` and `.LabelMatchers` string values
      # are available, and the delimiters are `<<` and `>>` to avoid conflicts with
      # the prometheus query language
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[2m])) by (<<.GroupBy>>)

Discovery

Discovery governs the process of finding the metrics that you want to expose in the custom metrics API. There are two fields that factor into discovery: seriesQuery and seriesFilters.

seriesQuery specifies a Prometheus series query (as passed to the /api/v1/series endpoint in Prometheus) to use to find some set of Prometheus series. The adapter will strip the label values from these series, and then use the resulting metric-name-and-label-names combinations later on.

In many cases, seriesQuery will be sufficient to narrow down the list of Prometheus series. However, sometimes (especially if two rules might otherwise overlap), it’s useful to do additional filtering on metric names. In this case, seriesFilters can be used. After the list of series is returned from seriesQuery, each series has its metric name filtered through any specified filters.

Filters may be either:

  • is: <regex>, which matches any series whose name matches the specified regex.

  • isNot: <regex>, which matches any series whose name does not match the specified regex.

For example:

    # match all cAdvisor metrics that aren't measured in seconds
    seriesQuery: '{__name__=~"^container_.*_total",container_name!="POD",namespace!="",pod_name!=""}'
    seriesFilters:
    - isNot: "^container_.*_seconds_total"

Association

Association governs the process of figuring out which Kubernetes resources a particular metric could be attached to. The resources field controls this process.

There are two ways to associate resources with a particular metric. In both cases, the value of the label becomes the name of the particular object.

One way is to specify that any label name that matches some particular pattern refers to some group-resource based on the label name. This can be done using the template field. The pattern is specified as a Go template, with the Group and Resource fields representing the group and resource. You don’t necessarily have to use the Group field; if you omit it, the group is guessed by the system. For instance:

    # any label `kube_<group>_<resource>` becomes <group>.<resource> in Kubernetes
    resources:
      template: "kube_<<.Group>>_<<.Resource>>"

The other way is to specify that some particular label represents some particular Kubernetes resource. This can be done using the overrides field. Each override maps a Prometheus label to a Kubernetes group-resource. For instance:

    # the microservice label corresponds to the apps.deployment resource
    resources:
      overrides:
        microservice: {group: "apps", resource: "deployment"}

These two can be combined, so you can specify both a template and some individual overrides.
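For instance, a resources block combining both approaches (the label names here are only illustrative) might look like:

    resources:
      template: "kube_<<.Group>>_<<.Resource>>"
      overrides:
        microservice: {group: "apps", resource: "deployment"}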

The resources mentioned can be any resource available in your kubernetes cluster, as long as you’ve got a corresponding label.

Naming

Naming governs the process of converting a Prometheus metric name into a metric in the custom metrics API, and vice versa. It’s controlled by the name field.

Naming is controlled by specifying a pattern to extract an API name from a Prometheus name, and potentially a transformation on that extracted value.

The pattern is specified in the matches field, and is just a regular expression. If not specified, it defaults to .*.

The transformation is specified by the as field. You can use any capture groups defined in the matches field. If the matches field doesn’t contain capture groups, the as field defaults to $0. If it contains a single capture group, the as field defaults to $1. Otherwise, it’s an error not to specify the as field.

For example:

    # turn any name <name>_total into <name>_per_second
    # e.g. http_requests_total becomes http_requests_per_second
    name:
      matches: "^(.*)_total$"
      as: "${1}_per_second"

Querying

Querying governs the process of actually fetching values for a particular metric. It’s controlled by the metricsQuery field.

The metricsQuery field is a Go template that gets turned into a Prometheus query, using input from a particular call to the custom metrics API. A given call to the custom metrics API is distilled down to a metric name, a group-resource, and one or more objects of that group-resource. These get turned into the following fields in the template:

  • Series: the metric name
  • LabelMatchers: a comma-separated list of label matchers matching the given objects. Currently, this is the label for the particular group-resource, plus the label for namespace, if the group-resource is namespaced.
  • GroupBy: a comma-separated list of labels to group by. Currently, this contains the group-resource label used in LabelMatchers.

For instance, suppose we had a series http_requests_total (exposed as http_requests_per_second in the API) with labels service, pod, ingress, namespace, and verb. The first four correspond to Kubernetes resources. Then, if someone requested the metric pods/http_requests_per_second for the pods pod1 and pod2 in the somens namespace, we’d have:

  • Series: "http_requests_total"
  • LabelMatchers: pod=~"pod1|pod2",namespace="somens"
  • GroupBy: pod
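
Substituted into a simple sum/rate metricsQuery template (one without the cAdvisor-specific container_name filter shown elsewhere in this section), that request would produce a Prometheus query roughly like:

    sum(rate(http_requests_total{pod=~"pod1|pod2",namespace="somens"}[2m])) by (pod)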

Additionally, there are two advanced fields that are “raw” forms of other fields:

  • LabelValuesByName: a map mapping the labels and values from the LabelMatchers field. The values are pre-joined by | (for use with the =~ matcher in Prometheus).
  • GroupBySlice: the slice form of GroupBy.

In general, you’ll probably want to use the Series, LabelMatchers, and GroupBy fields. The other two are for advanced usage.

The query is expected to return one value for each object requested. The adapter will use the labels on the returned series to associate a given series back to its corresponding object.

For example:

    # convert cumulative cAdvisor metrics into rates calculated over 2 minutes
    metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>,container_name!="POD"}[2m])) by (<<.GroupBy>>)