Splitting Data for Very Large Tenants with Prometheus Sharding

Prometheus horizontal sharding

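When the monitoring scale is very large and the number of instances to scrape becomes huge, a single Prometheus server comes under too much load. We can use the Prometheus hashmod relabel action to relieve it. With this approach, even with thousands of instances, each Prometheus server only has to scrape a subset of them.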

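In the configuration below, modulus is set to 4, so each Prometheus server scrapes roughly 1/4 of the targets. This particular configuration keeps only the targets whose __tmp_hash is 2 after hashmod. Once the data has been scraped, remote_write can be used to aggregate the data from the 4 Prometheus servers.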

    global:
      external_labels:
        env: prod
        scraper: 2
    scrape_configs:
      - job_name: my_job
        # ...
        relabel_configs:
          # Hash the target address into one of 4 buckets (0 to 3)
          - source_labels: [__address__]
            modulus: 4
            target_label: __tmp_hash
            action: hashmod
          # Keep only the targets that fall into bucket 2
          - source_labels: [__tmp_hash]
            regex: 2
            action: keep
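
To make the effect of the two relabel rules above more concrete, here is a minimal Python sketch of the selection they perform. It assumes that hashmod takes the MD5 digest of the source label value and reduces its low 64 bits modulo modulus, which is my understanding of the Prometheus behaviour and should be checked against the Prometheus source if the exact bucket numbers matter. The target addresses are hypothetical.

```python
import hashlib


def hashmod(value: str, modulus: int) -> int:
    """Assumed hashmod: MD5 the value, read the last 8 bytes as a
    big-endian integer, and reduce it modulo `modulus`."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big") % modulus


# Hypothetical target addresses, for illustration only.
targets = [f"10.0.0.{i}:9100" for i in range(1, 21)]

MODULUS = 4
SHARD = 2  # corresponds to `regex: 2` in the configuration above

# The `keep` rule retains a target only if its bucket equals SHARD.
kept = [t for t in targets if hashmod(t, MODULUS) == SHARD]
print(f"shard {SHARD} keeps {len(kept)} of {len(targets)} targets")
```

Because the bucket depends only on the address, every server that uses the same modulus agrees on which bucket a target belongs to, without any coordination between servers.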

Splitting data across multiple Whizard tenants with horizontal sharding

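In the previous section we saw that Prometheus can use hashmod to split a very large set of instances across multiple Prometheus servers. Queries still need the data in one place, however, so it has to be written through remote_write to Whizard or another backend. Here we use the shard mechanism of Prometheus-Operator and, at the same time, write the data of each shard to a different Whizard tenant.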

Prometheus configuration:

    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      labels:
        app.kubernetes.io/component: prometheus
        app.kubernetes.io/instance: k8s
        app.kubernetes.io/name: prometheus
        app.kubernetes.io/part-of: kube-prometheus
        app.kubernetes.io/version: 2.46.0
      name: k8s
      namespace: monitoring
    spec:
      alerting:
        alertmanagers:
        - apiVersion: v2
          name: alertmanager-main
          namespace: monitoring
          port: web
      enableFeatures: []
      evaluationInterval: 30s
      externalLabels: {}
      image: quay.io/prometheus/prometheus:v2.46.0
      nodeSelector:
        kubernetes.io/os: linux
      podMetadata:
        labels:
          app.kubernetes.io/component: prometheus
          app.kubernetes.io/instance: k8s
          app.kubernetes.io/name: prometheus
          app.kubernetes.io/part-of: kube-prometheus
          app.kubernetes.io/version: 2.46.0
      podMonitorNamespaceSelector: {}
      podMonitorSelector: {}
      portName: web
      probeNamespaceSelector: {}
      probeSelector: {}
      remoteWrite:
      - url: http://172.31.73.196:30990/tenant-$(SHARD)/api/v1/receive # Put the shard variable $(SHARD) in the tenant path so that the data is split across multiple tenants
      replicas: 1
      resources:
        requests:
          memory: 400Mi
      ruleNamespaceSelector: {}
      ruleSelector: {}
      scrapeInterval: 30s
      securityContext:
        fsGroup: 2000
        runAsNonRoot: true
        runAsUser: 1000
      serviceAccountName: prometheus-k8s
      serviceMonitorNamespaceSelector: {}
      serviceMonitorSelector: {}
      shards: 4 # Number of shards: the scrape targets are split across 4 Prometheus instances
      version: 2.46.0
    status:
      availableReplicas: 4
      conditions:
      - lastTransitionTime: "2023-08-17T06:06:15Z"
        observedGeneration: 5
        status: "True"
        type: Available
      - lastTransitionTime: "2023-08-17T06:06:16Z"
        observedGeneration: 5
        status: "True"
        type: Reconciled
      paused: false
      replicas: 4
      shardStatuses:
      - availableReplicas: 1
        replicas: 1
        shardID: "0"
        unavailableReplicas: 0
        updatedReplicas: 1
      - availableReplicas: 1
        replicas: 1
        shardID: "1"
        unavailableReplicas: 0
        updatedReplicas: 1
      - availableReplicas: 1
        replicas: 1
        shardID: "2"
        unavailableReplicas: 0
        updatedReplicas: 1
      - availableReplicas: 1
        replicas: 1
        shardID: "3"
        unavailableReplicas: 0
        updatedReplicas: 1
      unavailableReplicas: 0
      updatedReplicas: 4
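
As a rough illustration of what the remoteWrite URL in this manifest is meant to achieve, the following Python sketch expands the URL once per shard. It assumes that each shard sees its own index substituted for the $(SHARD) placeholder, which is what the comment on the url field implies. The resulting tenant names tenant-0 to tenant-3 follow from that assumption and are not confirmed elsewhere in this document.

```python
# URL template and shard count copied from the manifest above.
URL_TEMPLATE = "http://172.31.73.196:30990/tenant-$(SHARD)/api/v1/receive"
SHARDS = 4


def remote_write_url(template: str, shard: int) -> str:
    """Replace the $(SHARD) placeholder with the index of one shard."""
    return template.replace("$(SHARD)", str(shard))


urls = [remote_write_url(URL_TEMPLATE, shard) for shard in range(SHARDS)]
for shard, url in enumerate(urls):
    print(f"shard {shard} -> {url}")
```

Each shard therefore writes to its own tenant path, so no single Whizard tenant has to ingest the full data volume.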

Prometheus Pod status:

    # Check the Prometheus Pod status
    # kubectl get po -n monitoring -l app.kubernetes.io/component=prometheus
    NAME                       READY   STATUS    RESTARTS   AGE
    prometheus-k8s-0           2/2     Running   0          48m
    prometheus-k8s-shard-1-0   2/2     Running   0          48m
    prometheus-k8s-shard-2-0   2/2     Running   0          48m
    prometheus-k8s-shard-3-0   2/2     Running   0          48m
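
The Pod names in this listing suggest a naming pattern: shard 0 has no shard suffix, while the other shards carry -shard-N. The small Python helper below reproduces that pattern. The function name shard_pod_name is hypothetical, and the pattern is inferred only from the output above, so treat it as a reading aid rather than a documented contract.

```python
def shard_pod_name(prometheus_name: str, shard: int, replica: int = 0) -> str:
    """Build the Pod name for one shard, following the pattern seen in the
    listing above (an inferred pattern, not a documented guarantee)."""
    base = f"prometheus-{prometheus_name}"
    if shard > 0:
        # Shard 0 keeps the plain name; later shards get a -shard-N suffix.
        base += f"-shard-{shard}"
    return f"{base}-{replica}"


names = [shard_pod_name("k8s", shard) for shard in range(4)]
for name in names:
    print(name)
```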

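The shard mechanism of Prometheus-Operator automatically injects hashmod relabeling into the scrape configuration generated from each ServiceMonitor. Below is an excerpt from the configuration file of the Prometheus instance with shard ID 0. It scrapes only the targets whose hashmod result is 0.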

    - source_labels: [__address__]
      separator: ;
      regex: (.*)
      modulus: 4
      target_label: __tmp_hash
      replacement: $1
      action: hashmod
    - source_labels: [__tmp_hash]
      separator: ;
      regex: "0"
      replacement: $1
      action: keep
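
The following Python sketch checks the property that makes this injected configuration safe: if every shard applies the same hashmod rule and keeps a different bucket, each target is scraped by exactly one shard. As before, the MD5-based hashmod is my assumption about how Prometheus computes the bucket, the target addresses are hypothetical, and targets_for_shard is an illustrative helper rather than part of any real API.

```python
import hashlib
import re


def hashmod(value: str, modulus: int) -> int:
    """Assumed hashmod: low 64 bits of the MD5 digest, modulo `modulus`."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big") % modulus


def targets_for_shard(targets, shard, shards):
    """Apply the two rules above: hashmod into __tmp_hash, then keep."""
    keep = re.compile(str(shard))
    kept = []
    for address in targets:
        tmp_hash = str(hashmod(address, shards))
        # Relabel regexes are anchored, so the whole value must match.
        if keep.fullmatch(tmp_hash):
            kept.append(address)
    return kept


# Hypothetical target addresses, for illustration only.
targets = [f"10.0.{i // 250}.{i % 250}:9100" for i in range(1000)]
SHARDS = 4

assignment = {s: targets_for_shard(targets, s, SHARDS) for s in range(SHARDS)}
for shard, kept in assignment.items():
    print(f"shard {shard}: {len(kept)} targets")
```

The per-shard counts are usually close to, but not exactly, one quarter of the total, because the split depends on the hash of each address rather than on a round-robin assignment.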