This guide demonstrates how to configure circuit breaking for destinations that are external to an OSM-managed service mesh.

Prerequisites

  • A Kubernetes cluster running v1.22.9 or later.
  • OSM installed.
  • kubectl available to interact with the API server.
  • The osm CLI installed for managing the service mesh.
  • OSM version v1.1.0 or later.

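Demo

The following demo shows a load-testing client, fortio, sending traffic to the httpbin service, which is external to the service mesh. Traffic sent to destinations outside the mesh is considered Egress traffic and is authorized by an Egress traffic policy. We will see how the fortio client is affected once the circuit breaker configured for the external httpbin service trips.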

  1. Deploy the httpbin service into the httpbin namespace. The httpbin service runs on port 14001 and is not added to the mesh, so it is treated as a destination external to the mesh.

    # Create the httpbin namespace
    kubectl create namespace httpbin
    # Deploy httpbin service in the httpbin namespace
    kubectl apply -f https://raw.githubusercontent.com/openservicemesh/osm-docs/release-v1.2/manifests/samples/httpbin/httpbin.yaml -n httpbin

    Confirm the httpbin service and pod are up and running.

    $ kubectl get svc -n httpbin
    NAME      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)     AGE
    httpbin   ClusterIP   10.96.198.23   <none>        14001/TCP   20s

    $ kubectl get pods -n httpbin
    NAME                     READY   STATUS    RESTARTS   AGE
    httpbin-5b8b94b9-lt2vs   1/1     Running   0          20s
  2. Deploy the fortio load-testing client into the client namespace, after adding that namespace to the mesh.

    # Create the client namespace
    kubectl create namespace client
    # Add the namespace to the mesh
    osm namespace add client
    # Deploy fortio client in the client namespace
    kubectl apply -f https://raw.githubusercontent.com/openservicemesh/osm-docs/release-v1.2/manifests/samples/fortio/fortio.yaml -n client

    Confirm the fortio client pod is up and running.

    $ kubectl get pods -n client
    NAME                      READY   STATUS    RESTARTS   AGE
    fortio-6477f8495f-bj4s9   2/2     Running   0          19s
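The readiness checks in steps 1 and 2 could be scripted instead of inspecting `kubectl get pods` by hand. The sketch below is an illustration, not part of the original demo: `wait_ready` is a hypothetical helper name and the 60s default timeout is an arbitrary assumption. It relies on `kubectl wait`, which blocks until the pods report the Ready condition or the timeout expires.

```shell
# Hypothetical helper: block until all pods in a namespace are Ready.
# $1 = namespace, $2 = optional timeout (defaults to 60s, an arbitrary choice).
wait_ready() {
  kubectl wait --for=condition=Ready pod --all -n "$1" --timeout="${2:-60s}"
}
```

For this demo it would be called as `wait_ready httpbin && wait_ready client` before moving on to the Egress policy.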
  3. Apply an Egress policy that allows the fortio client in the client namespace to communicate with the external httpbin service. HTTP requests will be sent to the host httpbin.httpbin.svc.cluster.local on port 14001.

    kubectl apply -f - <<EOF
    kind: Egress
    apiVersion: policy.openservicemesh.io/v1alpha1
    metadata:
      name: httpbin-external
      namespace: client
    spec:
      sources:
      - kind: ServiceAccount
        name: default
        namespace: client
      hosts:
      - httpbin.httpbin.svc.cluster.local
      ports:
      - number: 14001
        protocol: http
    EOF
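  4. Confirm the fortio client can successfully send HTTP requests to the external host httpbin.httpbin.svc.cluster.local on port 14001. We call the external service with 5 concurrent connections (-c 5) and send 50 requests (-n 50).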

    $ export fortio_pod="$(kubectl get pod -n client -l app=fortio -o jsonpath='{.items[0].metadata.name}')"
    $ kubectl exec "$fortio_pod" -c fortio -n client -- /usr/bin/fortio load -c 5 -qps 0 -n 50 -loglevel Warning http://httpbin.httpbin.svc.cluster.local:14001/get
    19:56:34 I logger.go:127> Log level is now 3 Warning (was 2 Info)
    Fortio 1.17.1 running at 0 queries per second, 8->8 procs, for 50 calls: http://httpbin.httpbin.svc.cluster.local:14001/get
    Starting at max qps with 5 thread(s) [gomax 8] for exactly 50 calls (10 per thread + 0)
    Ended after 36.3659ms : 50 calls. qps=1374.9
    Aggregated Function Time : count 50 avg 0.003374618 +/- 0.0007546 min 0.0013124 max 0.0066215 sum 0.1687309
    # range, mid point, percentile, count
    >= 0.0013124 <= 0.002 , 0.0016562 , 4.00, 2
    > 0.002 <= 0.003 , 0.0025 , 10.00, 3
    > 0.003 <= 0.004 , 0.0035 , 86.00, 38
    > 0.004 <= 0.005 , 0.0045 , 98.00, 6
    > 0.006 <= 0.0066215 , 0.00631075 , 100.00, 1
    # target 50% 0.00352632
    # target 75% 0.00385526
    # target 90% 0.00433333
    # target 99% 0.00631075
    # target 99.9% 0.00659043
    Sockets used: 5 (for perfect keepalive, would be 5)
    Jitter: false
    Code 200 : 50 (100.0 %)
    Response Header Sizes : count 50 avg 230 +/- 0 min 230 max 230 sum 11500
    Response Body/Total Sizes : count 50 avg 460 +/- 0 min 460 max 460 sum 23000
    All done 50 calls (plus 0 warmup) 3.375 ms avg, 1374.9 qps

    As seen above, all 50 requests succeeded.

    Code 200 : 50 (100.0 %)
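The "all requests succeeded" check above can also be automated. The following is a minimal sketch, assuming the fortio summary keeps the `Code NNN : COUNT (PCT %)` line format shown in this guide; `fortio_all_ok` is a hypothetical helper name, not a fortio or OSM feature. It reads fortio output on standard input and succeeds only when at least one status line is present and every reported code is 200.

```shell
# Hypothetical helper: read fortio output on stdin and succeed only if
# at least one "Code NNN : COUNT" line is present and every code is 200.
fortio_all_ok() {
  awk '
    $1 == "Code" && $3 == ":" { seen = 1; if ($2 != 200) bad = 1 }
    END { if (seen && !bad) exit 0; exit 1 }
  '
}
```

One way to use it is to pipe the `kubectl exec ... fortio load ...` command from this step into `fortio_all_ok`. If your fortio version writes its summary to standard error, redirect it with `2>&1` before the pipe.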
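  5. Use an UpstreamTrafficSetting resource to apply a circuit breaker to traffic directed to the external host httpbin.httpbin.svc.cluster.local, limiting the maximum number of concurrent connections and requests to 1. When an UpstreamTrafficSetting is applied to external (egress) traffic, it must also be referenced as a match in the Egress configuration, and it must belong to the same namespace as the matching Egress resource. This is required for circuit breaking limits to be enforced on external traffic, so we also update the previously applied Egress configuration to add a matches field.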

    kubectl apply -f - <<EOF
    apiVersion: policy.openservicemesh.io/v1alpha1
    kind: UpstreamTrafficSetting
    metadata:
      name: httpbin-external
      namespace: client
    spec:
      host: httpbin.httpbin.svc.cluster.local
      connectionSettings:
        tcp:
          maxConnections: 1
        http:
          maxPendingRequests: 1
          maxRequestsPerConnection: 1
    ---
    kind: Egress
    apiVersion: policy.openservicemesh.io/v1alpha1
    metadata:
      name: httpbin-external
      namespace: client
    spec:
      sources:
      - kind: ServiceAccount
        name: default
        namespace: client
      hosts:
      - httpbin.httpbin.svc.cluster.local
      ports:
      - number: 14001
        protocol: http
      matches:
      - apiGroup: policy.openservicemesh.io/v1alpha1
        kind: UpstreamTrafficSetting
        name: httpbin-external
    EOF
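  6. Confirm that, because of the connection-level and request-level circuit breaking limits configured above, the fortio client can no longer complete as many successful requests as before.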

    $ kubectl exec "$fortio_pod" -c fortio -n client -- /usr/bin/fortio load -c 5 -qps 0 -n 50 -loglevel Warning http://httpbin.httpbin.svc.cluster.local:14001/get
    19:58:48 I logger.go:127> Log level is now 3 Warning (was 2 Info)
    Fortio 1.17.1 running at 0 queries per second, 8->8 procs, for 50 calls: http://httpbin.httpbin.svc.cluster.local:14001/get
    Starting at max qps with 5 thread(s) [gomax 8] for exactly 50 calls (10 per thread + 0)
    19:58:48 W http_client.go:806> [0] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [1] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [1] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [4] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [3] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [3] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [3] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [2] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [3] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [1] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [3] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [1] Non ok http code 503 (HTTP/1.1 503)
    19:58:48 W http_client.go:806> [4] Non ok http code 503 (HTTP/1.1 503)
    Ended after 33.1549ms : 50 calls. qps=1508.1
    Aggregated Function Time : count 50 avg 0.002467842 +/- 0.001827 min 0.0003724 max 0.0067697 sum 0.1233921
    # range, mid point, percentile, count
    >= 0.0003724 <= 0.001 , 0.0006862 , 34.00, 17
    > 0.001 <= 0.002 , 0.0015 , 50.00, 8
    > 0.002 <= 0.003 , 0.0025 , 60.00, 5
    > 0.003 <= 0.004 , 0.0035 , 84.00, 12
    > 0.004 <= 0.005 , 0.0045 , 88.00, 2
    > 0.005 <= 0.006 , 0.0055 , 92.00, 2
    > 0.006 <= 0.0067697 , 0.00638485 , 100.00, 4
    # target 50% 0.002
    # target 75% 0.003625
    # target 90% 0.0055
    # target 99% 0.00667349
    # target 99.9% 0.00676008
    Sockets used: 25 (for perfect keepalive, would be 5)
    Jitter: false
    Code 200 : 29 (58.0 %)
    Code 503 : 21 (42.0 %)
    Response Header Sizes : count 50 avg 133.4 +/- 113.5 min 0 max 230 sum 6670
    Response Body/Total Sizes : count 50 avg 368.02 +/- 108.1 min 241 max 460 sum 18401
    All done 50 calls (plus 0 warmup) 2.468 ms avg, 1508.1 qps

    As seen above, only 58% of the requests succeeded. The remaining 42% failed with HTTP 503 because the circuit breaker was open.

    Code 200 : 29 (58.0 %)
    Code 503 : 21 (42.0 %)
  7. Examine the Envoy sidecar stats to see the counters for requests that tripped the circuit breaker.

    $ osm proxy get stats "$fortio_pod" -n client | grep 'httpbin.*pending'
    cluster.httpbin_httpbin_svc_cluster_local_14001.circuit_breakers.default.remaining_pending: 1
    cluster.httpbin_httpbin_svc_cluster_local_14001.circuit_breakers.default.rq_pending_open: 0
    cluster.httpbin_httpbin_svc_cluster_local_14001.circuit_breakers.high.rq_pending_open: 0
    cluster.httpbin_httpbin_svc_cluster_local_14001.upstream_rq_pending_active: 0
    cluster.httpbin_httpbin_svc_cluster_local_14001.upstream_rq_pending_failure_eject: 0
    cluster.httpbin_httpbin_svc_cluster_local_14001.upstream_rq_pending_overflow: 21
    cluster.httpbin_httpbin_svc_cluster_local_14001.upstream_rq_pending_total: 29

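    cluster.httpbin_httpbin_svc_cluster_local_14001.upstream_rq_pending_overflow: 21 indicates that 21 requests were rejected by the circuit breaker, which matches the number of failed requests seen in the previous step: Code 503 : 21 (42.0 %). Likewise, upstream_rq_pending_total: 29 matches the 29 requests that succeeded.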
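The comparison between the 503 count and the overflow counter can be checked mechanically. The sketch below assumes the fortio output from step 6 and the filtered stats from step 7 were each saved to a file; `overflow_matches_503` and both file arguments are hypothetical names introduced for illustration. It also assumes the stats file contains a single `upstream_rq_pending_overflow` line, as the grep in step 7 produces here. Envoy counters accumulate for the lifetime of the proxy, so the numbers are only expected to agree after a single load run.

```shell
# Hypothetical helper: compare the 503 count in saved fortio output ($1)
# with upstream_rq_pending_overflow in saved Envoy stats ($2).
# Prints both numbers and returns 0 only when they are equal.
overflow_matches_503() {
  failed=$(awk '$1 == "Code" && $2 == 503 { print $4 }' "$1")
  overflow=$(awk -F': ' '/upstream_rq_pending_overflow/ { print $2 }' "$2")
  echo "503 responses: ${failed:-0}, pending overflow: ${overflow:-0}"
  [ "${failed:-0}" -eq "${overflow:-0}" ]
}
```

With the outputs shown in this guide, the helper would report 21 for both values and return success.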