This guide demonstrates how to configure rate limiting for L4 TCP connections destined to a target host that is part of an OSM managed service mesh.

Prerequisites

  • Kubernetes cluster running Kubernetes v1.22.9 or greater.
  • OSM version >= v1.2.0 installed.
  • kubectl available to interact with the API server.
  • osm CLI available to manage the service mesh.
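
Before starting, you can quickly sanity-check these prerequisites from your shell. This is a minimal check, assuming the osm and kubectl binaries are already on your PATH:

    # Print the osm CLI version; it should report v1.2.0 or newer
    osm version

    # Print the kubectl client and Kubernetes server versions
    kubectl version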

Demo

The following demo shows a client (fortio-client) sending TCP traffic to the fortio TCP echo service. The fortio service echoes TCP messages back to the client. We will see the impact of applying local TCP rate limiting policies targeting the fortio service to control the throughput of traffic destined to the service backend.

  1. For simplicity, enable permissive traffic policy mode so that explicit SMI traffic access policies are not required for application connectivity within the mesh.

      export osm_namespace=osm-system # Replace osm-system with the namespace where OSM is installed
      kubectl patch meshconfig osm-mesh-config -n "$osm_namespace" -p '{"spec":{"traffic":{"enablePermissiveTrafficPolicyMode":true}}}' --type=merge
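
    Optionally, verify the setting took effect by reading the field back from the MeshConfig patched above. This is a quick check, not required for the demo:

      kubectl get meshconfig osm-mesh-config -n "$osm_namespace" -o jsonpath='{.spec.traffic.enablePermissiveTrafficPolicyMode}'
      # Expected output: true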
  2. Deploy the fortio TCP echo service in the demo namespace, after enrolling the namespace to the mesh. The fortio TCP echo service runs on port 8078.

      # Create the demo namespace
      kubectl create namespace demo
      # Add the namespace to the mesh
      osm namespace add demo
      # Deploy fortio TCP echo in the demo namespace
      kubectl apply -f https://raw.githubusercontent.com/openservicemesh/osm-docs/release-v1.2/manifests/samples/fortio/fortio.yaml -n demo

    Confirm the fortio service pod is up and running.

      $ kubectl get pods -n demo
      NAME                     READY   STATUS    RESTARTS   AGE
      fortio-c4bd7857f-7mm6w   2/2     Running   0          22m
  3. Deploy the fortio-client app in the demo namespace. We will use this client to send TCP traffic to the fortio TCP echo service deployed previously.

      kubectl apply -f https://raw.githubusercontent.com/openservicemesh/osm-docs/release-v1.2/manifests/samples/fortio/fortio-client.yaml -n demo

    Confirm the fortio-client pod is up and running.

      $ kubectl get pods -n demo
      NAME                            READY   STATUS    RESTARTS   AGE
      fortio-client-b9b7bbfb8-prq7r   2/2     Running   0          7s
  4. Confirm the fortio-client app is able to successfully make TCP connections and send data to the fortio TCP echo service on port 8078. We call the fortio service with 3 concurrent connections (-c 3) and send 10 calls (-n 10).

      $ fortio_client="$(kubectl get pod -n demo -l app=fortio-client -o jsonpath='{.items[0].metadata.name}')"
      $ kubectl exec "$fortio_client" -n demo -c fortio-client -- fortio load -qps -1 -c 3 -n 10 tcp://fortio.demo.svc.cluster.local:8078
      Fortio 1.32.3 running at -1 queries per second, 8->8 procs, for 10 calls: tcp://fortio.demo.svc.cluster.local:8078
      20:41:47 I tcprunner.go:238> Starting tcp test for tcp://fortio.demo.svc.cluster.local:8078 with 3 threads at -1.0 qps
      Starting at max qps with 3 thread(s) [gomax 8] for exactly 10 calls (3 per thread + 1)
      20:41:47 I periodic.go:723> T001 ended after 34.0563ms : 3 calls. qps=88.0894283876992
      20:41:47 I periodic.go:723> T000 ended after 35.3117ms : 4 calls. qps=113.2769025563765
      20:41:47 I periodic.go:723> T002 ended after 44.0273ms : 3 calls. qps=68.13954069406937
      Ended after 44.2097ms : 10 calls. qps=226.19
      Aggregated Function Time : count 10 avg 0.01096615 +/- 0.01386 min 0.001588 max 0.0386716 sum 0.1096615
      # range, mid point, percentile, count
      >= 0.001588 <= 0.002 , 0.001794 , 40.00, 4
      > 0.002 <= 0.003 , 0.0025 , 60.00, 2
      > 0.003 <= 0.004 , 0.0035 , 70.00, 1
      > 0.025 <= 0.03 , 0.0275 , 90.00, 2
      > 0.035 <= 0.0386716 , 0.0368358 , 100.00, 1
      # target 50% 0.0025
      # target 75% 0.02625
      # target 90% 0.03
      # target 99% 0.0383044
      # target 99.9% 0.0386349
      Error cases : no data
      Sockets used: 3 (for perfect no error run, would be 3)
      Total Bytes sent: 240, received: 240
      tcp OK : 10 (100.0 %)
      All done 10 calls (plus 0 warmup) 10.966 ms avg, 226.2 qps

    As seen above, all the TCP connections from the fortio-client pod succeeded.

      Total Bytes sent: 240, received: 240
      tcp OK : 10 (100.0 %)
      All done 10 calls (plus 0 warmup) 10.966 ms avg, 226.2 qps
  5. Next, apply a local rate limiting policy to limit L4 TCP connections to the fortio.demo.svc.cluster.local service to 1 connection per minute.

      kubectl apply -f - <<EOF
      apiVersion: policy.openservicemesh.io/v1alpha1
      kind: UpstreamTrafficSetting
      metadata:
        name: tcp-rate-limit
        namespace: demo
      spec:
        host: fortio.demo.svc.cluster.local
        rateLimit:
          local:
            tcp:
              connections: 1
              unit: minute
      EOF
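
    Optionally, confirm the UpstreamTrafficSetting resource was created with the intended limit. This is a quick read-back of the resource applied above:

      # The tcp section of the spec should show connections: 1 and unit: minute
      kubectl get upstreamtrafficsetting tcp-rate-limit -n demo -o yaml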

    通过检查 fortio 后端 pod 上的统计信息,确认没有流量受到速率限制。

      $ fortio_server="$(kubectl get pod -n demo -l app=fortio -o jsonpath='{.items[0].metadata.name}')"
      $ osm proxy get stats "$fortio_server" -n demo | grep 'fortio.*8078.*rate_limit'
      local_rate_limit.inbound_demo/fortio_8078_tcp.rate_limited: 0
  6. Confirm the TCP connections are rate limited.

      $ kubectl exec "$fortio_client" -n demo -c fortio-client -- fortio load -qps -1 -c 3 -n 10 tcp://fortio.demo.svc.cluster.local:8078
      Fortio 1.32.3 running at -1 queries per second, 8->8 procs, for 10 calls: tcp://fortio.demo.svc.cluster.local:8078
      20:49:38 I tcprunner.go:238> Starting tcp test for tcp://fortio.demo.svc.cluster.local:8078 with 3 threads at -1.0 qps
      Starting at max qps with 3 thread(s) [gomax 8] for exactly 10 calls (3 per thread + 1)
      20:49:38 E tcprunner.go:203> [2] Unable to read: read tcp 10.244.1.19:59244->10.96.83.254:8078: read: connection reset by peer
      20:49:38 E tcprunner.go:203> [0] Unable to read: read tcp 10.244.1.19:59246->10.96.83.254:8078: read: connection reset by peer
      20:49:38 E tcprunner.go:203> [2] Unable to read: read tcp 10.244.1.19:59258->10.96.83.254:8078: read: connection reset by peer
      20:49:38 E tcprunner.go:203> [0] Unable to read: read tcp 10.244.1.19:59260->10.96.83.254:8078: read: connection reset by peer
      20:49:38 E tcprunner.go:203> [2] Unable to read: read tcp 10.244.1.19:59266->10.96.83.254:8078: read: connection reset by peer
      20:49:38 I periodic.go:723> T002 ended after 9.643ms : 3 calls. qps=311.1065021258944
      20:49:38 E tcprunner.go:203> [0] Unable to read: read tcp 10.244.1.19:59268->10.96.83.254:8078: read: connection reset by peer
      20:49:38 E tcprunner.go:203> [0] Unable to read: read tcp 10.244.1.19:59274->10.96.83.254:8078: read: connection reset by peer
      20:49:38 I periodic.go:723> T000 ended after 14.8212ms : 4 calls. qps=269.8836801338623
      20:49:38 I periodic.go:723> T001 ended after 20.3458ms : 3 calls. qps=147.45057948077735
      Ended after 20.5468ms : 10 calls. qps=486.69
      Aggregated Function Time : count 10 avg 0.00438853 +/- 0.004332 min 0.0014184 max 0.0170216 sum 0.0438853
      # range, mid point, percentile, count
      >= 0.0014184 <= 0.002 , 0.0017092 , 20.00, 2
      > 0.002 <= 0.003 , 0.0025 , 50.00, 3
      > 0.003 <= 0.004 , 0.0035 , 70.00, 2
      > 0.004 <= 0.005 , 0.0045 , 90.00, 2
      > 0.016 <= 0.0170216 , 0.0165108 , 100.00, 1
      # target 50% 0.003
      # target 75% 0.00425
      # target 90% 0.005
      # target 99% 0.0169194
      # target 99.9% 0.0170114
      Error cases : count 7 avg 0.0034268714 +/- 0.0007688 min 0.0024396 max 0.0047932 sum 0.0239881
      # range, mid point, percentile, count
      >= 0.0024396 <= 0.003 , 0.0027198 , 42.86, 3
      > 0.003 <= 0.004 , 0.0035 , 71.43, 2
      > 0.004 <= 0.0047932 , 0.0043966 , 100.00, 2
      # target 50% 0.00325
      # target 75% 0.00409915
      # target 90% 0.00451558
      # target 99% 0.00476544
      # target 99.9% 0.00479042
      Sockets used: 8 (for perfect no error run, would be 3)
      Total Bytes sent: 240, received: 72
      tcp OK : 3 (30.0 %)
      tcp short read : 7 (70.0 %)
      All done 10 calls (plus 0 warmup) 4.389 ms avg, 486.7 qps

    As seen above, only 30% of the 10 calls succeeded, while the remaining 70% were rate limited. This is because we applied a rate limiting policy of 1 connection per minute on the fortio backend service, and the fortio-client was able to use that single connection to make 3 of the 10 calls, resulting in a 30% success rate.

    Examine the sidecar stats to further confirm this.

      $ osm proxy get stats "$fortio_server" -n demo | grep 'fortio.*8078.*rate_limit'
      local_rate_limit.inbound_demo/fortio_8078_tcp.rate_limited: 7
  7. Next, let's update our rate limiting policy to allow a burst of connections. Bursts allow a given number of connections over the baseline rate of 1 connection per minute defined by our rate limiting policy.

      kubectl apply -f - <<EOF
      apiVersion: policy.openservicemesh.io/v1alpha1
      kind: UpstreamTrafficSetting
      metadata:
        name: tcp-rate-limit
        namespace: demo
      spec:
        host: fortio.demo.svc.cluster.local
        rateLimit:
          local:
            tcp:
              connections: 1
              unit: minute
              burst: 10
      EOF
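
    Optionally, read back the burst field to confirm the updated spec. The field path below follows the spec applied above:

      kubectl get upstreamtrafficsetting tcp-rate-limit -n demo -o jsonpath='{.spec.rateLimit.local.tcp.burst}'
      # Expected output: 10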
  8. Confirm the burst capability allows a burst of connections within a short window of time.

      $ kubectl exec "$fortio_client" -n demo -c fortio-client -- fortio load -qps -1 -c 3 -n 10 tcp://fortio.demo.svc.cluster.local:8078
      Fortio 1.32.3 running at -1 queries per second, 8->8 procs, for 10 calls: tcp://fortio.demo.svc.cluster.local:8078
      20:56:56 I tcprunner.go:238> Starting tcp test for tcp://fortio.demo.svc.cluster.local:8078 with 3 threads at -1.0 qps
      Starting at max qps with 3 thread(s) [gomax 8] for exactly 10 calls (3 per thread + 1)
      20:56:56 I periodic.go:723> T002 ended after 5.1568ms : 3 calls. qps=581.7561278312132
      20:56:56 I periodic.go:723> T001 ended after 5.2334ms : 3 calls. qps=573.2411052088509
      20:56:56 I periodic.go:723> T000 ended after 5.2464ms : 4 calls. qps=762.4275693809088
      Ended after 5.2711ms : 10 calls. qps=1897.1
      Aggregated Function Time : count 10 avg 0.00153124 +/- 0.001713 min 0.00033 max 0.0044054 sum 0.0153124
      # range, mid point, percentile, count
      >= 0.00033 <= 0.001 , 0.000665 , 70.00, 7
      > 0.003 <= 0.004 , 0.0035 , 80.00, 1
      > 0.004 <= 0.0044054 , 0.0042027 , 100.00, 2
      # target 50% 0.000776667
      # target 75% 0.0035
      # target 90% 0.0042027
      # target 99% 0.00438513
      # target 99.9% 0.00440337
      Error cases : no data
      Sockets used: 3 (for perfect no error run, would be 3)
      Total Bytes sent: 240, received: 240
      tcp OK : 10 (100.0 %)
      All done 10 calls (plus 0 warmup) 1.531 ms avg, 1897.1 qps

    As seen above, all the TCP connections from the fortio-client pod succeeded.

      Total Bytes sent: 240, received: 240
      tcp OK : 10 (100.0 %)
      All done 10 calls (plus 0 warmup) 1.531 ms avg, 1897.1 qps

    In addition, examine the stats to confirm that the burst allows additional connections to go through. The number of rate-limited connections hasn't increased since the previous rate limit test we ran before configuring the burst setting.

      $ osm proxy get stats "$fortio_server" -n demo | grep 'fortio.*8078.*rate_limit'
      local_rate_limit.inbound_demo/fortio_8078_tcp.rate_limited: 7
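
When you are done with the demo, you can optionally clean up the resources it created. This is an optional teardown, assuming nothing else is running in the demo namespace:

    # Remove the rate limiting policy
    kubectl delete upstreamtrafficsetting tcp-rate-limit -n demo

    # Remove the demo namespace from the mesh and delete it
    osm namespace remove demo
    kubectl delete namespace demo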