Traffic Shifting

Traffic splitting and shifting are powerful features that enable operators to dynamically shift traffic to different backend Services. This can be used to implement A/B experiments, blue/green deploys, canary rollouts, fault injection, and more.

Linkerd supports two different ways to configure traffic shifting: you can use the Linkerd SMI extension and TrafficSplit resources, or you can use HTTPRoute resources which Linkerd natively supports. While certain integrations such as Flagger rely on the SMI and TrafficSplit approach, using HTTPRoute is the preferred method going forward.
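For reference, the SMI approach expresses this kind of split with a TrafficSplit resource. A minimal sketch (assuming the Linkerd SMI extension is installed; the service names here are illustrative, not part of this demo) looks like:

```yaml
# TrafficSplit sketch: splits traffic addressed to the apex Service `web`
# across two backend Services. Weights are relative proportions.
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: web-split
spec:
  service: web          # apex Service that clients address
  backends:
    - service: web-v1   # receives ~90% of traffic
      weight: 900
    - service: web-v2   # receives ~10% of traffic
      weight: 100
```

The rest of this guide uses the HTTPRoute approach.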


Linkerd Production Tip

This page contains best-effort instructions by the open source community. Production users with mission-critical applications should familiarize themselves with Linkerd production resources and/or connect with a commercial Linkerd provider.

Prerequisites

To use this guide, you'll need a Kubernetes cluster running Linkerd and the Linkerd Viz extension (the viz extension provides the `stat` command used below).

Set up the demo

We will set up a minimal demo which involves a load generator and two backends called v1 and v2 respectively. You could imagine that these represent two different versions of a service and that we would like to test v2 on a small sample of traffic before rolling it out completely.

For load generation we’ll use Slow-Cooker and for the backends we’ll use BB.

To add these components to your cluster and include them in the Linkerd data plane, run:

```bash
cat <<EOF | linkerd inject - | kubectl apply -f -
---
apiVersion: v1
kind: Namespace
metadata:
  name: traffic-shift-demo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: v1
  namespace: traffic-shift-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bb
      version: v1
  template:
    metadata:
      labels:
        app: bb
        version: v1
    spec:
      containers:
        - name: terminus
          image: buoyantio/bb:v0.0.6
          args:
            - terminus
            - "--h1-server-port=8080"
            - "--response-text=v1"
          ports:
            - containerPort: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: v2
  namespace: traffic-shift-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bb
      version: v2
  template:
    metadata:
      labels:
        app: bb
        version: v2
    spec:
      containers:
        - name: terminus
          image: buoyantio/bb:v0.0.6
          args:
            - terminus
            - "--h1-server-port=8080"
            - "--response-text=v2"
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: bb
  namespace: traffic-shift-demo
spec:
  ports:
    - name: http
      port: 8080
      targetPort: 8080
  selector:
    app: bb
    version: v1
---
apiVersion: v1
kind: Service
metadata:
  name: bb-v2
  namespace: traffic-shift-demo
spec:
  ports:
    - name: http
      port: 8080
      targetPort: 8080
  selector:
    app: bb
    version: v2
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: slow-cooker
  namespace: traffic-shift-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: slow-cooker
  template:
    metadata:
      labels:
        app: slow-cooker
    spec:
      containers:
        - name: slow-cooker
          image: buoyantio/slow_cooker:1.3.0
          command:
            - /bin/sh
          args:
            - -c
            - |
              sleep 5 # wait for pods to start
              /slow_cooker/slow_cooker --qps 10 http://bb:8080
EOF
```
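Before checking traffic stats, you may want to wait for the demo workloads to become ready. One way to do this with standard kubectl is:

```bash
# Wait for each demo Deployment to finish rolling out
for d in v1 v2 slow-cooker; do
  kubectl -n traffic-shift-demo rollout status deploy/"$d"
done
```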

We can see that slow-cooker is sending traffic to the v1 backend:

```bash
> linkerd viz -n traffic-shift-demo stat --from deploy/slow-cooker deploy
NAME   MESHED   SUCCESS       RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN
v1        1/1   100.00%   10.1rps           1ms           1ms           8ms          1
```

Shifting Traffic

Now let’s create an HTTPRoute and split 10% of traffic to the v2 backend:

```bash
cat <<EOF | kubectl apply -f -
---
apiVersion: policy.linkerd.io/v1beta2
kind: HTTPRoute
metadata:
  name: bb-route
  namespace: traffic-shift-demo
spec:
  parentRefs:
    - name: bb
      kind: Service
      group: core
      port: 8080
  rules:
    - backendRefs:
        - name: bb
          port: 8080
          weight: 90
        - name: bb-v2
          port: 8080
          weight: 10
EOF
```

Notice in this HTTPRoute, the parentRef is the bb Service resource that slow-cooker is talking to. This means that whenever a meshed client talks to the bb Service, it will use this HTTPRoute. You may also notice that the bb Service appears again in the list of backendRefs with a weight of 90. This means that 90% of traffic sent to the bb Service will continue on to the endpoints of that Service. The other 10% of requests will get routed to the bb-v2 Service.

We can see this by looking at the traffic stats (keep in mind that the stat command looks at metrics over a 1 minute window, so it may take up to 1 minute before the stats look like this):

```bash
> linkerd viz -n traffic-shift-demo stat --from deploy/slow-cooker deploy
NAME   MESHED   SUCCESS      RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN
v1        1/1   100.00%   9.0rps           1ms           1ms           1ms          1
v2        1/1   100.00%   1.0rps           1ms           1ms           1ms          1
```
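Note that backendRef weights are relative proportions, not percentages: traffic is divided in the ratio of the weights, so any pair of integers with the same ratio behaves identically. For example, an even split could be expressed by changing the rules section of the route to:

```yaml
# Illustrative fragment: equal weights mean a 50/50 split
rules:
  - backendRefs:
      - name: bb
        port: 8080
        weight: 1
      - name: bb-v2
        port: 8080
        weight: 1
```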

From here, we can continue to tweak the weights in the HTTPRoute to gradually shift traffic over to the bb-v2 Service or shift things back if it’s looking dicey. To conclude this demo, let’s shift 100% of traffic over to bb-v2:

```bash
cat <<EOF | kubectl apply -f -
---
apiVersion: policy.linkerd.io/v1beta2
kind: HTTPRoute
metadata:
  name: bb-route
  namespace: traffic-shift-demo
spec:
  parentRefs:
    - name: bb
      kind: Service
      group: core
      port: 8080
  rules:
    - backendRefs:
        - name: bb-v2
          port: 8080
          weight: 100
EOF
```
Once the new weights take effect, the stats show all traffic flowing to v2:

```bash
> linkerd viz -n traffic-shift-demo stat --from deploy/slow-cooker deploy
NAME   MESHED   SUCCESS       RPS   LATENCY_P50   LATENCY_P95   LATENCY_P99   TCP_CONN
v1        1/1         -         -             -             -             -          -
v2        1/1   100.00%   10.0rps           1ms           1ms           2ms          1
```
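When you're finished, the demo can be cleaned up by deleting its namespace (assuming nothing else was deployed into it):

```bash
kubectl delete namespace traffic-shift-demo
```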