Setting up the Amazon Cloud Provider

When you use the Amazon cloud provider, you can take advantage of the following capabilities:

  • Load Balancers: Launches an AWS Elastic Load Balancer (ELB) when you select Layer-4 Load Balancer in Port Mapping or launch a Service with type: LoadBalancer.
  • Persistent Volumes: Allows you to use AWS Elastic Block Store (EBS) for persistent volumes.

For all information about the Amazon cloud provider, see the cloud-provider-aws README.

To set up the Amazon cloud provider:

  1. Create an IAM role and attach it to the instances
  2. Configure the ClusterID

Important:

Starting with Kubernetes 1.23, you must deactivate the CSIMigrationAWS feature gate in order to use the in-tree AWS cloud provider. You can do this by setting feature-gates=CSIMigrationAWS=false as an additional argument for the cluster's Kubelet, Controller Manager, API Server and Scheduler in the advanced cluster configuration.

1. Create an IAM Role and Attach to the Instances

All nodes added to the cluster must be able to interact with EC2 so that they can create and remove resources. You can enable this interaction by using an IAM role attached to the instances. See the Amazon documentation: Creating an IAM Role to create the IAM role. There are two example policies:

  • The first policy is for nodes with the controlplane role. These nodes must be able to create and remove EC2 resources. The following IAM policy is an example; remove any permissions that your use case does not need.
  • The second policy is for nodes with the etcd or worker role. These nodes only need to be able to retrieve information from EC2.

When creating an Amazon EC2 cluster, you must fill in the IAM Instance Profile Name (not the ARN) of the created IAM role when creating the node template.

When creating a custom cluster, you must manually attach the IAM role to the instances.
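
If you are unsure which profile name to enter, one way to look it up is with the AWS CLI (a minimal sketch; the query syntax is standard AWS CLI and nothing here is Rancher-specific):

    # List instance profile names in the account; enter one of these,
    # not its ARN, in the node template.
    aws iam list-instance-profiles --query "InstanceProfiles[].InstanceProfileName" --output text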

IAM policy for nodes with the controlplane role:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "autoscaling:DescribeAutoScalingGroups",
            "autoscaling:DescribeLaunchConfigurations",
            "autoscaling:DescribeTags",
            "ec2:DescribeInstances",
            "ec2:DescribeRegions",
            "ec2:DescribeRouteTables",
            "ec2:DescribeSecurityGroups",
            "ec2:DescribeSubnets",
            "ec2:DescribeVolumes",
            "ec2:CreateSecurityGroup",
            "ec2:CreateTags",
            "ec2:CreateVolume",
            "ec2:ModifyInstanceAttribute",
            "ec2:ModifyVolume",
            "ec2:AttachVolume",
            "ec2:AuthorizeSecurityGroupIngress",
            "ec2:CreateRoute",
            "ec2:DeleteRoute",
            "ec2:DeleteSecurityGroup",
            "ec2:DeleteVolume",
            "ec2:DetachVolume",
            "ec2:RevokeSecurityGroupIngress",
            "ec2:DescribeVpcs",
            "elasticloadbalancing:AddTags",
            "elasticloadbalancing:AttachLoadBalancerToSubnets",
            "elasticloadbalancing:ApplySecurityGroupsToLoadBalancer",
            "elasticloadbalancing:CreateLoadBalancer",
            "elasticloadbalancing:CreateLoadBalancerPolicy",
            "elasticloadbalancing:CreateLoadBalancerListeners",
            "elasticloadbalancing:ConfigureHealthCheck",
            "elasticloadbalancing:DeleteLoadBalancer",
            "elasticloadbalancing:DeleteLoadBalancerListeners",
            "elasticloadbalancing:DescribeLoadBalancers",
            "elasticloadbalancing:DescribeLoadBalancerAttributes",
            "elasticloadbalancing:DetachLoadBalancerFromSubnets",
            "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
            "elasticloadbalancing:ModifyLoadBalancerAttributes",
            "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
            "elasticloadbalancing:SetLoadBalancerPoliciesForBackendServer",
            "elasticloadbalancing:AddTags",
            "elasticloadbalancing:CreateListener",
            "elasticloadbalancing:CreateTargetGroup",
            "elasticloadbalancing:DeleteListener",
            "elasticloadbalancing:DeleteTargetGroup",
            "elasticloadbalancing:DescribeListeners",
            "elasticloadbalancing:DescribeLoadBalancerPolicies",
            "elasticloadbalancing:DescribeTargetGroups",
            "elasticloadbalancing:DescribeTargetHealth",
            "elasticloadbalancing:ModifyListener",
            "elasticloadbalancing:ModifyTargetGroup",
            "elasticloadbalancing:RegisterTargets",
            "elasticloadbalancing:SetLoadBalancerPoliciesOfListener",
            "iam:CreateServiceLinkedRole",
            "kms:DescribeKey"
          ],
          "Resource": [
            "*"
          ]
        }
      ]
    }

IAM policy for nodes with the etcd or worker role:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "ec2:DescribeInstances",
            "ec2:DescribeRegions",
            "ecr:GetAuthorizationToken",
            "ecr:BatchCheckLayerAvailability",
            "ecr:GetDownloadUrlForLayer",
            "ecr:GetRepositoryPolicy",
            "ecr:DescribeRepositories",
            "ecr:ListImages",
            "ecr:BatchGetImage"
          ],
          "Resource": "*"
        }
      ]
    }
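
If you manage AWS from the CLI instead of the console, the commands below are a minimal sketch of creating the role, wrapping it in an instance profile, and attaching it to an existing instance for a custom cluster. The role, profile and file names and the instance ID are hypothetical placeholders; ec2-trust-policy.json is assumed to contain the standard trust policy that lets ec2.amazonaws.com assume the role, and controlplane-policy.json the control plane policy shown above.

    # Hypothetical names and files; adjust to your environment.
    aws iam create-role --role-name rancher-controlplane \
      --assume-role-policy-document file://ec2-trust-policy.json
    aws iam put-role-policy --role-name rancher-controlplane \
      --policy-name rancher-controlplane-policy \
      --policy-document file://controlplane-policy.json
    # The instance profile name is what the node template asks for.
    aws iam create-instance-profile --instance-profile-name rancher-controlplane
    aws iam add-role-to-instance-profile \
      --instance-profile-name rancher-controlplane \
      --role-name rancher-controlplane
    # For custom clusters, attach the profile to each existing instance manually.
    aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
      --iam-instance-profile Name=rancher-controlplane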

2. Configure the ClusterID

The following resources need to be tagged with a ClusterID:

  • Nodes: All hosts added in Rancher.
  • Subnet: The subnet used for your cluster.
  • Security Group: The security group used for your cluster.

Note:

Do not tag multiple security groups. Tagging multiple security groups generates an error when creating an Elastic Load Balancer (ELB).

When you create an Amazon EC2 cluster, the ClusterID is automatically configured for the created nodes. Other resources still need to be tagged manually.

Use the following tag:

    Key = kubernetes.io/cluster/CLUSTERID
    Value = owned

CLUSTERID can be any string, as long as it is the same across all of the tags.

Setting the value of the tag to owned informs the cluster that all resources tagged with it are owned and managed by this cluster. If you share resources between clusters, you can change the tag to:

    Key = kubernetes.io/cluster/CLUSTERID
    Value = shared
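
The same tag can be applied with the AWS CLI. The sketch below uses hypothetical resource IDs; replace CLUSTERID with the string you chose and list every node, the subnet, and the single security group:

    # Hypothetical IDs; apply the ClusterID tag to the resources that were not tagged automatically.
    aws ec2 create-tags \
      --resources i-0123456789abcdef0 subnet-0123456789abcdef0 sg-0123456789abcdef0 \
      --tags Key=kubernetes.io/cluster/CLUSTERID,Value=owned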

Using Amazon Elastic Container Registry (ECR)

When the IAM profile from Create an IAM Role and Attach to the Instances is attached to the instances, the kubelet component can automatically obtain ECR credentials. When using a Kubernetes version lower than v1.15.0, the Amazon cloud provider must be configured in the cluster. Starting with Kubernetes v1.15.0, the kubelet can obtain ECR credentials without the Amazon cloud provider being configured in the cluster.
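
As a quick check that this works, a pod can reference an ECR image directly; the account ID, region and repository below are hypothetical placeholders:

    # kubelet pulls the image with credentials obtained from the instance profile,
    # so no imagePullSecret is required.
    kubectl run ecr-test --restart=Never \
      --image=123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
    kubectl get pod ecr-test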

Using the Out-of-Tree AWS Cloud Provider

  • RKE2
  • RKE
  1. Node name conventions and other prerequisites must be followed for the cloud provider to find the instance correctly.

  2. Rancher managed RKE2/K3s clusters don’t support configuring providerID. However, the engine will set the node name correctly if the following configuration is set on the provisioning cluster object:

    spec:
      rkeConfig:
        machineGlobalConfig:
          cloud-provider-name: aws

This option will be passed to the configuration of the various Kubernetes components that run on the node, and must be overridden per component to prevent the in-tree provider from running unintentionally:

Override on Etcd:

    spec:
      rkeConfig:
        machineSelectorConfig:
          - config:
              kubelet-arg:
                - cloud-provider=external
            machineLabelSelector:
              matchExpressions:
                - key: rke.cattle.io/etcd-role
                  operator: In
                  values:
                    - 'true'

Override on Control Plane:

    spec:
      rkeConfig:
        machineSelectorConfig:
          - config:
              disable-cloud-controller: true
              kube-apiserver-arg:
                - cloud-provider=external
              kube-controller-manager-arg:
                - cloud-provider=external
              kubelet-arg:
                - cloud-provider=external
            machineLabelSelector:
              matchExpressions:
                - key: rke.cattle.io/control-plane-role
                  operator: In
                  values:
                    - 'true'

Override on Worker:

    spec:
      rkeConfig:
        machineSelectorConfig:
          - config:
              kubelet-arg:
                - cloud-provider=external
            machineLabelSelector:
              matchExpressions:
                - key: rke.cattle.io/worker-role
                  operator: In
                  values:
                    - 'true'

  3. Select Amazon if relying on the above mechanism to set the provider ID. Otherwise, select External (out-of-tree) cloud provider, which sets --cloud-provider=external for Kubernetes components.

  4. Specify the aws-cloud-controller-manager Helm chart as an additional manifest to install:

    spec:
      rkeConfig:
        additionalManifest: |-
          apiVersion: helm.cattle.io/v1
          kind: HelmChart
          metadata:
            name: aws-cloud-controller-manager
            namespace: kube-system
          spec:
            chart: aws-cloud-controller-manager
            repo: https://kubernetes.github.io/cloud-provider-aws
            targetNamespace: kube-system
            bootstrap: true
            valuesContent: |-
              hostNetworking: true
              nodeSelector:
                node-role.kubernetes.io/control-plane: "true"
              args:
                - --configure-cloud-routes=false
                - --v=5
                - --cloud-provider=aws
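
Once the cluster is up, a quick sanity check (a sketch, assuming the defaults above were not changed) is to confirm that the HelmChart resource exists and that the controller's DaemonSet rolled out:

    # HelmChart is the helm.cattle.io resource that RKE2 reconciles into a chart install.
    kubectl -n kube-system get helmchart aws-cloud-controller-manager
    kubectl -n kube-system rollout status daemonset aws-cloud-controller-manager
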
  1. Node name conventions and other prerequisites must be followed so that the cloud provider can find the instance. Rancher provisioned clusters don’t support configuring providerID.

Note:

If you use IP-based naming, the nodes must be named after the instance followed by the regional domain name (ip-xxx-xxx-xxx-xxx.ec2.<region>.internal). If you have a custom domain name set in the DHCP options, you must set --hostname-override on kube-proxy and kubelet to match this naming convention.

To meet node naming conventions, Rancher allows setting useInstanceMetadataHostname when the External Amazon cloud provider is selected. Enabling useInstanceMetadataHostname will query the EC2 metadata service and set /hostname as hostname-override for kubelet and kube-proxy:

    rancher_kubernetes_engine_config:
      cloud_provider:
        name: external-aws
        useInstanceMetadataHostname: true

You must not enable useInstanceMetadataHostname when setting custom values for hostname-override for custom clusters. When you create a custom cluster, add --node-name to the docker run node registration command to set hostname-override — for example, "$(hostname -f)". This can be done manually or by using Show Advanced Options in the Rancher UI to add Node Name.
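
The registration command itself is generated by Rancher; the sketch below only shows where --node-name is appended, and every other value is a placeholder to be copied from the UI:

    # Placeholder registration command; copy the real one from the Rancher UI
    # and append --node-name so hostname-override matches your naming convention.
    sudo docker run -d --privileged --restart=unless-stopped --net=host \
      -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run \
      rancher/rancher-agent:<version> \
      --server https://<rancher-server> --token <token> --worker \
      --node-name "$(hostname -f)"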

  2. Select the cloud provider.

Selecting External Amazon (out-of-tree) sets --cloud-provider=external and enables useInstanceMetadataHostname. As mentioned in step 1, enabling useInstanceMetadataHostname will query the EC2 metadata service and set http://169.254.169.254/latest/meta-data/hostname as hostname-override for kubelet and kube-proxy.
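
To see the exact value that will be used, you can query the metadata endpoint from the instance itself (a sketch; the token step is only required when IMDSv2 is enforced):

    # Request an IMDSv2 token, then read the hostname that becomes hostname-override.
    TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
      -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
    curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
      http://169.254.169.254/latest/meta-data/hostname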

Note:

You must disable useInstanceMetadataHostname when setting a custom node name for custom clusters via node-name.

    rancher_kubernetes_engine_config:
      cloud_provider:
        name: external-aws
        useInstanceMetadataHostname: true/false

Existing clusters that use an External cloud provider will set --cloud-provider=external for Kubernetes components but won’t set the node name.

  3. Install the AWS cloud controller manager after the cluster finishes provisioning. Note that the cluster isn’t successfully provisioned and nodes are still in an uninitialized state until you deploy the cloud controller manager. This can be done manually, or via Helm charts in the UI.

Refer to the official AWS upstream documentation for the cloud controller manager.
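
Until the cloud controller manager runs, nodes keep the uninitialized taint and have no providerID; a quick way to check is a sketch like the following, using standard kubectl output options:

    # providerID stays empty and the taint remains until the controller initializes the node.
    kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDERID:.spec.providerID
    kubectl describe nodes | grep node.cloudprovider.kubernetes.io/uninitialized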

Helm Chart Installation from CLI

  • RKE2
  • RKE

Official upstream docs for Helm chart installation can be found on GitHub.

  1. Add the Helm repository:
    helm repo add aws-cloud-controller-manager https://kubernetes.github.io/cloud-provider-aws
    helm repo update

  2. Create a values.yaml file with the following contents to override the default values.yaml:
    # values.yaml
    hostNetworking: true
    tolerations:
      - effect: NoSchedule
        key: node.cloudprovider.kubernetes.io/uninitialized
        value: 'true'
      - effect: NoSchedule
        value: 'true'
        key: node-role.kubernetes.io/control-plane
    nodeSelector:
      node-role.kubernetes.io/control-plane: 'true'
    args:
      - --configure-cloud-routes=false
      - --use-service-account-credentials=true
      - --v=2
      - --cloud-provider=aws
    clusterRoleRules:
      - apiGroups:
          - ""
        resources:
          - events
        verbs:
          - create
          - patch
          - update
      - apiGroups:
          - ""
        resources:
          - nodes
        verbs:
          - '*'
      - apiGroups:
          - ""
        resources:
          - nodes/status
        verbs:
          - patch
      - apiGroups:
          - ""
        resources:
          - services
        verbs:
          - list
          - patch
          - update
          - watch
      - apiGroups:
          - ""
        resources:
          - services/status
        verbs:
          - list
          - patch
          - update
          - watch
      - apiGroups:
          - ''
        resources:
          - serviceaccounts
        verbs:
          - create
          - get
      - apiGroups:
          - ""
        resources:
          - persistentvolumes
        verbs:
          - get
          - list
          - update
          - watch
      - apiGroups:
          - ""
        resources:
          - endpoints
        verbs:
          - create
          - get
          - list
          - watch
          - update
      - apiGroups:
          - coordination.k8s.io
        resources:
          - leases
        verbs:
          - create
          - get
          - list
          - watch
          - update
      - apiGroups:
          - ""
        resources:
          - serviceaccounts/token
        verbs:
          - create

  3. Install the Helm chart:

    helm upgrade --install aws-cloud-controller-manager aws-cloud-controller-manager/aws-cloud-controller-manager --values values.yaml

Verify that the Helm chart installed successfully:

    helm status -n kube-system aws-cloud-controller-manager

  4. (Optional) Verify that the cloud controller manager update succeeded:

    kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager

Official upstream docs for Helm chart installation can be found on GitHub.

  1. Add the Helm repository:
    helm repo add aws-cloud-controller-manager https://kubernetes.github.io/cloud-provider-aws
    helm repo update

  2. Create a values.yaml file with the following contents to override the default values.yaml:
    # values.yaml
    hostNetworking: true
    tolerations:
      - effect: NoSchedule
        key: node.cloudprovider.kubernetes.io/uninitialized
        value: 'true'
      - effect: NoSchedule
        value: 'true'
        key: node-role.kubernetes.io/controlplane
    nodeSelector:
      node-role.kubernetes.io/controlplane: 'true'
    args:
      - --configure-cloud-routes=false
      - --use-service-account-credentials=true
      - --v=2
      - --cloud-provider=aws
    clusterRoleRules:
      - apiGroups:
          - ""
        resources:
          - events
        verbs:
          - create
          - patch
          - update
      - apiGroups:
          - ""
        resources:
          - nodes
        verbs:
          - '*'
      - apiGroups:
          - ""
        resources:
          - nodes/status
        verbs:
          - patch
      - apiGroups:
          - ""
        resources:
          - services
        verbs:
          - list
          - patch
          - update
          - watch
      - apiGroups:
          - ""
        resources:
          - services/status
        verbs:
          - list
          - patch
          - update
          - watch
      - apiGroups:
          - ''
        resources:
          - serviceaccounts
        verbs:
          - create
          - get
      - apiGroups:
          - ""
        resources:
          - persistentvolumes
        verbs:
          - get
          - list
          - update
          - watch
      - apiGroups:
          - ""
        resources:
          - endpoints
        verbs:
          - create
          - get
          - list
          - watch
          - update
      - apiGroups:
          - coordination.k8s.io
        resources:
          - leases
        verbs:
          - create
          - get
          - list
          - watch
          - update
      - apiGroups:
          - ""
        resources:
          - serviceaccounts/token
        verbs:
          - create

  3. Install the Helm chart:

    helm upgrade --install aws-cloud-controller-manager -n kube-system aws-cloud-controller-manager/aws-cloud-controller-manager --values values.yaml

Verify that the Helm chart installed successfully:

    helm status -n kube-system aws-cloud-controller-manager

  4. If present, edit the Daemonset to remove the default node selector node-role.kubernetes.io/control-plane: "":

    kubectl edit daemonset aws-cloud-controller-manager -n kube-system

  5. (Optional) Verify that the cloud controller manager update succeeded:

    kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager

Helm Chart Installation from UI

  • RKE2
  • RKE
  1. Click ☰, then select the name of the cluster from the left navigation.

  2. Select Apps > Repositories.

  3. Click the Create button.

  4. Enter https://kubernetes.github.io/cloud-provider-aws in the Index URL field.

  5. Select Apps > Charts from the left navigation and install aws-cloud-controller-manager.

  6. Select the namespace, kube-system, and enable Customize Helm options before install.

  7. Add the following container arguments:

    - '--use-service-account-credentials=true'
    - '--configure-cloud-routes=false'

  8. Add get to verbs for serviceaccounts resources in clusterRoleRules. This allows the cloud controller manager to get service accounts upon startup:
    - apiGroups:
        - ''
      resources:
        - serviceaccounts
      verbs:
        - create
        - get

  9. Rancher-provisioned RKE2 nodes are tainted node-role.kubernetes.io/control-plane. Update the tolerations and the nodeSelector:
    tolerations:
      - effect: NoSchedule
        key: node.cloudprovider.kubernetes.io/uninitialized
        value: 'true'
      - effect: NoSchedule
        value: 'true'
        key: node-role.kubernetes.io/control-plane

    nodeSelector:
      node-role.kubernetes.io/control-plane: 'true'

Note:

There’s currently a known issue where nodeSelector can’t be updated from the Rancher UI. Continue installing the chart and then edit the Daemonset manually to set the nodeSelector:

    nodeSelector:
      node-role.kubernetes.io/control-plane: 'true'
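
One way to set it after the chart is installed, assuming the default release name and the kube-system namespace, is a merge patch (a sketch; editing the Daemonset interactively works just as well):

    # Apply the nodeSelector that the UI currently cannot persist.
    kubectl -n kube-system patch daemonset aws-cloud-controller-manager \
      --type merge \
      -p '{"spec":{"template":{"spec":{"nodeSelector":{"node-role.kubernetes.io/control-plane":"true"}}}}}'
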
  10. Install the chart and confirm that the Daemonset aws-cloud-controller-manager is running. Verify that the aws-cloud-controller-manager pods are running in the target namespace (kube-system unless modified in step 6).

  1. Click ☰, then select the name of the cluster from the left navigation.

  2. Select Apps > Repositories.

  3. Click the Create button.

  4. Enter https://kubernetes.github.io/cloud-provider-aws in the Index URL field.

  5. Select Apps > Charts from the left navigation and install aws-cloud-controller-manager.

  6. Select the namespace, kube-system, and enable Customize Helm options before install.

  7. Add the following container arguments:

    - '--use-service-account-credentials=true'
    - '--configure-cloud-routes=false'

  8. Add get to verbs for serviceaccounts resources in clusterRoleRules. This allows the cloud controller manager to get service accounts upon startup:
    - apiGroups:
        - ''
      resources:
        - serviceaccounts
      verbs:
        - create
        - get

  9. Rancher-provisioned RKE nodes are tainted node-role.kubernetes.io/controlplane. Update the tolerations and the nodeSelector:
    tolerations:
      - effect: NoSchedule
        key: node.cloudprovider.kubernetes.io/uninitialized
        value: 'true'
      - effect: NoSchedule
        value: 'true'
        key: node-role.kubernetes.io/controlplane

    nodeSelector:
      node-role.kubernetes.io/controlplane: 'true'

Note:

There’s currently a known issue where nodeSelector can’t be updated from the Rancher UI. Continue installing the chart and then edit the Daemonset manually to set the nodeSelector:

    nodeSelector:
      node-role.kubernetes.io/controlplane: 'true'

  10. Install the chart and confirm that the Daemonset aws-cloud-controller-manager deploys successfully:

    kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager
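
With the cloud controller manager running, an end-to-end check (a sketch; the deployment name lb-test is arbitrary) is to expose a test workload through an ELB, which exercises the load balancer support described at the top of this page:

    kubectl create deployment lb-test --image=nginx
    kubectl expose deployment lb-test --port=80 --type=LoadBalancer
    # EXTERNAL-IP should change from <pending> to an ELB hostname.
    kubectl get service lb-test -w
    # Clean up when done.
    kubectl delete service/lb-test deployment/lb-test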