Setting up the Amazon Cloud Provider

Important:

In Kubernetes 1.27 and later, you must use an out-of-tree AWS cloud provider. In-tree cloud providers have been deprecated, and the in-tree Amazon cloud provider has been removed completely, so it won't work after an upgrade to Kubernetes 1.27. The steps listed below are still required to set up the Amazon cloud provider. You can set up an out-of-tree cloud provider after creating an IAM role and configuring the ClusterID.

You can also migrate from an in-tree to an out-of-tree AWS cloud provider on Kubernetes 1.26 and earlier. All existing clusters must migrate prior to upgrading to v1.27 in order to stay functional.

Starting with Kubernetes 1.23, you must deactivate the CSIMigrationAWS feature gate to use the in-tree AWS cloud provider. You can do this by setting feature-gates=CSIMigrationAWS=false as an additional argument for the cluster’s Kubelet, Controller Manager, API Server and Scheduler in the advanced cluster configuration.

When you use Amazon as a cloud provider, you can leverage the following capabilities:

  • Load Balancers: Launch an AWS Elastic Load Balancer (ELB) when you select Layer-4 Load Balancer in Port Mapping or when you launch a Service with type: LoadBalancer.
  • Persistent Volumes: Use AWS Elastic Block Store (EBS) volumes as persistent volumes.

See the cloud-provider-aws README for more information about the Amazon cloud provider.

To set up the Amazon cloud provider:

  1. Create an IAM role and attach it to the instances
  2. Configure the ClusterID

1. Create an IAM Role and attach it to the instances

All nodes added to the cluster must be able to interact with EC2 so that they can create and remove resources. You can enable this interaction by using an IAM role attached to the instance. See the Amazon documentation on Creating an IAM Role for instructions on how to create one. There are two example policies:

  • The first policy is for the nodes with the controlplane role. These nodes have to be able to create and remove EC2 resources. The following IAM policy is an example; remove any permissions that your use case does not need.
  • The second policy is for the nodes with the etcd or worker role. These nodes only have to be able to retrieve information from EC2.

When you create an Amazon EC2 cluster, you must fill in the IAM Instance Profile Name (not the ARN) of the created IAM role when creating the Node Template.

When you create a Custom cluster, you must manually attach the IAM role to the instance(s).

IAM Policy for nodes with the controlplane role:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "ec2:DescribeInstances",
        "ec2:DescribeRegions",
        "ec2:DescribeRouteTables",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeSubnets",
        "ec2:DescribeVolumes",
        "ec2:CreateSecurityGroup",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:ModifyInstanceAttribute",
        "ec2:ModifyVolume",
        "ec2:AttachVolume",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:CreateRoute",
        "ec2:DeleteRoute",
        "ec2:DeleteSecurityGroup",
        "ec2:DeleteVolume",
        "ec2:DetachVolume",
        "ec2:RevokeSecurityGroupIngress",
        "ec2:DescribeVpcs",
        "elasticloadbalancing:AddTags",
        "elasticloadbalancing:AttachLoadBalancerToSubnets",
        "elasticloadbalancing:ApplySecurityGroupsToLoadBalancer",
        "elasticloadbalancing:CreateLoadBalancer",
        "elasticloadbalancing:CreateLoadBalancerPolicy",
        "elasticloadbalancing:CreateLoadBalancerListeners",
        "elasticloadbalancing:ConfigureHealthCheck",
        "elasticloadbalancing:DeleteLoadBalancer",
        "elasticloadbalancing:DeleteLoadBalancerListeners",
        "elasticloadbalancing:DescribeLoadBalancers",
        "elasticloadbalancing:DescribeLoadBalancerAttributes",
        "elasticloadbalancing:DetachLoadBalancerFromSubnets",
        "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
        "elasticloadbalancing:ModifyLoadBalancerAttributes",
        "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
        "elasticloadbalancing:SetLoadBalancerPoliciesForBackendServer",
        "elasticloadbalancing:AddTags",
        "elasticloadbalancing:CreateListener",
        "elasticloadbalancing:CreateTargetGroup",
        "elasticloadbalancing:DeleteListener",
        "elasticloadbalancing:DeleteTargetGroup",
        "elasticloadbalancing:DescribeListeners",
        "elasticloadbalancing:DescribeLoadBalancerPolicies",
        "elasticloadbalancing:DescribeTargetGroups",
        "elasticloadbalancing:DescribeTargetHealth",
        "elasticloadbalancing:ModifyListener",
        "elasticloadbalancing:ModifyTargetGroup",
        "elasticloadbalancing:RegisterTargets",
        "elasticloadbalancing:SetLoadBalancerPoliciesOfListener",
        "iam:CreateServiceLinkedRole",
        "kms:DescribeKey"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
```

IAM policy for nodes with the etcd or worker role:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeRegions",
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:BatchGetImage"
      ],
      "Resource": "*"
    }
  ]
}
```
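
If you manage IAM from the command line, the flow looks roughly like the following sketch. The role, profile, and file names are placeholders, and the policy files are the JSON documents above saved locally; adjust everything to your environment.

```bash
# Trust policy that lets EC2 instances assume the role (placeholder file name)
cat > ec2-trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF

# Create the role and attach the control plane policy shown above
aws iam create-role --role-name rancher-controlplane-role \
  --assume-role-policy-document file://ec2-trust-policy.json
aws iam put-role-policy --role-name rancher-controlplane-role \
  --policy-name rancher-controlplane-policy \
  --policy-document file://controlplane-policy.json

# Wrap the role in an instance profile. The profile name is what you enter in the
# Node Template; for a Custom cluster, associate it with the instance yourself.
aws iam create-instance-profile --instance-profile-name rancher-controlplane-profile
aws iam add-role-to-instance-profile --instance-profile-name rancher-controlplane-profile \
  --role-name rancher-controlplane-role
aws ec2 associate-iam-instance-profile --instance-id <instance-id> \
  --iam-instance-profile Name=rancher-controlplane-profile
```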

2. Configure the ClusterID

The following resources need to be tagged with a ClusterID:

  • Nodes: All hosts added in Rancher.
  • Subnet: The subnet used for your cluster.
  • Security Group: The security group used for your cluster.

Note:

Do not tag multiple security groups. Tagging multiple groups generates an error when creating an Elastic Load Balancer (ELB).

When you create an Amazon EC2 Cluster, the ClusterID is automatically configured for the created nodes. Other resources still need to be manually tagged.

Use the following tag:

Key = kubernetes.io/cluster/<cluster-id> Value = owned

Setting the value of the tag to owned tells the cluster that all resources with this tag are owned and managed by this cluster.

If you share resources between clusters, you can change the tag to:

Key = kubernetes.io/cluster/<cluster-id> Value = shared

The string value, <cluster-id>, is the Kubernetes cluster’s ID.
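
For example, with the AWS CLI (the resource IDs and <cluster-id> below are placeholders):

```bash
# Tag the subnet and the single security group used by the cluster
aws ec2 create-tags \
  --resources subnet-0123456789abcdef0 sg-0123456789abcdef0 \
  --tags Key=kubernetes.io/cluster/<cluster-id>,Value=owned

# Confirm that only one security group carries the tag
aws ec2 describe-security-groups \
  --filters "Name=tag-key,Values=kubernetes.io/cluster/<cluster-id>" \
  --query 'SecurityGroups[].GroupId'
```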

Note:

Do not tag a resource with multiple owned or shared tags.

Using Amazon Elastic Container Registry (ECR)

The kubelet component can automatically obtain ECR credentials when the IAM profile mentioned in Create an IAM Role and attach it to the instances is attached to the instance(s). When using a Kubernetes version older than v1.15.0, the Amazon cloud provider needs to be configured in the cluster. Starting with Kubernetes version v1.15.0, the kubelet can obtain ECR credentials without the Amazon cloud provider being configured in the cluster.
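
As a quick check that the instance profile is working, you can run a workload that references an image in your account's registry. The account ID, region, repository, and tag below are placeholders:

```bash
# If the IAM profile is attached correctly, the kubelet pulls this image
# without an imagePullSecret.
kubectl run ecr-pull-test \
  --image=<aws_account_id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag> \
  --restart=Never
kubectl get pod ecr-pull-test
```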

Using the Out-of-Tree AWS Cloud Provider

  • RKE2
  • RKE
  1. Node name conventions and other prerequisites must be followed for the cloud provider to find the instance correctly.

  2. Rancher-managed RKE2/K3s clusters don't support configuring providerID. However, the engine will set the node name correctly if the following configuration is set on the provisioning cluster object:

```yaml
spec:
  rkeConfig:
    machineGlobalConfig:
      cloud-provider-name: aws
```

This option will be passed to the configuration of the various Kubernetes components that run on the node, and must be overridden per component to prevent the in-tree provider from running unintentionally:

Override on Etcd:

```yaml
spec:
  rkeConfig:
    machineSelectorConfig:
      - config:
          kubelet-arg:
            - cloud-provider=external
        machineLabelSelector:
          matchExpressions:
            - key: rke.cattle.io/etcd-role
              operator: In
              values:
                - 'true'
```

Override on Control Plane:

```yaml
spec:
  rkeConfig:
    machineSelectorConfig:
      - config:
          disable-cloud-controller: true
          kube-apiserver-arg:
            - cloud-provider=external
          kube-controller-manager-arg:
            - cloud-provider=external
          kubelet-arg:
            - cloud-provider=external
        machineLabelSelector:
          matchExpressions:
            - key: rke.cattle.io/control-plane-role
              operator: In
              values:
                - 'true'
```

Override on Worker:

```yaml
spec:
  rkeConfig:
    machineSelectorConfig:
      - config:
          kubelet-arg:
            - cloud-provider=external
        machineLabelSelector:
          matchExpressions:
            - key: rke.cattle.io/worker-role
              operator: In
              values:
                - 'true'
```

  3. Select Amazon if relying on the above mechanism to set the provider ID. Otherwise, select External (out-of-tree) cloud provider, which sets --cloud-provider=external for Kubernetes components.

  4. Specify the aws-cloud-controller-manager Helm chart as an additional manifest to install:

```yaml
spec:
  rkeConfig:
    additionalManifest: |-
      apiVersion: helm.cattle.io/v1
      kind: HelmChart
      metadata:
        name: aws-cloud-controller-manager
        namespace: kube-system
      spec:
        chart: aws-cloud-controller-manager
        repo: https://kubernetes.github.io/cloud-provider-aws
        targetNamespace: kube-system
        bootstrap: true
        valuesContent: |-
          hostNetworking: true
          nodeSelector:
            node-role.kubernetes.io/control-plane: "true"
          args:
            - --configure-cloud-routes=false
            - --v=5
            - --cloud-provider=aws
```
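
Once provisioning completes, you can confirm the chart was picked up; this sketch assumes RKE2's bundled Helm controller and the kube-system namespace used in the manifest above:

```bash
# The HelmChart resource is created from the additional manifest; the chart in
# turn deploys the aws-cloud-controller-manager DaemonSet.
kubectl -n kube-system get helmchart aws-cloud-controller-manager
kubectl -n kube-system rollout status daemonset aws-cloud-controller-manager
```
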
  1. Node name conventions and other prerequisites must be followed so that the cloud provider can find the instance. Rancher-provisioned clusters don't support configuring providerID.

Note:

If you use IP-based naming, the nodes must be named after the instance followed by the regional domain name (ip-xxx-xxx-xxx-xxx.ec2.<region>.internal). If you have a custom domain name set in the DHCP options, you must set --hostname-override on kube-proxy and kubelet to match this naming convention.

To meet the node naming convention, Rancher allows setting useInstanceMetadataHostname when the External Amazon cloud provider is selected. Enabling useInstanceMetadataHostname queries the EC2 metadata service and sets its /latest/meta-data/hostname value as hostname-override for kubelet and kube-proxy:

```yaml
rancher_kubernetes_engine_config:
  cloud_provider:
    name: external-aws
    useInstanceMetadataHostname: true
```

You must not enable useInstanceMetadataHostname when setting custom values for hostname-override for custom clusters. When you create a custom cluster, add --node-name to the docker run node registration command to set hostname-override, for example "$(hostname -f)". This can be done manually or by using Show Advanced Options in the Rancher UI to add Node Name; a sketch of such a registration command is shown below.
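
For illustration only, a registration command with the node name appended might look like the following. Copy the real command (agent image tag, server URL, token, and role flags) from your own cluster's registration screen; the values below are placeholders.

```bash
# Placeholder values -- use the registration command generated by Rancher and
# append --node-name so that hostname-override matches the EC2 naming convention.
sudo docker run -d --privileged --restart=unless-stopped --net=host \
  -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run \
  rancher/rancher-agent:<tag> \
  --server https://<rancher-server-url> --token <registration-token> \
  --worker --node-name "$(hostname -f)"
```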

  2. Select the cloud provider.

Selecting External Amazon (out-of-tree) sets --cloud-provider=external and enables useInstanceMetadataHostname. As mentioned in step 1, enabling useInstanceMetadataHostname will query the EC2 metadata service and set http://169.254.169.254/latest/meta-data/hostname as hostname-override for kubelet and kube-proxy.
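
You can inspect that value on the instance itself (IMDSv1 shown for brevity; with IMDSv2 enforced, a session token is required):

```bash
# Print the hostname that the EC2 metadata service reports for this instance
curl -s http://169.254.169.254/latest/meta-data/hostname
```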

Note:

You must disable useInstanceMetadataHostname when setting a custom node name for custom clusters via node-name.

```yaml
rancher_kubernetes_engine_config:
  cloud_provider:
    name: external-aws
    useInstanceMetadataHostname: true/false
```

Existing clusters that use an External cloud provider will set --cloud-provider=external for Kubernetes components but won’t set the node name.

  3. Install the AWS cloud controller manager after the cluster finishes provisioning. Note that the cluster isn't successfully provisioned, and nodes remain in an uninitialized state, until you deploy the cloud controller manager. This can be done manually, or via Helm charts in the UI. A quick way to check whether nodes are still waiting is shown below.
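
Until the cloud controller manager initializes each node, the nodes keep the node.cloudprovider.kubernetes.io/uninitialized taint. One way to check:

```bash
# List node names together with their taint keys; uninitialized nodes show
# node.cloudprovider.kubernetes.io/uninitialized until the CCM is running.
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
```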

Refer to the official AWS upstream documentation for the cloud controller manager.

Helm Chart Installation from CLI

  • RKE2
  • RKE

Official upstream docs for Helm chart installation can be found on GitHub.

  1. Add the Helm repository:

```bash
helm repo add aws-cloud-controller-manager https://kubernetes.github.io/cloud-provider-aws
helm repo update
```

  2. Create a values.yaml file with the following contents to override the default values.yaml:

```yaml
# values.yaml
hostNetworking: true
tolerations:
  - effect: NoSchedule
    key: node.cloudprovider.kubernetes.io/uninitialized
    value: 'true'
  - effect: NoSchedule
    value: 'true'
    key: node-role.kubernetes.io/control-plane
nodeSelector:
  node-role.kubernetes.io/control-plane: 'true'
args:
  - --configure-cloud-routes=false
  - --use-service-account-credentials=true
  - --v=2
  - --cloud-provider=aws
clusterRoleRules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
      - update
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - '*'
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - services/status
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ''
    resources:
      - serviceaccounts
    verbs:
      - create
      - get
  - apiGroups:
      - ""
    resources:
      - persistentvolumes
    verbs:
      - get
      - list
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - ""
    resources:
      - serviceaccounts/token
    verbs:
      - create
```

  3. Install the Helm chart:

```bash
helm upgrade --install aws-cloud-controller-manager -n kube-system aws-cloud-controller-manager/aws-cloud-controller-manager --values values.yaml
```

Verify that the Helm chart installed successfully:

```bash
helm status -n kube-system aws-cloud-controller-manager
```

  4. (Optional) Verify that the cloud controller manager update succeeded:

```bash
kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager
```

Official upstream docs for Helm chart installation can be found on GitHub.

  1. Add the Helm repository:

```bash
helm repo add aws-cloud-controller-manager https://kubernetes.github.io/cloud-provider-aws
helm repo update
```

  2. Create a values.yaml file with the following contents to override the default values.yaml:

```yaml
# values.yaml
hostNetworking: true
tolerations:
  - effect: NoSchedule
    key: node.cloudprovider.kubernetes.io/uninitialized
    value: 'true'
  - effect: NoSchedule
    value: 'true'
    key: node-role.kubernetes.io/controlplane
nodeSelector:
  node-role.kubernetes.io/controlplane: 'true'
args:
  - --configure-cloud-routes=false
  - --use-service-account-credentials=true
  - --v=2
  - --cloud-provider=aws
clusterRoleRules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
      - update
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - '*'
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - services/status
    verbs:
      - list
      - patch
      - update
      - watch
  - apiGroups:
      - ''
    resources:
      - serviceaccounts
    verbs:
      - create
      - get
  - apiGroups:
      - ""
    resources:
      - persistentvolumes
    verbs:
      - get
      - list
      - update
      - watch
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - coordination.k8s.io
    resources:
      - leases
    verbs:
      - create
      - get
      - list
      - watch
      - update
  - apiGroups:
      - ""
    resources:
      - serviceaccounts/token
    verbs:
      - create
```

  3. Install the Helm chart:

```bash
helm upgrade --install aws-cloud-controller-manager -n kube-system aws-cloud-controller-manager/aws-cloud-controller-manager --values values.yaml
```

Verify that the Helm chart installed successfully:

```bash
helm status -n kube-system aws-cloud-controller-manager
```

  4. If present, edit the Daemonset to remove the default node selector node-role.kubernetes.io/control-plane: "":

```bash
kubectl edit daemonset aws-cloud-controller-manager -n kube-system
```

  5. (Optional) Verify that the cloud controller manager update succeeded:

```bash
kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager
```

Helm Chart Installation from UI

  • RKE2
  • RKE
  1. Click ☰, then select the name of the cluster from the left navigation.

  2. Select Apps > Repositories.

  3. Click the Create button.

  4. Enter https://kubernetes.github.io/cloud-provider-aws in the Index URL field.

  5. Select Apps > Charts from the left navigation and install aws-cloud-controller-manager.

  6. Select the kube-system namespace and enable Customize Helm options before install.

  7. Add the following container arguments:

```yaml
- '--use-service-account-credentials=true'
- '--configure-cloud-routes=false'
```

  8. Add get to verbs for serviceaccounts resources in clusterRoleRules. This allows the cloud controller manager to get service accounts upon startup:

```yaml
- apiGroups:
    - ''
  resources:
    - serviceaccounts
  verbs:
    - create
    - get
```

  9. Rancher-provisioned RKE2 nodes are tainted with node-role.kubernetes.io/control-plane. Update the tolerations and the nodeSelector:

```yaml
tolerations:
  - effect: NoSchedule
    key: node.cloudprovider.kubernetes.io/uninitialized
    value: 'true'
  - effect: NoSchedule
    value: 'true'
    key: node-role.kubernetes.io/control-plane
```

```yaml
nodeSelector:
  node-role.kubernetes.io/control-plane: 'true'
```

Note:

There’s currently a known issue where nodeSelector can’t be updated from the Rancher UI. Continue installing the chart and then edit the Daemonset manually to set the nodeSelector:

```yaml
nodeSelector:
  node-role.kubernetes.io/control-plane: 'true'
```

  10. Install the chart and confirm that the Daemonset aws-cloud-controller-manager is running. Verify that aws-cloud-controller-manager pods are running in the target namespace (kube-system unless modified in step 6); see the commands below.
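
A quick command-line check, assuming the chart was installed into kube-system:

```bash
# The DaemonSet and its pods should become ready once the chart is installed.
kubectl -n kube-system rollout status daemonset aws-cloud-controller-manager
kubectl -n kube-system get pods | grep aws-cloud-controller-manager
```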

  1. Click ☰, then select the name of the cluster from the left navigation.

  2. Select Apps > Repositories.

  3. Click the Create button.

  4. Enter https://kubernetes.github.io/cloud-provider-aws in the Index URL field.

  5. Select Apps > Charts from the left navigation and install aws-cloud-controller-manager.

  6. Select the kube-system namespace and enable Customize Helm options before install.

  7. Add the following container arguments:

```yaml
- '--use-service-account-credentials=true'
- '--configure-cloud-routes=false'
```

  8. Add get to verbs for serviceaccounts resources in clusterRoleRules. This allows the cloud controller manager to get service accounts upon startup:

```yaml
- apiGroups:
    - ''
  resources:
    - serviceaccounts
  verbs:
    - create
    - get
```

  9. Rancher-provisioned RKE nodes are tainted with node-role.kubernetes.io/controlplane. Update the tolerations and the nodeSelector:

```yaml
tolerations:
  - effect: NoSchedule
    key: node.cloudprovider.kubernetes.io/uninitialized
    value: 'true'
  - effect: NoSchedule
    value: 'true'
    key: node-role.kubernetes.io/controlplane
```

```yaml
nodeSelector:
  node-role.kubernetes.io/controlplane: 'true'
```

Note:

There's currently a known issue where nodeSelector can't be updated from the Rancher UI. Continue installing the chart and then edit the Daemonset manually to set the nodeSelector:

```yaml
nodeSelector:
  node-role.kubernetes.io/controlplane: 'true'
```

  10. Install the chart and confirm that the Daemonset aws-cloud-controller-manager deploys successfully:

```bash
kubectl rollout status daemonset -n kube-system aws-cloud-controller-manager
```