Setting up the Azure Cloud Provider

Important:

In Kubernetes 1.30 and later, you must use an out-of-tree Azure cloud provider. The in-tree Azure cloud provider has been removed completely and won't work after an upgrade to Kubernetes 1.30. The steps listed below are still required to set up an Azure cloud provider. You can set up an out-of-tree cloud provider after completing the prerequisites for Azure.

You can also migrate from an in-tree to an out-of-tree Azure cloud provider on Kubernetes 1.29 and earlier. All existing clusters must migrate prior to upgrading to v1.30 in order to stay functional.

Starting with Kubernetes 1.29, in-tree cloud providers have been disabled. You must disable the DisableCloudProviders and DisableKubeletCloudCredentialProvider feature gates to use the in-tree Azure cloud provider. You can do this by setting feature-gates=DisableCloudProviders=false as an additional argument for the cluster's Kubelet, Controller Manager, and API Server in the advanced cluster configuration. Additionally, set DisableKubeletCloudCredentialProvider=false in the Kubelet's feature gates to enable in-tree functionality for authenticating to Azure container registries for image pull credentials. See the upstream docs for more details.
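
For example, on an RKE2 cluster the equivalent settings in the cluster's config.yaml look roughly like the following sketch; in the Rancher UI these correspond to the additional Kubelet, Controller Manager, and API Server argument fields:

  # Sketch only: re-enable the in-tree cloud provider feature gates on RKE2
  kube-apiserver-arg:
    - feature-gates=DisableCloudProviders=false
  kube-controller-manager-arg:
    - feature-gates=DisableCloudProviders=false
  kubelet-arg:
    - feature-gates=DisableCloudProviders=false,DisableKubeletCloudCredentialProvider=false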

Starting with Kubernetes version 1.26, in-tree persistent volume types kubernetes.io/azure-disk and kubernetes.io/azure-file are deprecated and will no longer be supported. For new clusters, install the CSI drivers, or migrate to the corresponding CSI drivers disk.csi.azure.com and file.csi.azure.com by following the upstream migration documentation.

When using the Azure cloud provider, you can leverage the following capabilities:

  • Load Balancers: Launches an Azure Load Balancer within a specific Network Security Group.

  • Persistent Volumes: Supports using Azure Blob disks and Azure Managed Disks with standard and premium storage accounts.

  • Network Storage: Supports Azure Files via CIFS mounts.

The following account types are not supported for Azure Subscriptions:

  • Single tenant accounts (i.e. accounts with no subscriptions).
  • Multi-subscription accounts.

Prerequisites for RKE and RKE2

To set up the Azure cloud provider for both RKE and RKE2, the following credentials need to be configured:

  1. Set up the Azure Tenant ID
  2. Set up the Azure Client ID and Azure Client Secret
  3. Configure App Registration Permissions
  4. Set up Azure Network Security Group Name

1. Set up the Azure Tenant ID

Visit the Azure portal, log in, and go to Azure Active Directory and select Properties. Your Directory ID is your Tenant ID (tenantId).

If you want to use the Azure CLI, you can run the command az account show to get the information.
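
For example, to print just the tenant ID of the currently selected subscription (assuming you are already logged in with az login):

  az account show --query tenantId --output tsv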

2. Set up the Azure Client ID and Azure Client Secret

Visit the Azure portal, log in, and follow the steps below to create an App Registration and the corresponding Azure Client ID (aadClientId) and Azure Client Secret (aadClientSecret).

  1. Select Azure Active Directory.
  2. Select App registrations.
  3. Select New application registration.
  4. Choose a Name, select Web app / API as the Application Type, and enter a Sign-on URL, which can be any value in this case.
  5. Select Create.

In the App registrations view, you should see the App registration you created. The value shown in the APPLICATION ID column is what you need to use as the Azure Client ID.

The next step is to generate the Azure Client Secret:

  1. Open your created App registration.
  2. In the Settings view, open Keys.
  3. Enter a Key description, select an expiration time and select Save.
  4. The generated value shown in the Value column is what you need to use as the Azure Client Secret. This value is only shown once.

3. Configure App Registration Permissions

The last step is to assign the appropriate permissions to your App registration.

  1. Go to More services, search for Subscriptions and open it.
  2. Open Access control (IAM).
  3. Select Add.
  4. For Role, select Contributor.
  5. For Select, select your created App registration name.
  6. Select Save.
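
As an alternative to the portal steps above, the Azure CLI can create the App registration, client secret, and Contributor role assignment in one command. This is a sketch only: the display name below is just an example, and the output fields appId, password, and tenant correspond to aadClientId, aadClientSecret, and tenantId.

  # Create a service principal with Contributor access on the subscription
  az ad sp create-for-rbac \
    --name rancher-cloud-provider \
    --role Contributor \
    --scopes /subscriptions/<subscription-id>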

4. Set up Azure Network Security Group Name

A custom Azure Network Security Group (securityGroupName) is needed to allow Azure Load Balancers to work.

If you provision hosts using the Rancher Machine Azure driver, you need to edit them manually to assign them to this Network Security Group.

Custom hosts should be assigned to this Network Security Group during provisioning.

Only hosts expected to be load balancer back ends need to be in this group.
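
If you have not created a Network Security Group yet, one way to do so is with the Azure CLI (a sketch; the resource group, name, and location values are only examples):

  # Create the Network Security Group referenced by securityGroupName
  az network nsg create \
    --resource-group docker-machine \
    --name rancher-managed-nsg \
    --location westus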

RKE2 Cluster Set-up in Rancher

Important:

This section is valid only for creating clusters with the in-tree cloud provider.

  1. Choose “Azure” from the Cloud Provider drop-down in the Cluster Configuration section.

  2. Supply the Cloud Provider Configuration. Note that Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you must specify them before creating the cluster.

    • Click Show Advanced to view or edit these automatically generated names. Your Cloud Provider Configuration must match the fields in the Machine Pools section. If you have multiple pools, they must all use the same Resource Group, Availability Set, Subnet, Virtual Network, and Network Security Group.
    • An example is provided below. Modify it as needed.

    Example Cloud Provider Config

    {
      "cloud": "AzurePublicCloud",
      "tenantId": "YOUR TENANTID HERE",
      "aadClientId": "YOUR AADCLIENTID HERE",
      "aadClientSecret": "YOUR AADCLIENTSECRET HERE",
      "subscriptionId": "YOUR SUBSCRIPTIONID HERE",
      "resourceGroup": "docker-machine",
      "location": "westus",
      "subnetName": "docker-machine",
      "securityGroupName": "rancher-managed-KA4jV9V2",
      "securityGroupResourceGroup": "docker-machine",
      "vnetName": "docker-machine-vnet",
      "vnetResourceGroup": "docker-machine",
      "primaryAvailabilitySetName": "docker-machine",
      "routeTableResourceGroup": "docker-machine",
      "cloudProviderBackoff": false,
      "useManagedIdentityExtension": false,
      "useInstanceMetadata": true
    }

  3. Under the Cluster Configuration > Advanced section, click Add under Additional Controller Manager Args and add this flag: --configure-cloud-routes=false

  4. Click Create to submit the form and create the cluster.

Cloud Provider Configuration

Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you will need to specify them before creating the cluster. You can check RKE1 Node Templates or RKE2 Machine Pools to view or edit these automatically generated names.

Refer to the full list of configuration options in the upstream docs.

Note:

  1. useInstanceMetadata must be set to true for the cloud provider to correctly configure providerID.
  2. excludeMasterFromStandardLB must be set to false if you need to add nodes labeled node-role.kubernetes.io/master to the backend of the Azure Load Balancer (ALB).
  3. loadBalancerSku can be set to basic or standard. Basic SKU will be deprecated in September 2025. Refer to the Azure upstream docs for more information.

Azure supports reading the cloud config from a Kubernetes secret. The secret is a serialized version of the azure.json file. When the secret changes, the cloud controller manager reloads its configuration without restarting the pod. It is recommended that the Helm chart read the Cloud Provider Config from this secret.

Note that the chart reads the Cloud Provider Config from a given secret name in the kube-system namespace. Since Azure reads Kubernetes secrets, RBAC also needs to be configured. An example secret for the Cloud Provider Config is shown below. Modify it as needed and create the secret.

  # azure-cloud-config.yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: azure-cloud-config
    namespace: kube-system
  type: Opaque
  stringData:
    cloud-config: |-
      {
        "cloud": "AzurePublicCloud",
        "tenantId": "<tenant-id>",
        "subscriptionId": "<subscription-id>",
        "aadClientId": "<client-id>",
        "aadClientSecret": "<client-secret>",
        "resourceGroup": "docker-machine",
        "location": "westus",
        "subnetName": "docker-machine",
        "securityGroupName": "rancher-managed-kqmtsjgJ",
        "securityGroupResourceGroup": "docker-machine",
        "vnetName": "docker-machine-vnet",
        "vnetResourceGroup": "docker-machine",
        "primaryAvailabilitySetName": "docker-machine",
        "routeTableResourceGroup": "docker-machine",
        "cloudProviderBackoff": false,
        "useManagedIdentityExtension": false,
        "useInstanceMetadata": true,
        "loadBalancerSku": "standard",
        "excludeMasterFromStandardLB": false
      }
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    labels:
      kubernetes.io/cluster-service: "true"
    name: system:azure-cloud-provider-secret-getter
  rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["azure-cloud-config"]
    verbs:
    - get
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    labels:
      kubernetes.io/cluster-service: "true"
    name: system:azure-cloud-provider-secret-getter
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: system:azure-cloud-provider-secret-getter
  subjects:
  - kind: ServiceAccount
    name: azure-cloud-config
    namespace: kube-system
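
Save the manifest above as azure-cloud-config.yaml (as in the comment) and apply it to the cluster:

  kubectl apply -f azure-cloud-config.yaml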

Using the Out-of-tree Azure Cloud Provider

RKE2

  1. Select External from the Cloud Provider drop-down in the Cluster Configuration section.

  2. Prepare the Cloud Provider Configuration to set it in the next step. Note that Rancher automatically creates a new Network Security Group, Resource Group, Availability Set, Subnet, and Virtual Network. If you already have some or all of these created, you must specify them before creating the cluster.

     • Click Show Advanced to view or edit these automatically generated names. Your Cloud Provider Configuration must match the fields in the Machine Pools section. If you have multiple pools, they must all use the same Resource Group, Availability Set, Subnet, Virtual Network, and Network Security Group.

  3. Under Cluster Configuration > Advanced, click Add under Additional Controller Manager Args and add this flag: --configure-cloud-routes=false.

Note that the chart reads the Cloud Provider Config from the secret in the kube-system namespace. An example secret for the Cloud Provider Config is shown below. Modify it as needed. Refer to the full list of configuration options in the upstream docs.

  apiVersion: helm.cattle.io/v1
  kind: HelmChart
  metadata:
    name: azure-cloud-controller-manager
    namespace: kube-system
  spec:
    chart: cloud-provider-azure
    repo: https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo
    targetNamespace: kube-system
    bootstrap: true
    valuesContent: |-
      infra:
        clusterName: <cluster-name>
      cloudControllerManager:
        cloudConfigSecretName: azure-cloud-config
        cloudConfig: null
        clusterCIDR: null
        enableDynamicReloading: 'true'
        nodeSelector:
          node-role.kubernetes.io/control-plane: 'true'
        allocateNodeCidrs: 'false'
        hostNetworking: true
        caCertDir: /etc/ssl
        configureCloudRoutes: 'false'
        enabled: true
        tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/master
          - effect: NoSchedule
            key: node-role.kubernetes.io/control-plane
            value: 'true'
          - effect: NoSchedule
            key: node.cloudprovider.kubernetes.io/uninitialized
            value: 'true'
  ---
  apiVersion: v1
  kind: Secret
  metadata:
    name: azure-cloud-config
    namespace: kube-system
  type: Opaque
  stringData:
    cloud-config: |-
      {
        "cloud": "AzurePublicCloud",
        "tenantId": "<tenant-id>",
        "subscriptionId": "<subscription-id>",
        "aadClientId": "<client-id>",
        "aadClientSecret": "<client-secret>",
        "resourceGroup": "docker-machine",
        "location": "westus",
        "subnetName": "docker-machine",
        "securityGroupName": "rancher-managed-kqmtsjgJ",
        "securityGroupResourceGroup": "docker-machine",
        "vnetName": "docker-machine-vnet",
        "vnetResourceGroup": "docker-machine",
        "primaryAvailabilitySetName": "docker-machine",
        "routeTableResourceGroup": "docker-machine",
        "cloudProviderBackoff": false,
        "useManagedIdentityExtension": false,
        "useInstanceMetadata": true,
        "loadBalancerSku": "standard",
        "excludeMasterFromStandardLB": false
      }
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    labels:
      kubernetes.io/cluster-service: "true"
    name: system:azure-cloud-provider-secret-getter
  rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["azure-cloud-config"]
    verbs:
    - get
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    labels:
      kubernetes.io/cluster-service: "true"
    name: system:azure-cloud-provider-secret-getter
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: system:azure-cloud-provider-secret-getter
  subjects:
  - kind: ServiceAccount
    name: azure-cloud-config
    namespace: kube-system

  4. Click Create to submit the form and create the cluster.

RKE1

  1. Choose External from the Cloud Provider drop-down in the Cluster Options section. This sets --cloud-provider=external for Kubernetes components.

  2. Install the cloud-provider-azure chart after the cluster finishes provisioning. Note that the cluster is not successfully provisioned and nodes remain in an uninitialized state until you deploy the cloud controller manager. This can be done manually using the CLI, or via Helm charts in the UI.

Refer to the official Azure upstream documentation for more details on deploying the Cloud Controller Manager.

Helm Chart Installation from CLI

Official upstream docs for Helm chart installation can be found on GitHub.

  1. Create an azure-cloud-config secret with the required cloud provider config:

     kubectl apply -f azure-cloud-config.yaml

  2. Add the Helm repository:

     helm repo add azure-cloud-controller-manager https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo
     helm repo update

  3. Create a values.yaml file with the following contents to override the default values.yaml. Use the RKE2 or RKE variant, as appropriate:

     RKE2:

     # values.yaml
     infra:
       clusterName: <cluster-name>
     cloudControllerManager:
       cloudConfigSecretName: azure-cloud-config
       cloudConfig: null
       clusterCIDR: null
       enableDynamicReloading: 'true'
       configureCloudRoutes: 'false'
       allocateNodeCidrs: 'false'
       caCertDir: /etc/ssl
       enabled: true
       replicas: 1
       hostNetworking: true
       nodeSelector:
         node-role.kubernetes.io/control-plane: 'true'
       tolerations:
         - effect: NoSchedule
           key: node-role.kubernetes.io/master
         - effect: NoSchedule
           key: node-role.kubernetes.io/control-plane
           value: 'true'
         - effect: NoSchedule
           key: node.cloudprovider.kubernetes.io/uninitialized
           value: 'true'

     RKE:

     # values.yaml
     cloudControllerManager:
       cloudConfigSecretName: azure-cloud-config
       cloudConfig: null
       clusterCIDR: null
       enableDynamicReloading: 'true'
       configureCloudRoutes: 'false'
       allocateNodeCidrs: 'false'
       caCertDir: /etc/ssl
       enabled: true
       replicas: 1
       hostNetworking: true
       nodeSelector:
         node-role.kubernetes.io/controlplane: 'true'
         node-role.kubernetes.io/control-plane: null
       tolerations:
         - effect: NoSchedule
           key: node-role.kubernetes.io/controlplane
           value: 'true'
         - effect: NoSchedule
           key: node.cloudprovider.kubernetes.io/uninitialized
           value: 'true'
     infra:
       clusterName: <cluster-name>

  4. Install the Helm chart:

     helm upgrade --install cloud-provider-azure azure-cloud-controller-manager/cloud-provider-azure -n kube-system --values values.yaml

     Verify that the Helm chart installed successfully:

     helm status cloud-provider-azure -n kube-system

  5. (Optional) Verify that the cloud controller manager update succeeded:

     kubectl rollout status deployment -n kube-system cloud-controller-manager
     kubectl rollout status daemonset -n kube-system cloud-node-manager

  6. The cloud provider is responsible for setting the ProviderID of the node. Check if all nodes are initialized with the ProviderID:

     kubectl describe nodes | grep "ProviderID"

Helm Chart Installation from UI

  1. Click ☰, then select the name of the cluster from the left navigation.

  2. Select Apps > Repositories.

  3. Click the Create button.

  4. Enter https://raw.githubusercontent.com/kubernetes-sigs/cloud-provider-azure/master/helm/repo in the Index URL field.

  5. Select Apps > Charts from the left navigation and install cloud-provider-azure chart.

  6. Select the kube-system namespace and enable Customize Helm options before install.

  7. Replace cloudConfig: /etc/kubernetes/azure.json with the following values to read from the Cloud Config Secret and enable dynamic reloading:

     cloudConfigSecretName: azure-cloud-config
     enableDynamicReloading: 'true'

  8. Update the following fields as required:

     allocateNodeCidrs: 'false'
     configureCloudRoutes: 'false'
     clusterCIDR: null

     RKE2

     Rancher-provisioned RKE2 nodes have the selector node-role.kubernetes.io/control-plane set to true. Update the nodeSelector:

     nodeSelector:
       node-role.kubernetes.io/control-plane: 'true'

     RKE

     Rancher-provisioned RKE nodes are tainted node-role.kubernetes.io/controlplane. Update the tolerations and the nodeSelector:

     tolerations:
       - effect: NoSchedule
         key: node.cloudprovider.kubernetes.io/uninitialized
         value: 'true'
       - effect: NoSchedule
         value: 'true'
         key: node-role.kubernetes.io/controlplane

     nodeSelector:
       node-role.kubernetes.io/controlplane: 'true'

  9. Install the chart and confirm that the cloud controller manager and cloud node manager deployed successfully:

     kubectl rollout status deployment -n kube-system cloud-controller-manager
     kubectl rollout status daemonset -n kube-system cloud-node-manager

  10. The cloud provider is responsible for setting the ProviderID of the node. Check if all nodes are initialized with the ProviderID:

      kubectl describe nodes | grep "ProviderID"

Installing CSI Drivers

Install the Azure Disk CSI driver or the Azure File CSI driver to access Azure Disk or Azure File volumes, respectively.

The steps to install the Azure Disk CSI driver are shown below. You can install the Azure File CSI driver in a similar manner by following the Helm installation documentation.

Important:

Clusters must be provisioned using Managed Disks to use Azure Disk. You can configure this when creating RKE1 Node Templates or RKE2 Machine Pools.

Official upstream docs for Helm chart installation can be found on GitHub.

  1. Add and update the Helm repository:

     helm repo add azuredisk-csi-driver https://raw.githubusercontent.com/kubernetes-sigs/azuredisk-csi-driver/master/charts
     helm repo update azuredisk-csi-driver

  2. Install the chart as shown below, updating the --version argument as needed. Refer to the full list of latest chart configurations in the upstream docs.

     helm install azuredisk-csi-driver azuredisk-csi-driver/azuredisk-csi-driver --namespace kube-system --version v1.30.1 --set controller.cloudConfigSecretName=azure-cloud-config --set controller.cloudConfigSecretNamespace=kube-system --set controller.runOnControlPlane=true

  3. (Optional) Verify that the azuredisk-csi-driver installation succeeded:

     kubectl --namespace=kube-system get pods --selector="app.kubernetes.io/name=azuredisk-csi-driver" --watch

  4. Provision an example Storage Class:

     cat <<EOF | kubectl create -f -
     kind: StorageClass
     apiVersion: storage.k8s.io/v1
     metadata:
       name: standard
     provisioner: kubernetes.io/azure-disk
     parameters:
       storageaccounttype: Standard_LRS
       kind: Managed
     EOF
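
With CSI migration enabled, the in-tree provisioner name above is translated to the Azure Disk CSI driver. If you prefer to reference the CSI driver directly, an equivalent StorageClass might look like the following sketch, based on the upstream azuredisk-csi-driver examples (the class name and skuName parameter are illustrative):

     cat <<EOF | kubectl create -f -
     kind: StorageClass
     apiVersion: storage.k8s.io/v1
     metadata:
       name: standard-csi
     provisioner: disk.csi.azure.com
     parameters:
       skuName: Standard_LRS
     reclaimPolicy: Delete
     volumeBindingMode: WaitForFirstConsumer
     EOF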

Verify that the storage class has been provisioned:

     kubectl get storageclasses

  5. Create a PersistentVolumeClaim:

     cat <<EOF | kubectl create -f -
     kind: PersistentVolumeClaim
     apiVersion: v1
     metadata:
       name: azure-disk-pvc
     spec:
       storageClassName: standard
       accessModes:
         - ReadWriteOnce
       resources:
         requests:
           storage: 5Gi
     EOF

Verify that the PersistentVolumeClaim and PersistentVolume have been created:

     kubectl get persistentvolumeclaim
     kubectl get persistentvolume

  6. Attach the new Azure Disk:

You can now mount the Kubernetes PersistentVolume into a Kubernetes Pod. The disk can be consumed by any Kubernetes object type, including a Deployment, DaemonSet, or StatefulSet. However, the following example simply mounts the PersistentVolume into a standalone Pod.

     cat <<EOF | kubectl create -f -
     kind: Pod
     apiVersion: v1
     metadata:
       name: mypod-dynamic-azuredisk
     spec:
       containers:
         - name: mypod
           image: nginx
           ports:
             - containerPort: 80
               name: "http-server"
           volumeMounts:
             - mountPath: "/usr/share/nginx/html"
               name: storage
       volumes:
         - name: storage
           persistentVolumeClaim:
             claimName: azure-disk-pvc
     EOF
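
To confirm that the volume attached and mounted successfully, inspect the Pod and its events (assuming the Pod name from the example above):

     kubectl get pod mypod-dynamic-azuredisk
     kubectl describe pod mypod-dynamic-azuredisk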