Configuring the cluster auto-scaler in AWS

You can configure an auto-scaler on your OKD cluster in Amazon Web Services (AWS) to provide elasticity for your application workload. The auto-scaler ensures that enough nodes are active to run your pods and that the number of active nodes is proportional to current demand.

You can run the auto-scaler only on AWS.

About the OKD auto-scaler

The auto-scaler in OKD repeatedly checks to see how many pods are pending node allocation. If pods are pending allocation and the auto-scaler has not met its maximum capacity, then new nodes are continuously provisioned to accommodate the current demand. When demand drops and fewer nodes are required, the auto-scaler removes unused nodes. After you install the auto-scaler, its behavior is automatic. You only need to add the desired number of replicas to the deployment.
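For example, assuming a hypothetical workload that is managed by a deployment configuration named myapp, you only scale the workload itself; the auto-scaler provisions additional nodes if the new pods cannot be scheduled on the existing ones:

    $ oc scale dc/myapp --replicas=10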

In OKD version 3.11, you can deploy the auto-scaler only on Amazon Web Services (AWS). The auto-scaler uses some standard AWS objects to manage your cluster size, including Auto Scaling groups and Launch Configurations.

The auto-scaler uses the following assets:

Auto Scaling groups

An Auto Scaling group is a logical representation of a set of machines. You configure an Auto Scaling group with a minimum number of instances to run, the maximum number of instances that can run, and your desired number of instances to run. An Auto Scaling group starts by launching enough instances to meet your desired capacity. You can configure an Auto Scaling group to start with zero instances.
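As an illustration only, the following AWS CLI call adjusts those three settings on an existing group; mycluster-ASG is the example group name that is used later in this topic:

    $ aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name mycluster-ASG \
        --min-size 0 \
        --max-size 6 \
        --desired-capacity 0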

Launch Configurations

A Launch Configuration is a template that an Auto Scaling group uses to launch instances. When you create a Launch Configuration, you specify information such as:

  • The ID of the Amazon Machine Image (AMI) to use as the base image

  • The instance type, such as m4.large

  • A key pair

  • One or more security groups

  • The subnets to apply the Launch Configuration to

OKD primed images

When the Auto Scaling group provisions a new instance, the image that it launches must have OKD already prepared. The Auto Scaling group uses this image to both automatically bootstrap the node and enroll it within the cluster without any manual intervention.

Creating a primed image

You can use Ansible playbooks to automatically create a primed image for the auto-scaler to use. You must provide attributes from your existing Amazon Web Services (AWS) cluster.

If you already have a primed image, you can use it instead of creating a new one.

Procedure

On the host that you used to create your OKD cluster, create a primed image:

  1. Create a new Ansible inventory file on your local host:

    [OSEv3:children]
    masters
    nodes
    etcd

    [OSEv3:vars]
    openshift_deployment_type=origin
    ansible_ssh_user=ec2-user
    openshift_clusterid=mycluster
    ansible_become=yes

    [masters]
    [etcd]
    [nodes]
  2. Create the build-ami-provisioning-vars.yaml provisioning file on your local host:

    openshift_deployment_type: origin
    openshift_aws_clusterid: mycluster (1)
    openshift_aws_region: us-east-1 (2)
    openshift_aws_create_vpc: false (3)
    openshift_aws_vpc_name: production (4)
    openshift_aws_subnet_az: us-east-1d (5)
    openshift_aws_create_security_groups: false (6)
    openshift_aws_ssh_key_name: production-ssh-key (7)
    openshift_aws_base_ami: ami-12345678 (8)
    openshift_aws_create_s3: False (9)
    openshift_aws_build_ami_group: default (10)
    openshift_aws_vpc: (11)
      name: "{{ openshift_aws_vpc_name }}"
      cidr: 172.18.0.0/16
      subnets:
        us-east-1:
        - cidr: 172.18.0.0/20
          az: "us-east-1d"
    container_runtime_docker_storage_type: overlay2 (12)
    container_runtime_docker_storage_setup_device: /dev/xvdb (13)
    (1) Provide the name of the existing cluster.
    (2) Provide the region the existing cluster is currently running in.
    (3) Specify False to disable the creation of a VPC.
    (4) Provide the existing VPC name that the cluster is running in.
    (5) Provide the name of a subnet the existing cluster is running in.
    (6) Specify False to disable the creation of security groups.
    (7) Provide the AWS key name to use for SSH access.
    (8) Provide the AMI image ID to use as the base image for the primed image. See Red Hat® Cloud Access.
    (9) Specify False to disable the creation of an S3 bucket.
    (10) Provide the security group name.
    (11) Provide the VPC subnets the existing cluster is running in.
    (12) Specify overlay2 as the Docker storage type.
    (13) Specify the mount point for LVM and the /var/lib/docker directory.
  3. Run the build_ami.yml playbook to generate a primed image:

    # ansible-playbook -i </path/to/inventory/file> \
        ~/openshift-ansible/playbooks/aws/openshift-cluster/build_ami.yml \
        -e @build-ami-provisioning-vars.yaml

    After the playbook runs, you see a new image ID, or AMI, in its output. You specify the AMI that it generated when you create the Launch Configuration.
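    If you miss the AMI ID in the playbook output, one way to recover it is to list the most recently created image that your account owns. This query is only a sketch and assumes the us-east-1 region used in the example variables:

    $ aws ec2 describe-images --owners self --region us-east-1 \
        --query 'sort_by(Images, &CreationDate)[-1].[ImageId,Name]' --output text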

Creating the launch configuration and Auto Scaling group

Before you deploy the cluster auto-scaler, you must create an Amazon Web Services (AWS) launch configuration and Auto Scaling group that reference a primed image. You must configure the launch configuration so that the new node automatically joins the existing cluster when it starts.

Prerequisites

  • Install an OKD cluster in AWS.

  • Create a primed image.

  • If you deployed the EFK stack in your cluster, set the node label to logging-infra-fluentd=true, as shown in the example after this list.
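A minimal sketch of setting that label, assuming a hypothetical node name:

    $ oc label node <node_name> logging-infra-fluentd=true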

Procedure

  1. Create the bootstrap.kubeconfig file by generating it from a master node:

    $ ssh master "sudo oc serviceaccounts create-kubeconfig -n openshift-infra node-bootstrapper" > ~/bootstrap.kubeconfig
  2. Create the user-data.txt cloud-init file from the bootstrap.kubeconfig file:

    $ cat <<EOF > user-data.txt
    #cloud-config
    write_files:
    - path: /root/openshift_bootstrap/openshift_settings.yaml
      owner: 'root:root'
      permissions: '0640'
      content: |
        openshift_node_config_name: node-config-compute
    - path: /etc/origin/node/bootstrap.kubeconfig
      owner: 'root:root'
      permissions: '0640'
      encoding: b64
      content: |
        $(base64 ~/bootstrap.kubeconfig | sed '2,$s/^/    /')
    runcmd:
    - [ ansible-playbook, /root/openshift_bootstrap/bootstrap.yml]
    - [ systemctl, restart, systemd-hostnamed]
    - [ systemctl, restart, NetworkManager]
    - [ systemctl, enable, origin-node]
    - [ systemctl, start, origin-node]
    EOF
  3. Upload a launch configuration template to an AWS S3 bucket.

  4. Create the launch configuration by using the AWS CLI:

    $ aws autoscaling create-launch-configuration \
        --launch-configuration-name mycluster-LC \ (1)
        --region us-east-1 \ (2)
        --image-id ami-987654321 \ (3)
        --instance-type m4.large \ (4)
        --security-groups sg-12345678 \ (5)
        --template-url https://s3-.amazonaws.com/.../yourtemplate.json \ (6)
        --key-name production-key \ (7)
    (1) Specify a launch configuration name.
    (2) Specify the region to launch the image in.
    (3) Specify the primed image AMI that you created.
    (4) Specify the type of instance to launch.
    (5) Specify the security groups to attach to the launched image.
    (6) Specify the launch configuration template that you uploaded.
    (7) Specify the SSH key-pair name.
    If your template is less than 16 KB before you encode it, you can provide it to the AWS CLI by substituting --template-url with --user-data.
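    For example, a sketch of that alternative form, assuming the user-data.txt file that you created earlier in this procedure (depending on your AWS CLI version, you might need to base64-encode the file first):

    $ aws autoscaling create-launch-configuration \
        --launch-configuration-name mycluster-LC \
        --region us-east-1 \
        --image-id ami-987654321 \
        --instance-type m4.large \
        --security-groups sg-12345678 \
        --user-data file://user-data.txt \
        --key-name production-key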
  5. Create the Auto Scaling group by using the AWS CLI:

    $ aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name mycluster-ASG \ (1)
        --launch-configuration-name mycluster-LC \ (2)
        --min-size 0 \ (3)
        --max-size 6 \ (4)
        --vpc-zone-identifier subnet-12345678 \ (5)
        --tags ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=Name,Value=mycluster-ASG-node,PropagateAtLaunch=true ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/mycluster,Value=true,PropagateAtLaunch=true ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/node-role.kubernetes.io/compute,Value=true,PropagateAtLaunch=true (6)
    (1) Specify the name of the Auto Scaling group, which you use when you deploy the auto-scaler.
    (2) Specify the name of the Launch Configuration that you created.
    (3) Specify the minimum number of nodes that the auto-scaler maintains.
    (4) Specify the maximum number of nodes that the scale group can expand to.
    (5) Specify the VPC subnet ID, which is the same subnet that the cluster uses.
    (6) Specify this string to ensure that the Auto Scaling group tags are propagated to the nodes when they launch.
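    Optionally, you can confirm that the group exists and that the tags are attached, for example:

    $ aws autoscaling describe-auto-scaling-groups \
        --auto-scaling-group-names mycluster-ASG \
        --region us-east-1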

Deploying the auto-scaler components on your cluster

After you create the Launch Configuration and Auto Scaling group, you can deploy the auto-scaler components onto the cluster.

Prerequisites

  • Install an OKD cluster in AWS.

  • Create a primed image.

  • Create a Launch Configuration and Auto Scaling group that reference the primed image.

Procedure

To deploy the auto-scaler:

  1. Update your cluster to run the auto-scaler:

    1. Add the following parameter to the inventory file that you used to create the cluster, by default /etc/ansible/hosts:

      openshift_master_bootstrap_auto_approve=true
    2. To obtain the auto-scaler components, change to the playbook directory and run the playbook again:

      $ cd /usr/share/ansible/openshift-ansible
      $ ansible-playbook -i </path/to/inventory/file> \
          playbooks/deploy_cluster.yml
    3. Confirm that the bootstrap-autoapprover pod is running:

      $ oc get pods --all-namespaces | grep bootstrap-autoapprover
      NAMESPACE         NAME                       READY     STATUS    RESTARTS   AGE
      openshift-infra   bootstrap-autoapprover-0   1/1       Running   0
  2. Create a namespace for the auto-scaler:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: cluster-autoscaler
      annotations:
        openshift.io/node-selector: ""
    EOF
  3. Create a service account for the auto-scaler:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        k8s-addon: cluster-autoscaler.addons.k8s.io
        k8s-app: cluster-autoscaler
      name: cluster-autoscaler
      namespace: cluster-autoscaler
    EOF
  4. Create a cluster role to grant the required permissions to the service account:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: v1
    kind: ClusterRole
    metadata:
      name: cluster-autoscaler
    rules:
    - apiGroups: (1)
      - ""
      resources:
      - pods/eviction
      verbs:
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - persistentvolumeclaims
      - persistentvolumes
      - pods
      - replicationcontrollers
      - services
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - events
      verbs:
      - get
      - list
      - watch
      - patch
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - nodes
      verbs:
      - get
      - list
      - watch
      - patch
      - update
      attributeRestrictions: null
    - apiGroups:
      - extensions
      - apps
      resources:
      - daemonsets
      - replicasets
      - statefulsets
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    - apiGroups:
      - policy
      resources:
      - poddisruptionbudgets
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    EOF
    (1) If the cluster-autoscaler object already exists, ensure that the pods/eviction rule is present with the verb create.
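    One way to check for that rule, assuming the grep utility is available on your host:

    $ oc get clusterrole cluster-autoscaler -o yaml | grep -A 5 'pods/eviction'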
  5. Create a role for the deployment auto-scaler:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: v1
    kind: Role
    metadata:
      name: cluster-autoscaler
    rules:
    - apiGroups:
      - ""
      resources:
      - configmaps
      resourceNames:
      - cluster-autoscaler
      - cluster-autoscaler-status
      verbs:
      - create
      - get
      - patch
      - update
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - configmaps
      verbs:
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - events
      verbs:
      - create
      attributeRestrictions: null
    EOF
  6. Create a creds file to store AWS credentials for the auto-scaler:

    $ cat <<EOF > creds
    [default]
    aws_access_key_id = your-aws-access-key-id
    aws_secret_access_key = your-aws-secret-access-key
    EOF

    The auto-scaler uses these credentials to launch new instances.

  7. Create a secret that contains the AWS credentials:

    $ oc create secret -n cluster-autoscaler generic autoscaler-credentials --from-file=creds

    The auto-scaler uses this secret to launch instances within AWS.
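    To verify that the secret contains the creds key before you continue, you can run:

    $ oc describe secret autoscaler-credentials -n cluster-autoscaler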

  8. Grant the roles that you created, and the cluster-reader role, to the cluster-autoscaler service account:

    $ oc adm policy add-cluster-role-to-user cluster-autoscaler system:serviceaccount:cluster-autoscaler:cluster-autoscaler -n cluster-autoscaler
    $ oc adm policy add-role-to-user cluster-autoscaler system:serviceaccount:cluster-autoscaler:cluster-autoscaler --role-namespace cluster-autoscaler -n cluster-autoscaler
    $ oc adm policy add-cluster-role-to-user cluster-reader system:serviceaccount:cluster-autoscaler:cluster-autoscaler -n cluster-autoscaler
  9. Deploy the cluster auto-scaler:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: cluster-autoscaler
      name: cluster-autoscaler
      namespace: cluster-autoscaler
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: cluster-autoscaler
          role: infra
      template:
        metadata:
          labels:
            app: cluster-autoscaler
            role: infra
        spec:
          containers:
          - args:
            - /bin/cluster-autoscaler
            - --alsologtostderr
            - --v=4
            - --skip-nodes-with-local-storage=False
            - --leader-elect-resource-lock=configmaps
            - --namespace=cluster-autoscaler
            - --cloud-provider=aws
            - --nodes=0:6:mycluster-ASG
            env:
            - name: AWS_REGION
              value: us-east-1
            - name: AWS_SHARED_CREDENTIALS_FILE
              value: /var/run/secrets/aws-creds/creds
            image: docker.io/openshift/origin-cluster-autoscaler:v3.11.0
            name: autoscaler
            volumeMounts:
            - mountPath: /var/run/secrets/aws-creds
              name: aws-creds
              readOnly: true
          dnsPolicy: ClusterFirst
          nodeSelector:
            node-role.kubernetes.io/infra: "true"
          serviceAccountName: cluster-autoscaler
          terminationGracePeriodSeconds: 30
          volumes:
          - name: aws-creds
            secret:
              defaultMode: 420
              secretName: autoscaler-credentials
    EOF
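
    After the deployment is created, you can confirm that an auto-scaler pod reaches the Running state; the pod name suffix differs in each cluster:

    $ oc get pods -n cluster-autoscaler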

Testing the auto-scaler

After you add the auto-scaler to your Amazon Web Services (AWS) cluster, you can confirm that the auto-scaler works by deploying more pods than the current nodes can run.

Prerequisites

  • You added the auto-scaler to your OKD cluster that runs on AWS.

Procedure

  1. Create the scale-up.yaml file that contains the deployment configuration to test auto-scaling:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: scale-up
      labels:
        app: scale-up
    spec:
      replicas: 20 (1)
      selector:
        matchLabels:
          app: scale-up
      template:
        metadata:
          labels:
            app: scale-up
        spec:
          containers:
          - name: origin-base
            image: openshift/origin-base
            resources:
              requests:
                memory: 2Gi
            command:
            - /bin/sh
            - "-c"
            - "echo 'this should be in the logs' && sleep 86400"
          terminationGracePeriodSeconds: 0
    (1) This deployment specifies 20 replicas, but the initial size of the cluster cannot run all of the pods without first increasing the number of compute nodes.
  2. Create a namespace for the deployment:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: autoscaler-demo
    EOF
  3. Deploy the configuration:

    $ oc apply -n autoscaler-demo -f scale-up.yaml
  4. View the pods in your namespace:

    1. View the running pods in your namespace:

      $ oc get pods -n autoscaler-demo | grep Running
      cluster-autoscaler-5485644d46-ggvn5   1/1       Running   0          1d
      scale-up-79684ff956-45sbg             1/1       Running   0          31s
      scale-up-79684ff956-4kzjv             1/1       Running   0          31s
      scale-up-79684ff956-859d2             1/1       Running   0          31s
      scale-up-79684ff956-h47gv             1/1       Running   0          31s
      scale-up-79684ff956-htjth             1/1       Running   0          31s
      scale-up-79684ff956-m996k             1/1       Running   0          31s
      scale-up-79684ff956-pvvrm             1/1       Running   0          31s
      scale-up-79684ff956-qs9pp             1/1       Running   0          31s
      scale-up-79684ff956-zwdpr             1/1       Running   0          31s
    2. View the pending pods in your namespace:

      $ oc get pods -n autoscaler-demo | grep Pending
      scale-up-79684ff956-5jdnj             0/1       Pending   0          40s
      scale-up-79684ff956-794d6             0/1       Pending   0          40s
      scale-up-79684ff956-7rlm2             0/1       Pending   0          40s
      scale-up-79684ff956-9m2jc             0/1       Pending   0          40s
      scale-up-79684ff956-9m5fn             0/1       Pending   0          40s
      scale-up-79684ff956-fr62m             0/1       Pending   0          40s
      scale-up-79684ff956-q255w             0/1       Pending   0          40s
      scale-up-79684ff956-qc2cn             0/1       Pending   0          40s
      scale-up-79684ff956-qjn7z             0/1       Pending   0          40s
      scale-up-79684ff956-tdmqt             0/1       Pending   0          40s
      scale-up-79684ff956-xnjhw             0/1       Pending   0          40s

      These pending pods cannot run until the cluster auto-scaler automatically provisions new compute nodes to run the pods on. It can take several minutes for the nodes to reach the Ready state in the cluster.
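      While you wait, you can optionally follow the auto-scaler's own logs as the new nodes are provisioned:

        $ oc logs -n cluster-autoscaler deployment/cluster-autoscaler --tail=20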

  5. After several minutes, check the list of nodes to see if new nodes are ready:

    $ oc get nodes
    NAME                            STATUS    ROLES     AGE       VERSION
    ip-172-31-49-172.ec2.internal   Ready     infra     1d        v1.11.0+d4cacc0
    ip-172-31-53-217.ec2.internal   Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-55-89.ec2.internal    Ready     compute   9h        v1.11.0+d4cacc0
    ip-172-31-56-21.ec2.internal    Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-56-71.ec2.internal    Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-63-234.ec2.internal   Ready     master    1d        v1.11.0+d4cacc0
  6. When more nodes are ready, view the running pods in your namespace again:

    $ oc get pods -n autoscaler-demo
    NAME                                  READY     STATUS    RESTARTS   AGE
    cluster-autoscaler-5485644d46-ggvn5   1/1       Running   0          1d
    scale-up-79684ff956-45sbg             1/1       Running   0          8m
    scale-up-79684ff956-4kzjv             1/1       Running   0          8m
    scale-up-79684ff956-5jdnj             1/1       Running   0          8m
    scale-up-79684ff956-794d6             1/1       Running   0          8m
    scale-up-79684ff956-7rlm2             1/1       Running   0          8m
    scale-up-79684ff956-859d2             1/1       Running   0          8m
    scale-up-79684ff956-9m2jc             1/1       Running   0          8m
    scale-up-79684ff956-9m5fn             1/1       Running   0          8m
    scale-up-79684ff956-fr62m             1/1       Running   0          8m
    scale-up-79684ff956-h47gv             1/1       Running   0          8m
    scale-up-79684ff956-htjth             1/1       Running   0          8m
    scale-up-79684ff956-m996k             1/1       Running   0          8m
    scale-up-79684ff956-pvvrm             1/1       Running   0          8m
    scale-up-79684ff956-q255w             1/1       Running   0          8m
    scale-up-79684ff956-qc2cn             1/1       Running   0          8m
    scale-up-79684ff956-qjn7z             1/1       Running   0          8m
    scale-up-79684ff956-qs9pp             1/1       Running   0          8m
    scale-up-79684ff956-tdmqt             1/1       Running   0          8m
    scale-up-79684ff956-xnjhw             1/1       Running   0          8m
    scale-up-79684ff956-zwdpr             1/1       Running   0          8m
    ...