Configuring the cluster auto-scaler in AWS

You can configure an auto-scaler on your OKD cluster in Amazon Web Services (AWS) to provide elasticity for your application workload. The auto-scaler ensures that enough nodes are active to run your pods and that the number of active nodes is proportional to current demand.

You can run the auto-scaler only on AWS.

About the OKD auto-scaler

The auto-scaler in OKD repeatedly checks to see how many pods are pending node allocation. If pods are pending allocation and the auto-scaler has not met its maximum capacity, then new nodes are continuously provisioned to accommodate the current demand. When demand drops and fewer nodes are required, the auto-scaler removes unused nodes. After you install the auto-scaler, its behavior is automatic. You only need to add the desired number of replicas to the deployment.
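For example, assuming a hypothetical workload that is managed by a deployment configuration named myapp, you only scale the workload itself; the auto-scaler provisions additional nodes if the new pods cannot be scheduled on the existing ones:

    $ oc scale dc/myapp --replicas=10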

In OKD version 3.11, you can deploy the auto-scaler only on Amazon Web Services (AWS). The auto-scaler uses some standard AWS objects to manage your cluster size, including Auto Scaling groups and Launch Configurations.

The auto-scaler uses the following assets:

Auto Scaling groups

An Auto Scaling group is a logical representation of a set of machines. You configure an Auto Scaling group with a minimum number of instances to run, the maximum number of instances that can run, and your desired number of instances to run. An Auto Scaling group starts by launching enough instances to meet your desired capacity. You can configure an Auto Scaling group to start with zero instances.
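As an illustration only, the following AWS CLI call adjusts those three settings on an existing group; mycluster-ASG is the example group name that is used later in this topic:

    $ aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name mycluster-ASG \
        --min-size 0 \
        --max-size 6 \
        --desired-capacity 0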

Launch Configurations

A Launch Configuration is a template that an Auto Scaling group uses to launch instances. When you create a Launch Configuration, you specify information such as:

  • The ID of the Amazon Machine Image (AMI) to use as the base image

  • The instance type, such as m4.large

  • A key pair

  • One or more security groups

  • The subnets to apply the Launch Configuration to

OKD primed images

When the Auto Scaling group provisions a new instance, the image that it launches must have OKD already prepared. The Auto Scaling group uses this image to both automatically bootstrap the node and enroll it within the cluster without any manual intervention.

Creating a primed image

You can use Ansible playbooks to automatically create a primed image for the auto-scaler to use. You must provide attributes from your existing Amazon Web Services (AWS) cluster.

If you already have a primed image, you can use it instead of creating a new one.

Procedure

On the host that you used to create your OKD cluster, create a primed image:

  1. Create a new Ansible inventory file on your local host:

    [OSEv3:children]
    masters
    nodes
    etcd

    [OSEv3:vars]
    openshift_deployment_type=origin
    ansible_ssh_user=ec2-user
    openshift_clusterid=mycluster
    ansible_become=yes

    [masters]
    [etcd]
    [nodes]
  2. Create the build-ami-provisioning-vars.yaml provisioning file on your local host:

    openshift_deployment_type: origin
    openshift_aws_clusterid: mycluster (1)
    openshift_aws_region: us-east-1 (2)
    openshift_aws_create_vpc: false (3)
    openshift_aws_vpc_name: production (4)
    openshift_aws_subnet_az: us-east-1d (5)
    openshift_aws_create_security_groups: false (6)
    openshift_aws_ssh_key_name: production-ssh-key (7)
    openshift_aws_base_ami: ami-12345678 (8)
    openshift_aws_create_s3: False (9)
    openshift_aws_build_ami_group: default (10)
    openshift_aws_vpc: (11)
      name: "{{ openshift_aws_vpc_name }}"
      cidr: 172.18.0.0/16
      subnets:
        us-east-1:
        - cidr: 172.18.0.0/20
          az: "us-east-1d"
    container_runtime_docker_storage_type: overlay2 (12)
    container_runtime_docker_storage_setup_device: /dev/xvdb (13)
    (1) Provide the name of the existing cluster.
    (2) Provide the region the existing cluster is currently running in.
    (3) Specify False to disable the creation of a VPC.
    (4) Provide the existing VPC name that the cluster is running in.
    (5) Provide the name of a subnet the existing cluster is running in.
    (6) Specify False to disable the creation of security groups.
    (7) Provide the AWS key name to use for SSH access.
    (8) Provide the AMI image ID to use as the base image for the primed image. See Red Hat® Cloud Access.
    (9) Specify False to disable the creation of an S3 bucket.
    (10) Provide the security group name.
    (11) Provide the VPC subnets the existing cluster is running in.
    (12) Specify overlay2 as the Docker storage type.
    (13) Specify the mount point for LVM and the /var/lib/docker directory.
  3. Run the build_ami.yml playbook to generate a primed image:

    # ansible-playbook -i </path/to/inventory/file> \
        ~/openshift-ansible/playbooks/aws/openshift-cluster/build_ami.yml \
        -e @build-ami-provisioning-vars.yaml

    After the playbook runs, you see a new image ID, or AMI, in its output. You specify the AMI that it generated when you create the Launch Configuration.
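    If you miss the AMI ID in the playbook output, one way to recover it is to list the most recently created image that your account owns. This query is only a sketch and assumes the us-east-1 region used in the example variables:

    $ aws ec2 describe-images --owners self --region us-east-1 \
        --query 'sort_by(Images, &CreationDate)[-1].[ImageId,Name]' --output text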

Creating the launch configuration and Auto Scaling group

Before you deploy the cluster auto-scaler, you must create an Amazon Web Services (AWS) launch configuration and Auto Scaling group that reference a primed image. You must configure the launch configuration so that the new node automatically joins the existing cluster when it starts.

Prerequisites

  • Install an OKD cluster in AWS.

  • Create a primed image.

  • If you deployed the EFK stack in your cluster, set the node label to logging-infra-fluentd=true, as shown in the example after this list.
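A minimal sketch of setting that label, assuming a hypothetical node name:

    $ oc label node <node_name> logging-infra-fluentd=true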

Procedure

  1. Create the bootstrap.kubeconfig file by generating it from a master node:

    $ ssh master "sudo oc serviceaccounts create-kubeconfig -n openshift-infra node-bootstrapper" > ~/bootstrap.kubeconfig
  2. Create the user-data.txt cloud-init file from the bootstrap.kubeconfig file:

    $ cat <<EOF > user-data.txt
    #cloud-config
    write_files:
    - path: /root/openshift_bootstrap/openshift_settings.yaml
      owner: 'root:root'
      permissions: '0640'
      content: |
        openshift_node_config_name: node-config-compute
    - path: /etc/origin/node/bootstrap.kubeconfig
      owner: 'root:root'
      permissions: '0640'
      encoding: b64
      content: |
        $(base64 ~/bootstrap.kubeconfig | sed '2,$s/^/    /')
    runcmd:
    - [ ansible-playbook, /root/openshift_bootstrap/bootstrap.yml]
    - [ systemctl, restart, systemd-hostnamed]
    - [ systemctl, restart, NetworkManager]
    - [ systemctl, enable, origin-node]
    - [ systemctl, start, origin-node]
    EOF
  3. Upload a launch configuration template to an AWS S3 bucket.

  4. Create the launch configuration by using the AWS CLI:

    $ aws autoscaling create-launch-configuration \
        --launch-configuration-name mycluster-LC \ (1)
        --region us-east-1 \ (2)
        --image-id ami-987654321 \ (3)
        --instance-type m4.large \ (4)
        --security-groups sg-12345678 \ (5)
        --template-url https://s3-.amazonaws.com/.../yourtemplate.json \ (6)
        --key-name production-key \ (7)
    (1) Specify a launch configuration name.
    (2) Specify the region to launch the image in.
    (3) Specify the primed image AMI that you created.
    (4) Specify the type of instance to launch.
    (5) Specify the security groups to attach to the launched image.
    (6) Specify the launch configuration template that you uploaded.
    (7) Specify the SSH key-pair name.
    If your template is less than 16 KB before you encode it, you can provide it to the AWS CLI by substituting --template-url with --user-data.
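    For example, a sketch of that alternative form, assuming the user-data.txt file that you created earlier in this procedure (depending on your AWS CLI version, you might need to base64-encode the file first):

    $ aws autoscaling create-launch-configuration \
        --launch-configuration-name mycluster-LC \
        --region us-east-1 \
        --image-id ami-987654321 \
        --instance-type m4.large \
        --security-groups sg-12345678 \
        --user-data file://user-data.txt \
        --key-name production-key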
  5. Create the Auto Scaling group by using the AWS CLI:

    $ aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name mycluster-ASG \ (1)
        --launch-configuration-name mycluster-LC \ (2)
        --min-size 0 \ (3)
        --max-size 6 \ (4)
        --vpc-zone-identifier subnet-12345678 \ (5)
        --tags ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=Name,Value=mycluster-ASG-node,PropagateAtLaunch=true ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=kubernetes.io/cluster/mycluster,Value=true,PropagateAtLaunch=true ResourceId=mycluster-ASG,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/node-role.kubernetes.io/compute,Value=true,PropagateAtLaunch=true (6)
    (1) Specify the name of the Auto Scaling group, which you use when you deploy the auto-scaler.
    (2) Specify the name of the Launch Configuration that you created.
    (3) Specify the minimum number of nodes that the auto-scaler maintains.
    (4) Specify the maximum number of nodes that the scale group can expand to.
    (5) Specify the VPC subnet ID, which is the same subnet that the cluster uses.
    (6) Specify this string to ensure that the Auto Scaling group tags are propagated to the nodes when they launch.
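    Optionally, you can confirm that the group exists and that the tags are attached, for example:

    $ aws autoscaling describe-auto-scaling-groups \
        --auto-scaling-group-names mycluster-ASG \
        --region us-east-1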

Deploying the auto-scaler components on your cluster

After you create the Launch Configuration and Auto Scaling group, you can deploy the auto-scaler components onto the cluster.

Prerequisites

  • Install an OKD cluster in AWS.

  • Create a primed image.

  • Create a Launch Configuration and Auto Scaling group that reference the primed image.

Procedure

To deploy the auto-scaler:

  1. Update your cluster to run the auto-scaler:

    1. Add the following parameter to the inventory file that you used to create the cluster, by default /etc/ansible/hosts:

      openshift_master_bootstrap_auto_approve=true
    2. To obtain the auto-scaler components, change to the playbook directory and run the playbook again:

      $ cd /usr/share/ansible/openshift-ansible
      $ ansible-playbook -i </path/to/inventory/file> \
          playbooks/deploy_cluster.yml
    3. Confirm that the bootstrap-autoapprover pod is running:

      $ oc get pods --all-namespaces | grep bootstrap-autoapprover
      NAMESPACE         NAME                       READY     STATUS    RESTARTS   AGE
      openshift-infra   bootstrap-autoapprover-0   1/1       Running   0
  2. Create a namespace for the auto-scaler:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: cluster-autoscaler
      annotations:
        openshift.io/node-selector: ""
    EOF
  3. Create a service account for the auto-scaler:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        k8s-addon: cluster-autoscaler.addons.k8s.io
        k8s-app: cluster-autoscaler
      name: cluster-autoscaler
      namespace: cluster-autoscaler
    EOF
  4. Create a cluster role to grant the required permissions to the service account:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: v1
    kind: ClusterRole
    metadata:
      name: cluster-autoscaler
    rules:
    - apiGroups: (1)
      - ""
      resources:
      - pods/eviction
      verbs:
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - persistentvolumeclaims
      - persistentvolumes
      - pods
      - replicationcontrollers
      - services
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - events
      verbs:
      - get
      - list
      - watch
      - patch
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - nodes
      verbs:
      - get
      - list
      - watch
      - patch
      - update
      attributeRestrictions: null
    - apiGroups:
      - extensions
      - apps
      resources:
      - daemonsets
      - replicasets
      - statefulsets
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    - apiGroups:
      - policy
      resources:
      - poddisruptionbudgets
      verbs:
      - get
      - list
      - watch
      attributeRestrictions: null
    EOF
    (1) If the cluster-autoscaler object already exists, ensure that the pods/eviction rule is present with the verb create.
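    One way to check for that rule, assuming the grep utility is available on your host:

    $ oc get clusterrole cluster-autoscaler -o yaml | grep -A 5 'pods/eviction'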
  5. Create a role for the deployment auto-scaler:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: v1
    kind: Role
    metadata:
      name: cluster-autoscaler
    rules:
    - apiGroups:
      - ""
      resources:
      - configmaps
      resourceNames:
      - cluster-autoscaler
      - cluster-autoscaler-status
      verbs:
      - create
      - get
      - patch
      - update
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - configmaps
      verbs:
      - create
      attributeRestrictions: null
    - apiGroups:
      - ""
      resources:
      - events
      verbs:
      - create
      attributeRestrictions: null
    EOF
  6. Create a creds file to store AWS credentials for the auto-scaler:

    $ cat <<EOF > creds
    [default]
    aws_access_key_id = your-aws-access-key-id
    aws_secret_access_key = your-aws-secret-access-key
    EOF

    The auto-scaler uses these credentials to launch new instances.

  7. Create a secret that contains the AWS credentials:

    $ oc create secret -n cluster-autoscaler generic autoscaler-credentials --from-file=creds

    The auto-scaler uses this secret to launch instances within AWS.
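    To verify that the secret contains the creds key before you continue, you can run:

    $ oc describe secret autoscaler-credentials -n cluster-autoscaler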

  8. Grant the roles that you created, and the cluster-reader role, to the cluster-autoscaler service account:

    $ oc adm policy add-cluster-role-to-user cluster-autoscaler system:serviceaccount:cluster-autoscaler:cluster-autoscaler -n cluster-autoscaler
    $ oc adm policy add-role-to-user cluster-autoscaler system:serviceaccount:cluster-autoscaler:cluster-autoscaler --role-namespace cluster-autoscaler -n cluster-autoscaler
    $ oc adm policy add-cluster-role-to-user cluster-reader system:serviceaccount:cluster-autoscaler:cluster-autoscaler -n cluster-autoscaler
  9. Deploy the cluster auto-scaler:

    $ oc apply -n cluster-autoscaler -f - <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: cluster-autoscaler
      name: cluster-autoscaler
      namespace: cluster-autoscaler
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: cluster-autoscaler
          role: infra
      template:
        metadata:
          labels:
            app: cluster-autoscaler
            role: infra
        spec:
          containers:
          - args:
            - /bin/cluster-autoscaler
            - --alsologtostderr
            - --v=4
            - --skip-nodes-with-local-storage=False
            - --leader-elect-resource-lock=configmaps
            - --namespace=cluster-autoscaler
            - --cloud-provider=aws
            - --nodes=0:6:mycluster-ASG
            env:
            - name: AWS_REGION
              value: us-east-1
            - name: AWS_SHARED_CREDENTIALS_FILE
              value: /var/run/secrets/aws-creds/creds
            image: docker.io/openshift/origin-cluster-autoscaler:v3.11.0
            name: autoscaler
            volumeMounts:
            - mountPath: /var/run/secrets/aws-creds
              name: aws-creds
              readOnly: true
          dnsPolicy: ClusterFirst
          nodeSelector:
            node-role.kubernetes.io/infra: "true"
          serviceAccountName: cluster-autoscaler
          terminationGracePeriodSeconds: 30
          volumes:
          - name: aws-creds
            secret:
              defaultMode: 420
              secretName: autoscaler-credentials
    EOF
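
    After the deployment is created, you can confirm that an auto-scaler pod reaches the Running state; the pod name suffix differs in each cluster:

    $ oc get pods -n cluster-autoscaler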

Testing the auto-scaler

After you add the auto-scaler to your Amazon Web Services (AWS) cluster, you can confirm that the auto-scaler works by deploying more pods than the current nodes can run.

Prerequisites

  • You added the auto-scaler to your OKD cluster that runs on AWS.

Procedure

  1. Create the scale-up.yaml file that contains the deployment configuration to test auto-scaling:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: scale-up
      labels:
        app: scale-up
    spec:
      replicas: 20 (1)
      selector:
        matchLabels:
          app: scale-up
      template:
        metadata:
          labels:
            app: scale-up
        spec:
          containers:
          - name: origin-base
            image: openshift/origin-base
            resources:
              requests:
                memory: 2Gi
            command:
            - /bin/sh
            - "-c"
            - "echo 'this should be in the logs' && sleep 86400"
          terminationGracePeriodSeconds: 0
    (1) This deployment specifies 20 replicas, but the initial size of the cluster cannot run all of the pods without first increasing the number of compute nodes.
  2. Create a namespace for the deployment:

    $ oc apply -f - <<EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: autoscaler-demo
    EOF
  3. Deploy the configuration:

    $ oc apply -n autoscaler-demo -f scale-up.yaml
  4. View the pods in your namespace:

    1. View the running pods in your namespace:

      $ oc get pods -n autoscaler-demo | grep Running
      cluster-autoscaler-5485644d46-ggvn5   1/1       Running   0          1d
      scale-up-79684ff956-45sbg             1/1       Running   0          31s
      scale-up-79684ff956-4kzjv             1/1       Running   0          31s
      scale-up-79684ff956-859d2             1/1       Running   0          31s
      scale-up-79684ff956-h47gv             1/1       Running   0          31s
      scale-up-79684ff956-htjth             1/1       Running   0          31s
      scale-up-79684ff956-m996k             1/1       Running   0          31s
      scale-up-79684ff956-pvvrm             1/1       Running   0          31s
      scale-up-79684ff956-qs9pp             1/1       Running   0          31s
      scale-up-79684ff956-zwdpr             1/1       Running   0          31s
    2. View the pending pods in your namespace:

      $ oc get pods -n autoscaler-demo | grep Pending
      scale-up-79684ff956-5jdnj             0/1       Pending   0          40s
      scale-up-79684ff956-794d6             0/1       Pending   0          40s
      scale-up-79684ff956-7rlm2             0/1       Pending   0          40s
      scale-up-79684ff956-9m2jc             0/1       Pending   0          40s
      scale-up-79684ff956-9m5fn             0/1       Pending   0          40s
      scale-up-79684ff956-fr62m             0/1       Pending   0          40s
      scale-up-79684ff956-q255w             0/1       Pending   0          40s
      scale-up-79684ff956-qc2cn             0/1       Pending   0          40s
      scale-up-79684ff956-qjn7z             0/1       Pending   0          40s
      scale-up-79684ff956-tdmqt             0/1       Pending   0          40s
      scale-up-79684ff956-xnjhw             0/1       Pending   0          40s

      These pending pods cannot run until the cluster auto-scaler automatically provisions new compute nodes to run the pods on. It can take several minutes for the nodes to reach the Ready state in the cluster.
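      While you wait, you can optionally follow the auto-scaler's own logs as the new nodes are provisioned:

        $ oc logs -n cluster-autoscaler deployment/cluster-autoscaler --tail=20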

  5. After several minutes, check the list of nodes to see if new nodes are ready:

    $ oc get nodes
    NAME                            STATUS    ROLES     AGE       VERSION
    ip-172-31-49-172.ec2.internal   Ready     infra     1d        v1.11.0+d4cacc0
    ip-172-31-53-217.ec2.internal   Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-55-89.ec2.internal    Ready     compute   9h        v1.11.0+d4cacc0
    ip-172-31-56-21.ec2.internal    Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-56-71.ec2.internal    Ready     compute   7m        v1.11.0+d4cacc0
    ip-172-31-63-234.ec2.internal   Ready     master    1d        v1.11.0+d4cacc0
  6. When more nodes are ready, view the running pods in your namespace again:

    $ oc get pods -n autoscaler-demo
    NAME                                  READY     STATUS    RESTARTS   AGE
    cluster-autoscaler-5485644d46-ggvn5   1/1       Running   0          1d
    scale-up-79684ff956-45sbg             1/1       Running   0          8m
    scale-up-79684ff956-4kzjv             1/1       Running   0          8m
    scale-up-79684ff956-5jdnj             1/1       Running   0          8m
    scale-up-79684ff956-794d6             1/1       Running   0          8m
    scale-up-79684ff956-7rlm2             1/1       Running   0          8m
    scale-up-79684ff956-859d2             1/1       Running   0          8m
    scale-up-79684ff956-9m2jc             1/1       Running   0          8m
    scale-up-79684ff956-9m5fn             1/1       Running   0          8m
    scale-up-79684ff956-fr62m             1/1       Running   0          8m
    scale-up-79684ff956-h47gv             1/1       Running   0          8m
    scale-up-79684ff956-htjth             1/1       Running   0          8m
    scale-up-79684ff956-m996k             1/1       Running   0          8m
    scale-up-79684ff956-pvvrm             1/1       Running   0          8m
    scale-up-79684ff956-q255w             1/1       Running   0          8m
    scale-up-79684ff956-qc2cn             1/1       Running   0          8m
    scale-up-79684ff956-qjn7z             1/1       Running   0          8m
    scale-up-79684ff956-qs9pp             1/1       Running   0          8m
    scale-up-79684ff956-tdmqt             1/1       Running   0          8m
    scale-up-79684ff956-xnjhw             1/1       Running   0          8m
    scale-up-79684ff956-zwdpr             1/1       Running   0          8m
    ...