AWS EKS cluster autoscaling



Let’s start with the basics: what is the Cluster Autoscaler?

It is a component that automatically adjusts the size of a Kubernetes cluster by adding or removing nodes based on demand.

You will have one or more node groups (EC2 Auto Scaling groups) that the autoscaler can manage; in the following example, I’ll rely on tags to auto-discover the groups that should be resized automatically.

There is an AWS documentation page which describes setting up the cluster autoscaler, but it doesn’t go into much detail and requires manual intervention using kubectl edit.

I can think of at least two better ways: Helm and Kustomize.

The GitHub page for cluster-autoscaler on AWS offers a lot more useful information, though I didn’t see a step-by-step guide.

Here’s an example which uses the cluster-autoscaler-chart Helm chart.

There are some additional explanations regarding the EKS setup in a previous post.


Set up a test EKS cluster

git clone https://github.com/serbangilvitu/terraform-examples.git
cd terraform-examples/aws/eks

# Update aws_region and aws_profile in values-common.auto.tfvars

terraform init
terraform apply

export AWS_REGION=$(terraform output aws_region)
aws eks \
  --region ${AWS_REGION} \
  --profile $(terraform output aws_profile) \
  update-kubeconfig \
  --name $(terraform output eks_cluster_name)

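# sed inserts a literal ${eks_node_group_1_role_arn} placeholder (the single
# quotes prevent shell expansion); envsubst then fills in the exported value.
export eks_node_group_1_role_arn="$(terraform output eks_node_group_1_role_arn)" && \
  curl -so - https://amazon-eks.s3.us-west-2.amazonaws.com/cloudformation/2020-07-23/aws-auth-cm.yaml \
  | sed -e 's/<ARN.*>/${eks_node_group_1_role_arn}/g' | envsubst \
  | kubectl apply -f -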

Deploy a cluster autoscaler using a Helm chart

helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
helm install cluster-autoscaler autoscaler/cluster-autoscaler-chart \
  --namespace kube-system \
  --version 1.0.4 \
  --set 'nameOverride'='cluster-autoscaler' \
  --set 'cloudProvider'='aws' \
  --set 'awsRegion'="${AWS_REGION}" \
  --set 'image.repository'='k8s.gcr.io/autoscaling/cluster-autoscaler' \
  --set 'image.tag'='v1.18.2' \
  --set 'autoDiscovery.clusterName'="$(terraform output eks_cluster_name)" \
  --set 'extraArgs.skip-nodes-with-system-pods'='false' \
  --set 'extraArgs.balance-similar-node-groups'='true' \
  --set 'extraArgs.expander'='least-waste' \
  --set 'extraArgs.skip-nodes-with-local-storage'='false'

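# The annotation only takes effect on the pod, so patch the pod template
# instead of annotating the Deployment object itself.
kubectl -n kube-system patch deployment cluster-autoscaler \
  -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict":"false"}}}}}'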

Test autoscaling

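# Follow the autoscaler logs in one terminal (this command blocks).
kubectl -n kube-system logs -f deployment/cluster-autoscaler
# In a second terminal, deploy the test workload and watch new nodes join.
kubectl -n default apply -f https://raw.githubusercontent.com/serbangilvitu/terraform-examples/master/aws/eks/yaml/scaling-test.yaml
watch -n 5 kubectl get nodes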
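While the test runs, it can help to see which pods are still waiting for capacity, since unschedulable pods are what should trigger a scale-up. A small sketch, where the function name pending_pods is hypothetical and I’m assuming the test workload lands in the default namespace, as in the apply command above:

```shell
# Hypothetical helper: list pods that are still Pending in a namespace.
# Defaults to the "default" namespace when no argument is given.
pending_pods() {
  kubectl -n "${1:-default}" get pods \
    --field-selector=status.phase=Pending \
    -o name
}
```

Running pending_pods default should print nothing once the new nodes have joined and the pods have been scheduled.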

Clean up

terraform destroy

A closer look

As mentioned earlier, the autoscaling group has the tags expected by the cluster autoscaler. Here’s an excerpt from the Terraform code.

    "k8s.io/cluster-autoscaler/${var.stack_name}-${var.eks_cluster_name}" = "owned"
    "k8s.io/cluster-autoscaler/enabled" = "true"

These tags could be overridden by setting autoDiscovery.tags; however, I’ll stick with the current convention, k8s.io/cluster-autoscaler/*.
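To confirm which Auto Scaling groups actually carry the discovery tag, something like the following sketch can help. The function name tagged_asgs is my own invention, and it assumes the AWS CLI is configured for the right account or profile and that AWS_REGION is exported as in the setup section.

```shell
# Hypothetical helper: list Auto Scaling groups tagged for autodiscovery.
# Assumes the AWS CLI is installed and AWS_REGION is exported.
tagged_asgs() {
  aws autoscaling describe-auto-scaling-groups \
    --region "${AWS_REGION}" \
    --query "AutoScalingGroups[?Tags[?Key=='k8s.io/cluster-autoscaler/enabled']].AutoScalingGroupName" \
    --output text
}
```

Calling tagged_asgs should print the names of the groups the autoscaler is allowed to discover; an empty result suggests the tags are missing.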

Let’s continue with the values used by the autoscaler.

The chart’s values file already defaults cloudProvider to aws, but it’s best to specify it explicitly, so that you avoid surprises in future versions of the chart.

For the autodiscovery, I’m specifying the cluster name, in this case taken from a Terraform output.
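As far as I can tell, the chart turns autoDiscovery.clusterName into a --node-group-auto-discovery argument built from the two default tag keys. A rough sketch of that string construction, where the function name discovery_flag and the sample cluster name are made up for illustration:

```shell
# Hypothetical helper: build the auto-discovery argument for a cluster name.
# Follows the default k8s.io/cluster-autoscaler/* tag convention.
discovery_flag() {
  cluster_name="$1"
  printf '%s\n' "--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/${cluster_name}"
}

discovery_flag "demo-eks"   # "demo-eks" is a placeholder cluster name
```

The result can be compared with the arguments of the running container, for example via kubectl -n kube-system get deployment cluster-autoscaler -o jsonpath='{.spec.template.spec.containers[0].command}'.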

With multiple autoscaling groups (this example has only one), the least-waste expander scales up the group that would leave the least CPU and memory idle after the scale-up.

I’ve also explicitly set the awsRegion value, which defaults to us-east-1.
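To make least-waste a bit more concrete, here is a deliberately simplified sketch that only looks at CPU. The node sizes and the pending request are invented numbers, the function name is hypothetical, and the real expander also takes memory into account.

```shell
# Simplified illustration of least-waste, CPU only (all numbers are made up).
# Prints the percentage of a new node's CPU that would stay unused.
wasted_cpu_pct() {
  node_millicores="$1"      # allocatable CPU of one node in the group
  pending_millicores="$2"   # CPU requested by the pending pods
  echo $(( (node_millicores - pending_millicores) * 100 / node_millicores ))
}

pending=3000
echo "4-vCPU group: $(wasted_cpu_pct 4000 "$pending")% unused"
echo "8-vCPU group: $(wasted_cpu_pct 8000 "$pending")% unused"
```

With these numbers the smaller group leaves less CPU idle, so under this simplified model it would be the one chosen for the scale-up.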

Additional options can be found in the values file corresponding to the chart version, and some behavior is detailed in the AWS cloud provider documentation.

The complete list of arguments and their descriptions is available in the FAQ.

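Here is what the ones I’ve explicitly specified do, as described in the FAQ:

skip-nodes-with-system-pods=false: by default (true) the autoscaler never removes nodes running kube-system pods, other than DaemonSet or mirror pods; false lifts that restriction.

balance-similar-node-groups=true: detect similar node groups and balance the number of nodes between them.

expander=least-waste: the strategy used to choose which node group to scale up; the default is random.

skip-nodes-with-local-storage=false: by default (true) the autoscaler never removes nodes running pods with local storage, such as emptyDir or hostPath; false lifts that restriction.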

To check the chart’s default extra arguments for the autoscaler:

helm show values autoscaler/cluster-autoscaler-chart \
  --version 1.0.4 \
  | grep 'extraArgs:' -A 20

To get a list of the user-supplied values (which override the defaults):

helm get values cluster-autoscaler --namespace kube-system