
Autoscaling kubernetes cluster EKS


Wondering how to autoscale a Kubernetes cluster on Amazon EKS? Our AWS Support team is here to lend a hand with your queries and issues.


Today, let us see the steps followed by our support techs to implement Cluster Autoscaler.

STEP 1: Create an EKS cluster

This walkthrough creates an EKS cluster in AWS with two Auto Scaling groups to demonstrate how Cluster Autoscaler uses them to manage the EKS cluster.

When creating the EKS cluster, AWS automatically creates the EC2 Auto Scaling groups, but you must ensure that they carry the tags required by Cluster Autoscaler to discover them.

First, create an EKS cluster configuration file (for example, eks.yaml) with the content shown below:

---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: demo-ca-cluster
  region: us-east-1
  version: "1.20"
availabilityZones:
- us-east-1a
- us-east-1b
managedNodeGroups:
- name: managed-nodes
  labels:
    role: managed-nodes
  instanceType: t3.medium
  minSize: 1
  maxSize: 10
  desiredCapacity: 1
  volumeSize: 20
nodeGroups:
- name: unmanaged-nodes
  labels:
    role: unmanaged-nodes
  instanceType: t3.medium
  minSize: 1
  maxSize: 10
  desiredCapacity: 1
  volumeSize: 20

Here, we are creating two node groups for the cluster, each backed by an EC2 Auto Scaling group:

  1. Managed-nodes
  2. Unmanaged-nodes

We will use the unmanaged nodes later in this exercise as part of a test to verify the proper functioning of the Cluster Autoscaler.

Next, use eksctl to create the EKS cluster using the command shown below.

$ eksctl create cluster -f eks.yaml
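
Cluster creation takes several minutes. Once eksctl finishes, an optional quick sanity check is to list the node groups it created; a sketch, assuming the cluster name from the config above:

$ eksctl get nodegroup --cluster demo-ca-cluster --region us-east-1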

STEP 2: Verify the EKS cluster and AWS Auto Scaling groups

We can verify the cluster using the kubectl command line:

$ kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.100.0.1   <none>        443/TCP   14m

We can also verify the presence of our cluster via the AWS console:

Our cluster, as displayed in the AWS Console

We can also confirm that the Auto Scaling groups were provisioned in the AWS console:

Our Auto Scaling groups in the AWS Console
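
If you prefer the command line, the Auto Scaling groups can also be listed with the AWS CLI; a minimal sketch (the JMESPath query just trims the output to a few columns):

$ aws autoscaling describe-auto-scaling-groups \
    --query "AutoScalingGroups[].{Name:AutoScalingGroupName,Min:MinSize,Max:MaxSize,Desired:DesiredCapacity}" \
    --output table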

STEP 3: Create IAM OIDC provider

IAM OIDC is used for authorizing the Cluster Autoscaler to launch or terminate instances under an Auto Scaling group. In this section, we will see how to configure it with the EKS cluster.

In the EKS cluster console, navigate to the configuration tab and copy the OpenID connect URL.

The OpenID we need to copy from the AWS console.

Then, go to the IAM console, and select Identity provider.

Selecting an identity provider in the AWS Console.

Click “Add provider,” select “OpenID Connect,” and click “Get thumbprint”.

Selecting OpenID and getting the thumbprint of a provider in the AWS Console.

Then enter the “Audience” (sts.amazonaws.com in our example, which points to AWS STS, the Security Token Service) and add the provider.

Adding the provider in the AWS Console.

Note: You will need to attach the IAM role to use this provider—we’ll review that next.

Adding the identity information in the AWS Console.
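
As an alternative to the console steps above, eksctl can create and associate the IAM OIDC provider in a single command; a sketch, assuming the cluster from Step 1:

$ eksctl utils associate-iam-oidc-provider --region us-east-1 --cluster demo-ca-cluster --approve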

STEP 4: Create IAM policy

Next, we need to create the IAM policy, which allows CA to increase or decrease the number of nodes in the cluster.

To create the policy with the necessary permissions, save the below file as “AmazonEKSClusterAutoscalerPolicy.json” or any name you want:

{
  "Version": "2012-10-17",
  "Statement": [
      {
          "Action": [
              "autoscaling:DescribeAutoScalingGroups",
              "autoscaling:DescribeAutoScalingInstances",
              "autoscaling:DescribeLaunchConfigurations",
              "autoscaling:DescribeTags",
              "autoscaling:SetDesiredCapacity",
              "autoscaling:TerminateInstanceInAutoScalingGroup",
              "ec2:DescribeLaunchTemplateVersions"
          ],
          "Resource": "*",
          "Effect": "Allow"
      }
  ]
}

Then, create the policy by running the following AWS CLI command (make sure the AWS CLI is installed and configured first):

$ aws iam create-policy --policy-name AmazonEKSClusterAutoscalerPolicy --policy-document file://AmazonEKSClusterAutoscalerPolicy.json

Verification of the policy:

$ aws iam list-policies --max-items 1
{
    "NextToken": "eyJNYXJrZXIiOiBudWxsLCAiYm90b190cnVuY2F0ZV9hbW91bnQiOiAxfQ==",
    "Policies": [
        {
            "PolicyName": "AmazonEKSClusterAutoscalerPolicy",
            "PermissionsBoundaryUsageCount": 0,
            "CreateDate": "2021-10-24T15:02:46Z",
            "AttachmentCount": 0,
            "IsAttachable": true,
            "PolicyId": "ANPA4KZ4K7F2VD6DQVAZT",
            "DefaultVersionId": "v1",
            "Path": "/",
            "Arn": "arn:aws:iam::847845718389:policy/AmazonEKSClusterAutoscalerPolicy",
            "UpdateDate": "2021-10-24T15:02:46Z"
        }
    ]
}

STEP 5: Create an IAM role for the provider

As discussed earlier, we still need to create an IAM role and link it to the provider we created in Step 3.

Selecting a web identity and provider.

Select the audience “sts.amazonaws.com” and attach the policy you created in Step 4.

Then, verify the IAM role and make sure the policy is attached.

IAM role and policy in the AWS Console.
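
If you prefer to script this part, the policy can also be attached and verified with the AWS CLI; a sketch, assuming the role is named AmazonEKSClusterAutoscalerRole (the name used in the role ARN later in this guide) and using the policy ARN returned in Step 4:

$ aws iam attach-role-policy \
    --role-name AmazonEKSClusterAutoscalerRole \
    --policy-arn arn:aws:iam::847845718389:policy/AmazonEKSClusterAutoscalerPolicy
$ aws iam list-attached-role-policies --role-name AmazonEKSClusterAutoscalerRole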

Edit the “Trust relationships.”

Editing “Trust relationships”.

Next, update the OIDC entry in the trust policy so it references the provider created in Step 3.

Changing the OIDC to edit a trust relationship.

Then click “Update Trust Policy” to save it.
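
For reference, the resulting trust relationship typically looks like the sketch below. The OIDC provider ID (EXAMPLED539D4633E53DE1B71EXAMPLE) is a placeholder, so substitute the ID from your own OpenID Connect URL; the sub condition restricts the role to the cluster-autoscaler service account in kube-system, which we create in the next step:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::847845718389:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:kube-system:cluster-autoscaler"
        }
      }
    }
  ]
}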

STEP 6: Deploy Cluster Autoscaler

Next, we deploy Cluster Autoscaler. To do so, you must use the Amazon Resource Name (ARN) of the IAM role created in the previous step.

To deploy CA, save the manifest shown below to a file and apply it with the following command:

$ kubectl apply -f <path of the file>

The content to save into the file (make sure you copy all of it):

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::847845718389:role/AmazonEKSClusterAutoscalerRole
  name: cluster-autoscaler
  namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create","list","watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system


---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: 'false'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.20.0
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 500Mi
            requests:
              cpu: 100m
              memory: 500Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/demo-ca-cluster
            - --balance-similar-node-groups
            - --skip-nodes-with-system-pods=false
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt #/etc/ssl/certs/ca-bundle.crt for Amazon Linux Worker Nodes
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"

For this step, the crucial parameters are:

  • --node-group-auto-discovery = Used by CA to discover the Auto Scaling groups based on their tags. The value format is asg:tag=tagKey,anotherTagKey.
  • v1.20.0 = The Cluster Autoscaler image version, which should match the Kubernetes version of the EKS cluster (1.20 in our example). Update it if you are running a different version.
  • --balance-similar-node-groups = If you set this flag to “true,” CA will detect similar node groups and balance the number of nodes between them.
  • --skip-nodes-with-system-pods = If you set this flag to “true,” CA will never delete nodes that host pods from the kube-system namespace (except for DaemonSet or mirror pods).

Next, verify that you are using the correct kubeconfig:

$ kubectx
bob@demo-ca-cluster.us-east-1.eksctl.io

Then apply the manifest by running the kubectl apply command shown above against the YAML file you just created.

Next, verify the logs by issuing this command:

$ kubectl logs -l app=cluster-autoscaler -n kube-system -f

The highlighted sections of the log output indicate that the command ran successfully.

Cluster Autoscaler will now watch for unschedulable pods and provision new nodes so they can be scheduled. You can see these actions in the logs. Check the status of the pods by issuing the following command:

$ kubectl get pods -n kube-system

The expected results are displayed below.

Check the number of nodes in the EKS cluster:
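
A quick way to do this is with kubectl:

$ kubectl get nodes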

Congratulations! You have deployed the Cluster Autoscaler successfully.

STEP 7: Create an Nginx deployment to test autoscaler functionality

We are going to create two deployments: one for the managed node group, and another deployment for the unmanaged node group.

Managed node group deployment:

Create a configuration file based on the content below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-managed
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-managed
  template:
    metadata:
      labels:
        app: nginx-managed
    spec:
      containers:
      - name: nginx-managed
        image: nginx:1.14.2
        ports:
        - containerPort: 80
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - managed-nodes
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx-managed
            topologyKey: kubernetes.io/hostname
            namespaces:
            - default

Note: The above configurations make use of nodeAffinity to select the node group with the label “role=managed-nodes” to help control where the scheduler provisions the pods.

Apply the changes:

$ kubectl apply -f 1-nginx-managed.yaml
deployment.apps/nginx-managed created

Unmanaged Node group Deployment:

For the unmanaged node group, create a configuration file using the content below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-unmanaged
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-unmanaged
  template:
    metadata:
      labels:
        app: nginx-unmanaged
    spec:
      containers:
      - name: nginx-unmanaged
        image: nginx:1.14.2
        ports:
        - containerPort: 80
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - unmanaged-nodes
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx-unmanaged
            topologyKey: kubernetes.io/hostname
            namespaces:
            - default

Apply the changes:

$ kubectl apply -f 2-nginx-unmanaged.yaml
deployment.apps/nginx-unmanaged created

Check the status of the pods:

$ kubectl get pods -n default
NAME                               READY   STATUS    RESTARTS   AGE
nginx-managed-7cf8b6449c-mctsg     1/1     Running   0          60s
nginx-managed-7cf8b6449c-vjvxf     0/1     Pending   0          60s
nginx-unmanaged-67dcfb44c9-gvjg4   0/1     Pending   0          52s
nginx-unmanaged-67dcfb44c9-wqnvr   1/1     Running   0          52s

Now, you can see two of the four pods are running because we have only two nodes in the cluster.

The Cluster Autoscaler will check the state of the pods, discover that some are in a “pending” state, and try to provision new nodes in the cluster. In a few minutes, you will see a third node provisioned.
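
If you want to exercise the autoscaler a bit more, you can optionally increase the replica count of the managed deployment; because of the podAntiAffinity rule above, each additional replica needs its own node, so Cluster Autoscaler has to add more capacity to the managed node group. For example:

$ kubectl scale deployment nginx-managed -n default --replicas=4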

One pod is still in a Pending state because we did not add the auto-discovery tags to the unmanaged node group when we created the EKS cluster.

If those tags are not present on an Auto Scaling group, Cluster Autoscaler will not discover it and therefore cannot scale that node group.
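
If you do want the unmanaged node group to be scaled as well, one option is to add the auto-discovery tags that the --node-group-auto-discovery flag looks for to its Auto Scaling group. This is only a sketch; <unmanaged-asg-name> is a placeholder for the real Auto Scaling group name from the AWS console:

$ aws autoscaling create-or-update-tags --tags \
    "ResourceId=<unmanaged-asg-name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/enabled,Value=true,PropagateAtLaunch=true" \
    "ResourceId=<unmanaged-asg-name>,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/demo-ca-cluster,Value=owned,PropagateAtLaunch=true"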

Conclusion

To sum up, our Support Engineers demonstrated how to autoscale a Kubernetes cluster on EKS using Cluster Autoscaler.

