aws · mozammil89 · Nov 5, 2024 · Nov 5, 2024 · Nov 6, 2024 · Nov 6, 2024
@@ -16,59 +16,85 @@ We will make the following assumptions in this guide
 * Your workloads have pod disruption budgets that adhere to [EKS best practices](https://aws.github.io/aws-eks-best-practices/karpenter/)
 * Your cluster has an [OIDC provider](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html) for service accounts
 
-This guide will also assume you have the `aws` CLI installed.
+This guide will also assume you have the `aws` CLI and `helm` installed.
 You can also perform many of these steps in the console, but we will use the command line for simplicity.
 
-Set a variable for your cluster name.
+## Set environment variables
+Set the Karpenter and Kubernetes versions. Check the [Compatibility Matrix](https://karpenter.sh/docs/upgrading/compatibility/) to find the Karpenter version compatible with your current Amazon EKS version.
 
 ```bash
-KARPENTER_NAMESPACE=kube-system
+KARPENTER_NAMESPACE="kube-system"
+KARPENTER_VERSION="{{< param "latest_release_version" >}}"
+K8S_VERSION="{{< param "latest_k8s_version" >}}"
 CLUSTER_NAME=<your cluster name>
 ```
 
 Set other variables from your cluster configuration.
 
 {{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step01-env.sh" language="bash" %}}
 
-Use that information to create our IAM roles, inline policy, and trust relationship.
+{{% alert title="Warning" color="warning" %}}
+If you open a new shell to run steps in this procedure, you need to set the environment variables again.
+{{% /alert %}}
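
Because a new shell starts without these variables, it can help to check them before running a later step. The following is a minimal sketch, not part of the Karpenter scripts: `check_required_vars` is a hypothetical helper name, and the list of variables is taken from the block above.

```bash
# Hypothetical helper (not part of the Karpenter scripts): print the name of
# every listed variable that is unset or empty, and return non-zero if any is.
check_required_vars() {
    missing=0
    for name in "$@"; do
        # Read the variable whose name is stored in $name.
        eval "value=\${${name}:-}"
        if [ -z "${value}" ]; then
            echo "missing: ${name}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

# Check the variables set at the top of this guide.
check_required_vars KARPENTER_NAMESPACE KARPENTER_VERSION K8S_VERSION CLUSTER_NAME \
    || echo "Set the variables above before continuing." >&2
```

Variables set by `step01-env.sh` can be appended to the argument list in the same way.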
 
 ## Create IAM roles
 
-To get started with our migration we first need to create two new IAM roles for nodes provisioned with Karpenter and the Karpenter controller.
-
-To create the Karpenter node role we will use the following policy and commands.
-
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step02-node-iam.sh" language="bash" %}}
+Use CloudFormation to set up the infrastructure needed by the existing EKS cluster. See [CloudFormation]({{< relref "../../reference/cloudformation/" >}}) for a complete description of what `cloudformation.yaml` does for Karpenter. The provided `cloudformation.yaml` template simplifies this setup by creating and configuring all necessary resources, including:
 
-Now attach the required policies to the role
+  - **IAM Roles and Policies**: Grants Karpenter permissions to interact with EKS, autoscaling, and EC2 services, enabling it to manage nodes dynamically.
+  - **Instance Profiles**: Attaches necessary permissions to EC2 instances, allowing them to join the cluster and participate in automated scaling as managed by Karpenter.
+  - **Interruption Queue and Policies**: Sets up an Amazon SQS queue and EventBridge rules for handling interruption notifications from AWS services related to EC2 Spot instances and AWS Health events.
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step03-node-policies.sh" language="bash" %}}
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step02-cloudformation-setup.sh" language="bash" %}}
 
 Now we need to create an IAM role that the Karpenter controller will use to provision new instances.
 The controller will be using [IAM Roles for Service Accounts (IRSA)](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) which requires an OIDC endpoint.
 
-If you have another option for using IAM credentials with workloads (e.g. [kube2iam](https://github.com/jtblin/kube2iam)) your steps will be different.
+If you use another mechanism to provide IAM credentials to workloads (e.g. [Amazon EKS Pod Identity Agent](https://github.com/aws/eks-pod-identity-agent)), your steps will be different.
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step04-controller-iam.sh" language="bash" %}}
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step03-controller-iam.sh" language="bash" %}}
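
The trust policy in the script above relies on the shell expansion `${OIDC_ENDPOINT#*//}`, which removes the `https://` scheme from the issuer URL so the remainder can be used in the provider ARN and in the condition keys. A small sketch of what that expansion produces, using a made-up issuer URL and `EXAMPLE_`-prefixed variable names so your real environment is not overwritten:

```bash
# Made-up issuer URL in the shape returned by `aws eks describe-cluster`;
# the ID is a placeholder, not a real cluster.
EXAMPLE_OIDC_ENDPOINT="https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
EXAMPLE_NAMESPACE="kube-system"

# "#*//" strips the shortest prefix ending in "//", i.e. the "https://" scheme.
EXAMPLE_OIDC_PROVIDER="${EXAMPLE_OIDC_ENDPOINT#*//}"

echo "provider path: ${EXAMPLE_OIDC_PROVIDER}"
echo "aud key:       ${EXAMPLE_OIDC_PROVIDER}:aud"
echo "sub value:     system:serviceaccount:${EXAMPLE_NAMESPACE}:karpenter"
```

The `sub` value must match the namespace and service account name that Karpenter is deployed with, which is why `KARPENTER_NAMESPACE` has to be set before the trust policy is generated.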
 
 ## Add tags to subnets and security groups
 
-We need to add tags to our nodegroup subnets so Karpenter will know which subnets to use.
+For Karpenter to know which [subnets](https://karpenter.sh/docs/concepts/nodeclasses/#specsubnetselectorterms) and [security groups](https://karpenter.sh/docs/concepts/nodeclasses/#specsecuritygroupselectorterms) to use, we need to add appropriate tags to the nodegroup subnets and security groups.
+
+### Tag nodegroup subnets
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step05-tag-subnets.sh" language="bash" %}}
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step04-tag-subnets.sh" language="bash" %}}
 
-Add tags to our security groups.
-This command only tags the security groups for the first nodegroup in the cluster.
-If you have multiple nodegroups or multiple security groups you will need to decide which one Karpenter should use.
+This loop tags the subnets associated with each nodegroup with `karpenter.sh/discovery`, so that Karpenter knows which subnets it can use.
+
+### Tag security groups
+
+If your EKS setup is configured to use the cluster security group and additional security groups, execute the following commands to tag them for Karpenter discovery:
+
+```bash
+SECURITY_GROUPS=$(aws eks describe-cluster \
+    --name "${CLUSTER_NAME}" \
+    --query "cluster.resourcesVpcConfig" \
+    --output json | jq -r '[.clusterSecurityGroupId] + .securityGroupIds | join(" ")')
+
+aws ec2 create-tags \
+    --tags "Key=karpenter.sh/discovery,Value=${CLUSTER_NAME}" \
+    --resources ${SECURITY_GROUPS}
+```
+
+If your setup uses security groups from the launch template of a managed nodegroup, execute the command below.
+
+Note that this command only tags the security groups for the first nodegroup in the cluster. If you have multiple nodegroups, you will need to decide which ones Karpenter should use.
+
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step05-tag-security-groups.sh" language="bash" %}}
+
+Alternatively, the subnets and security groups can be defined directly in the [NodeClass](https://karpenter.sh/docs/concepts/nodeclasses/) definition by specifying the [subnets](https://karpenter.sh/docs/concepts/nodeclasses/#specsubnetselectorterms) and [security groups](https://karpenter.sh/docs/concepts/nodeclasses/#specsecuritygroupselectorterms) to be used.
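
After tagging, you may want to confirm which resources carry the `karpenter.sh/discovery` tag. The sketch below is an illustration rather than part of the guide's scripts: `run` and `verify_discovery_tags` are hypothetical helper names, and the block defaults to a dry run that only prints the `aws` commands. Set `DRY_RUN=0` to execute them.

```bash
# Hypothetical dry-run wrapper: print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# List the subnets and security groups tagged for the given cluster name.
verify_discovery_tags() {
    run aws ec2 describe-subnets \
        --filters "Name=tag:karpenter.sh/discovery,Values=$1" \
        --query "Subnets[].SubnetId" --output text
    run aws ec2 describe-security-groups \
        --filters "Name=tag:karpenter.sh/discovery,Values=$1" \
        --query "SecurityGroups[].GroupId" --output text
}

# "my-cluster" is a placeholder used only when CLUSTER_NAME is not set.
verify_discovery_tags "${CLUSTER_NAME:-my-cluster}"
```

If either command returns an empty list when executed, Karpenter's tag-based selectors would have nothing to match for that cluster name.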
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step06-tag-security-groups.sh" language="bash" %}}
 
 ## Update aws-auth ConfigMap
 
 We need to allow nodes that are using the node IAM role we just created to join the cluster.
 To do that we have to modify the `aws-auth` ConfigMap in the cluster.
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step07-edit-aws-auth.sh" language="bash" %}}
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step06-edit-aws-auth.sh" language="bash" %}}
 
 You will need to add a section to the mapRoles that looks something like this.
 Replace the `${AWS_PARTITION}` variable with the account partition, `${AWS_ACCOUNT_ID}` variable with your account ID, and `${CLUSTER_NAME}` variable with the cluster name, but do not replace the `{{EC2PrivateDNSName}}`.
@@ -88,53 +114,54 @@ The full aws-auth configmap should have two groups.
 One for your Karpenter node role and one for your existing node group.
 
 ## Deploy Karpenter
+To deploy Karpenter, you can use Helm, which simplifies the installation process by handling Karpenter’s dependencies and configuration files automatically. The Helm command provided below will also incorporate any customized settings, such as node affinity, to align with your specific deployment needs.
 
-First set the Karpenter release you want to deploy.
-
-```bash
-export KARPENTER_VERSION="{{< param "latest_release_version" >}}"
-```
+### Set Node Affinity for Karpenter
 
-We can now generate a full Karpenter deployment yaml from the Helm chart.
+Karpenter should run on nodes that it does not manage itself. To ensure that the Karpenter pods are scheduled onto a specific, existing node group, configure node affinity.
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step08-generate-chart.sh" language="bash" %}}
+Create a file named `karpenter-node-affinity.yaml` to define the node affinity settings and specify the node group where you want Karpenter to run.
 
-Modify the following lines in the karpenter.yaml file.
+Be sure to replace `${NODEGROUP}` with the actual name of your node group.
 
-### Set node affinity
-
-Edit the karpenter.yaml file and find the karpenter deployment affinity rules.
-Modify the affinity so karpenter will run on one of the existing node group nodes.
-
-The rules should look something like this.
-Modify the value to match your `$NODEGROUP`, one node group per line.
-
-```yaml
+```bash
+cat <<EOF > karpenter-node-affinity.yaml
 affinity:
   nodeAffinity:
     requiredDuringSchedulingIgnoredDuringExecution:
       nodeSelectorTerms:
-      - matchExpressions:
-        - key: karpenter.sh/nodepool
-          operator: DoesNotExist
-        - key: eks.amazonaws.com/nodegroup
-          operator: In
-          values:
-          - ${NODEGROUP}
-  podAntiAffinity:
-    requiredDuringSchedulingIgnoredDuringExecution:
-      - topologyKey: "kubernetes.io/hostname"
+        - matchExpressions:
+            - key: karpenter.sh/nodepool
+              operator: DoesNotExist
+            - key: eks.amazonaws.com/nodegroup
+              operator: In
+              values:
+                - "${NODEGROUP}"
+EOF
 ```
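
If Karpenter may run on more than one existing node group, the `values` list can hold several names, one per line. A minimal sketch of generating the same file for several node groups, assuming a hypothetical `render_affinity` helper and placeholder node group names (`ng-a`, `ng-b`):

```bash
# Hypothetical helper: print the affinity settings with one `values` entry
# per node group name passed as an argument.
render_affinity() {
    cat <<EOF
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: karpenter.sh/nodepool
              operator: DoesNotExist
            - key: eks.amazonaws.com/nodegroup
              operator: In
              values:
EOF
    for ng in "$@"; do
        printf '                - "%s"\n' "${ng}"
    done
}

# Placeholder node group names; replace with your own.
render_affinity ng-a ng-b
```

Redirect the output to `karpenter-node-affinity.yaml` to use it in place of the single-node-group file.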
 
-Now that our deployment is ready we can create the karpenter namespace, create the NodePool CRD, and then deploy the rest of the karpenter resources.
+Now that you have prepared the node affinity configuration, you can proceed to install Karpenter using Helm. This command includes the affinity settings along with other necessary configurations:
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step09-deploy.sh" language="bash" %}}
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step07-deploy.sh" language="bash" %}}
+
+Expected output:
+```bash
+Release "karpenter" does not exist. Installing it now.
+Pulled: public.ecr.aws/karpenter/karpenter:1.0.5
+Digest: sha256:98382d6406a3c85711269112fbb337c056d4debabaefb936db2d10137b58bd1b
+NAME: karpenter
+LAST DEPLOYED: Wed Nov  6 16:51:41 2024
+NAMESPACE: kube-system
+STATUS: deployed
+REVISION: 1
+TEST SUITE: None
+```
 
 ## Create default NodePool
 
 We need to create a default NodePool so Karpenter knows what types of nodes we want for unscheduled workloads. You can refer to some of the [example NodePool](https://github.com/aws/karpenter/tree{{< githubRelRef >}}examples/v1) for specific needs.
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step10-create-nodepool.sh" language="bash" %}}
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step08-create-nodepool.sh" language="bash" %}}
 
 ## Set nodeAffinity for critical workloads (optional)
 
@@ -165,33 +192,34 @@ affinity:
 Now that karpenter is running we can disable the cluster autoscaler.
 To do that we will scale the number of replicas to zero.
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step11-scale-cas.sh" language="bash" %}}
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step09-scale-cas.sh" language="bash" %}}
 
 To get rid of the instances that were added from the node group we can scale our nodegroup down to a minimum size to support Karpenter and other critical services.
 
 > Note: If your workloads do not have [pod disruption budgets](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) set, the following command **will cause workloads to be unavailable.**
 
 If you have a single multi-AZ node group, we suggest a minimum of 2 instances.
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step12-scale-single-ng.sh" language="bash" %}}
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step10-scale-single-ng.sh" language="bash" %}}
 
 Or, if you have multiple single-AZ node groups, we suggest a minimum of 1 instance each.
 
-{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step12-scale-multiple-ng.sh" language="bash" %}}
+{{% script file="./content/en/{VERSION}/getting-started/migrating-from-cas/scripts/step11-scale-multiple-ng.sh" language="bash" %}}
 
 {{% alert title="Note" color="warning" %}}
 If you have a lot of nodes or workloads you may want to slowly scale down your node groups by a few instances at a time. It is recommended to watch the transition carefully for workloads that may not have enough replicas running or disruption budgets configured.
 {{% /alert %}}
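
A gradual scale-down can be planned as a series of smaller steps. The sketch below only prints the commands and does not run them. `plan_scale_down` is a hypothetical helper, and the `aws eks update-nodegroup-config` invocation is an assumption about how you would apply each step, with `${CLUSTER_NAME}` and `${NODEGROUP}` left for your shell to fill in.

```bash
# Hypothetical helper: print one scale-down command per step, lowering the
# desired size from CURRENT to TARGET by at most STEP instances at a time.
plan_scale_down() {
    current="$1"; target="$2"; step="$3"
    # A non-positive step would never reach the target.
    if [ "${step}" -le 0 ]; then
        echo "step must be greater than zero" >&2
        return 1
    fi
    while [ "${current}" -gt "${target}" ]; do
        current=$((current - step))
        if [ "${current}" -lt "${target}" ]; then
            current="${target}"
        fi
        printf 'aws eks update-nodegroup-config --cluster-name "${CLUSTER_NAME}" --nodegroup-name "${NODEGROUP}" --scaling-config "minSize=%s,desiredSize=%s"\n' \
            "${target}" "${current}"
    done
}

# Example: from 10 instances down to 2, three at a time.
plan_scale_down 10 2 3
```

Between steps, watch pod status and Karpenter's logs, and continue only once the displaced workloads are running again.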
 
+
 ## Verify Karpenter
 
 As nodegroup nodes are drained you can verify that Karpenter is creating nodes for your workloads.
 
 ```bash
-kubectl logs -f -n karpenter -c controller -l app.kubernetes.io/name=karpenter
+kubectl logs -f -n "${KARPENTER_NAMESPACE}" -c controller -l app.kubernetes.io/name=karpenter
 ```
 
-You should also see new nodes created in your cluster as the old nodes are removed
+You should also see new nodes created in your cluster as the old nodes are removed.
 
 ```bash
 kubectl get nodes

@@ -4,7 +4,7 @@ OIDC_ENDPOINT="$(aws eks describe-cluster --name "${CLUSTER_NAME}" \
     --query "cluster.identity.oidc.issuer" --output text)"
 AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' \
     --output text)
-K8S_VERSION=1.28
+TEMPOUT="$(mktemp)"
 ARM_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-arm64/recommended/image_id --query Parameter.Value --output text)"
 AMD_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2/recommended/image_id --query Parameter.Value --output text)"
 GPU_AMI_ID="$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2-gpu/recommended/image_id --query Parameter.Value --output text)"
@@ -0,0 +1,6 @@
+curl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml  > "${TEMPOUT}" \
+&& aws cloudformation deploy \
+  --stack-name "Karpenter-${CLUSTER_NAME}" \
+  --template-file "${TEMPOUT}" \
+  --capabilities CAPABILITY_NAMED_IAM \
+  --parameter-overrides "ClusterName=${CLUSTER_NAME}"
@@ -0,0 +1,26 @@
+cat << EOF > controller-trust-policy.json
+{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Effect": "Allow",
+            "Principal": {
+                "Federated": "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ENDPOINT#*//}"
+            },
+            "Action": "sts:AssumeRoleWithWebIdentity",
+            "Condition": {
+                "StringEquals": {
+                    "${OIDC_ENDPOINT#*//}:aud": "sts.amazonaws.com",
+                    "${OIDC_ENDPOINT#*//}:sub": "system:serviceaccount:${KARPENTER_NAMESPACE}:karpenter"
+                }
+            }
+        }
+    ]
+}
+EOF
+
+aws iam create-role --role-name "KarpenterControllerRole-${CLUSTER_NAME}" \
+    --assume-role-policy-document file://controller-trust-policy.json
+
+aws iam attach-role-policy --role-name "KarpenterControllerRole-${CLUSTER_NAME}" \
+    --policy-arn "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}"