From 6ea6132622c4c9df67aa6c465e6870b67faffebc Mon Sep 17 00:00:00 2001
From: Jonathan Innis
Date: Fri, 15 Sep 2023 12:57:59 -0700
Subject: [PATCH] docs: [RFC] Add v1beta1 API RFC (#4274)

Co-authored-by: Tim Bannister
---
 designs/v1beta1-api.md             | 355 +++++++++++++++++++++++++++++
 designs/v1beta1-full-changelist.md | 201 ++++++++++++++++
 2 files changed, 556 insertions(+)
 create mode 100644 designs/v1beta1-api.md
 create mode 100644 designs/v1beta1-full-changelist.md

diff --git a/designs/v1beta1-api.md b/designs/v1beta1-api.md
new file mode 100644
index 000000000000..706e19394da5
--- /dev/null
+++ b/designs/v1beta1-api.md
@@ -0,0 +1,355 @@
# Karpenter v1beta1 APIs

This document formalizes the [v1beta1 laundry list](https://github.com/aws/karpenter/issues/1327) and describes the high-level migration strategy for users moving to v1beta1. It shows the full API specs, including Group/Kind names and label names. This document does not go into explicit detail on each of the individual changes in v1beta1; for details on those changes, see [Karpenter v1beta1 Full Change List](./v1beta1-full-changelist.md).

## Bake Time

API changes create a user migration burden that should be weighed against the benefits of the breaking changes. Batching breaking changes into a single version bump **helps to minimize this burden**. The v1alpha5 API has seen broad adoption over the last year and has generated a large amount of feedback. This period has been a critical maturation process for the Karpenter project, and it has given us confidence that the changes in v1beta1 will be ready to promote after a shorter feedback period.

## Migration

Kubernetes custom resources have built-in support for API version compatibility. CRDs with multiple versions must define a “storage version”, which controls the data stored in etcd. Other versions are views onto this data and are converted using [conversion webhooks](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#webhook-conversion). However, there is a fundamental limitation that [all versions must be safely round-trippable through each other](https://book.kubebuilder.io/multiversion-tutorial/api-changes.html). This means that it must be possible to define a function that converts a v1alpha5 Provisioner into a v1beta1 Provisioner and vice versa.

Unfortunately, multiple proposed changes in v1beta1 are not round-trippable. Below, we propose deprecating legacy fields in favor of more modern mechanisms that have already seen adoption in v1alpha5. These changes remove sharp edges that regularly surprise users and cause production pain.

To work around the round-tripping limitation, we are proposing new Kind names (`NodePool`, `NodeClaim`, and `EC2NodeClass`) for the CRDs. This allows both sets of CRDs to exist alongside each other simultaneously and gives users a natural migration path.

### Migration Path

The paths below describe a few migration options at a high level. They are not comprehensive, but they offer good guidance for users migrating from the v1alpha5 APIs to the v1beta1 APIs.

#### Periodic Rolling with Drift

For each Provisioner in your cluster, perform the following actions:

1. Create a NodePool/NodeClass in your cluster that is the v1beta1 equivalent of the v1alpha5 Provisioner/AWSNodeTemplate
2. Add a taint to the old Provisioner such as `karpenter.sh/legacy=true:NoSchedule`
3. Karpenter drift will mark all machines/nodes owned by that Provisioner as drifted
4. Karpenter drift will launch replacements for those nodes in the new NodePool resource
   1. Currently, Karpenter only rolls one node at a time, so it may take some time for Karpenter to completely roll all nodes under a single Provisioner
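For illustration, step 2 might look like the following on a v1alpha5 Provisioner (a minimal sketch; the Provisioner name and the elided fields are placeholders for your own configuration):

```
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default                  # your existing Provisioner
spec:
  # ...existing requirements, providerRef, limits, etc. left unchanged
  taints:
    - key: karpenter.sh/legacy   # taint added for migration (step 2)
      value: "true"
      effect: NoSchedule
```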
#### Forced Deletion

For each Provisioner in your cluster, perform the following actions:

1. Create a NodePool/NodeClass in your cluster that is the v1beta1 equivalent of the v1alpha5 Provisioner/AWSNodeTemplate
2. Delete the old Provisioner with `kubectl delete provisioner <provisioner-name> --cascade=foreground`
   1. Karpenter will delete each Node owned by the Provisioner, draining all nodes simultaneously, and will launch nodes for the newly pending pods as soon as the old Nodes enter a draining state

#### Manual Rolling

For each Provisioner in your cluster, perform the following actions:

1. Create a NodePool/NodeClass in your cluster that is the v1beta1 equivalent of the v1alpha5 Provisioner/AWSNodeTemplate
2. Add a taint to the old Provisioner such as `karpenter.sh/legacy=true:NoSchedule`
3. Delete the nodes owned by the Provisioner one at a time by running `kubectl delete node <node-name>`

## APIs

To clearly define where configuration should live within Karpenter’s API, we define the logical boundary between each Kind in the project:

1. `NodePool`
   1. Neutral, node configuration-based fields that affect the **compatibility between Nodes and Pods during scheduling** (e.g. requirements, taints, labels)
   2. Neutral, behavior-based fields for configuring Karpenter’s scheduling and deprovisioning decision-making
2. `EC2NodeClass`
   1. Cloud provider-specific, node configuration-based fields that affect the launch and bootstrap process for that Node, including startup scripts, volume mappings, and metadata settings
   2. Cloud provider-specific, behavior-based fields for configuring Karpenter’s scheduling and deprovisioning decision-making (e.g. interruption-based disruption, allocation strategy)
3. `NodeClaim`
   1. A Karpenter management object that fully manages the lifecycle of a single node, including configuring and launching the node, monitoring node health (including disruption conditions), and handling the deprovisioning and termination of the node

With these boundaries defined, each API is shown below with all fields specified and example values filled in.
### `karpenter.sh/NodePool`

```
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        billing-team: my-team
      annotations:
        example.com/owner: "my-team"
    spec:
      nodeClass:
        name: default
        kind: EC2NodeClass
        apiVersion: karpenter.k8s.aws/v1beta1
      taints:
        - key: example.com/special-taint
          effect: NoSchedule
      startupTaints:
        - key: example.com/another-taint
          effect: NoSchedule
      requirements:
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"]
      resources:
        requests:
          cpu: "1"
          memory: "100Mi"
      kubeletConfiguration:
        clusterDNS: ["10.0.1.100"]
        containerRuntime: containerd
        systemReserved:
          cpu: 100m
          memory: 100Mi
          ephemeral-storage: 1Gi
        kubeReserved:
          cpu: 200m
          memory: 100Mi
          ephemeral-storage: 3Gi
        evictionHard:
          memory.available: 5%
          nodefs.available: 10%
          nodefs.inodesFree: 10%
        evictionSoft:
          memory.available: 500Mi
          nodefs.available: 15%
          nodefs.inodesFree: 15%
        evictionSoftGracePeriod:
          memory.available: 1m
          nodefs.available: 1m30s
          nodefs.inodesFree: 2m
        evictionMaxPodGracePeriod: 60
        imageGCHighThresholdPercent: 85
        imageGCLowThresholdPercent: 80
        cpuCFSQuota: true
        podsPerCore: 2
        maxPods: 20
  disruption:
    consolidateAfter: 10m
    consolidationPolicy: WhenEmpty | WhenUnderutilized
    expireAfter: 30d
  weight: 10
  limits:
    cpu: "1000"
    memory: 1000Gi
status:
  resources:
    cpu: "2"
    memory: "100Mi"
    ephemeral-storage: "100Gi"
```
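As an aside on how the template’s labels and taints shape scheduling, a Pod like the one below could land on (and trigger provisioning of) nodes from this NodePool, since the node label it selects is stamped by the template and the NodePool’s taint is tolerated (a sketch; the workload name and image are placeholders):

```
apiVersion: v1
kind: Pod
metadata:
  name: billing-app              # hypothetical workload
spec:
  nodeSelector:
    billing-team: my-team        # matches the label applied to nodes from this NodePool
  tolerations:
    - key: example.com/special-taint
      operator: Exists
      effect: NoSchedule         # tolerates the NodePool's taint
  containers:
    - name: app
      image: busybox             # placeholder image
      command: ["sleep", "infinity"]
```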
### `karpenter.k8s.aws/EC2NodeClass`

```
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  amiSelectorTerms:
    - tags:
        key: value
    - id: abc-123
    - name: foo
      owner: amazon
  subnetSelectorTerms:
    - tags:
        compute.k8s.aws/discovery: cluster-name
    - id: subnet-1234
  securityGroupSelectorTerms:
    - tags:
        compute.k8s.aws/discovery: cluster-name
    - name: default-security-group
  role: karpenter-node-role
  userData: |
    echo "this is custom user data"
  tags:
    custom-tag: custom-value
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 20Gi
        volumeType: gp3
        encrypted: true
  detailedMonitoring: true
status:
  subnets:
    - id: subnet-0a462d98193ff9fac
      zone: us-east-2b
    - id: subnet-0322dfafd76a609b6
      zone: us-east-2c
    - id: subnet-0727ef01daf4ac9fe
      zone: us-east-2b
    - id: subnet-00c99aeafe2a70304
      zone: us-east-2a
    - id: subnet-023b232fd5eb0028e
      zone: us-east-2c
    - id: subnet-03941e7ad6afeaa72
      zone: us-east-2a
  securityGroups:
    - id: sg-041513b454818610b
      name: ClusterSharedNodeSecurityGroup
    - id: sg-0286715698b894bca
      name: ControlPlaneSecurityGroup-1AQ073TSAAPW
  amis:
    - id: ami-05a05e85b17bb60d7
      name: amazon-eks-node-1.24-v20230703
      requirements:
        - key: karpenter.k8s.aws/instance-accelerator-count
          operator: DoesNotExist
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: karpenter.k8s.aws/instance-gpu-count
          operator: DoesNotExist
    - id: ami-0d849ef1e65103147
      name: amazon-eks-gpu-node-1.24-v20230703
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: karpenter.k8s.aws/instance-accelerator-count
          operator: Exists
    - id: ami-0d849ef1e65103147
      name: amazon-eks-gpu-node-1.24-v20230703
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: karpenter.k8s.aws/instance-gpu-count
          operator: Exists
    - id: ami-0c3487f30d003deb3
      name: amazon-eks-arm64-node-1.24-v20230703
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values:
            - arm64
        - key: karpenter.k8s.aws/instance-gpu-count
          operator: DoesNotExist
        - key: karpenter.k8s.aws/instance-accelerator-count
          operator: DoesNotExist
```

### `karpenter.sh/NodeClaim`

```
apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
metadata:
  name: default
  labels:
    billing-team: my-team
  annotations:
    example.com/owner: "my-team"
spec:
  nodeClass:
    name: default
    kind: EC2NodeClass
    apiVersion: compute.k8s.aws/v1beta1
  taints:
    - key: example.com/special-taint
      effect: NoSchedule
  startupTaints:
    - key: example.com/another-taint
      effect: NoSchedule
  requirements:
    - key: "karpenter.k8s.aws/instance-category"
      operator: In
      values: ["c", "m", "r"]
  resources:
    requests:
      cpu: "1"
      memory: "100Mi"
  kubeletConfiguration:
    clusterDNS: ["10.0.1.100"]
    containerRuntime: containerd
    systemReserved:
      cpu: 100m
      memory: 100Mi
      ephemeral-storage: 1Gi
    kubeReserved:
      cpu: 200m
      memory: 100Mi
      ephemeral-storage: 3Gi
    evictionHard:
      memory.available: 5%
      nodefs.available: 10%
      nodefs.inodesFree: 10%
    evictionSoft:
      memory.available: 500Mi
      nodefs.available: 15%
      nodefs.inodesFree: 15%
    evictionSoftGracePeriod:
      memory.available: 1m
      nodefs.available: 1m30s
      nodefs.inodesFree: 2m
    evictionMaxPodGracePeriod: 60
    imageGCHighThresholdPercent: 85
    imageGCLowThresholdPercent: 80
    cpuCFSQuota: true
    podsPerCore: 2
    maxPods: 20
status:
  allocatable:
    cpu: 1930m
    ephemeral-storage: 17Gi
    memory: 534108Ki
    pods: "4"
  capacity:
    cpu: "2"
    ephemeral-storage: 20Gi
    memory: 942684Ki
    pods: "4"
  conditions:
    - type: Drifted
      status: "True"
      severity: Warning
    - status: "True"
      type: Initialized
    - status: "True"
      type: Launched
    - status: "True"
      type: Registered
    - status: "True"
      type: Ready
  nodeName: ip-192-168-62-137.us-west-2.compute.internal
  providerID: aws:///us-west-2a/i-08168021ae532fca3
```
### Labels/Annotations

#### `karpenter.sh`

1. `karpenter.sh/nodepool`
2. `karpenter.sh/initialized`
3. `karpenter.sh/registered`
4. `karpenter.sh/capacity-type`
5. `karpenter.sh/do-not-disrupt`

#### `karpenter.k8s.aws`

1. `karpenter.k8s.aws/instance-hypervisor`
2. `karpenter.k8s.aws/instance-encryption-in-transit-supported`
3. `karpenter.k8s.aws/instance-category`
4. `karpenter.k8s.aws/instance-family`
5. `karpenter.k8s.aws/instance-generation`
6. `karpenter.k8s.aws/instance-local-nvme`
7. `karpenter.k8s.aws/instance-size`
8. `karpenter.k8s.aws/instance-cpu`
9. `karpenter.k8s.aws/instance-memory`
10. `karpenter.k8s.aws/instance-network-bandwidth`
11. `karpenter.k8s.aws/instance-gpu-name`
12. `karpenter.k8s.aws/instance-gpu-manufacturer`
13. `karpenter.k8s.aws/instance-gpu-count`
14. `karpenter.k8s.aws/instance-gpu-memory`
15. `karpenter.k8s.aws/instance-accelerator-name`
16. `karpenter.k8s.aws/instance-accelerator-manufacturer`
17. `karpenter.k8s.aws/instance-accelerator-count`

diff --git a/designs/v1beta1-full-changelist.md b/designs/v1beta1-full-changelist.md
new file mode 100644
index 000000000000..4d22b625bcaf
--- /dev/null
+++ b/designs/v1beta1-full-changelist.md
@@ -0,0 +1,201 @@
# Karpenter v1beta1 Full Change List

This document formalizes the [v1beta1 laundry list](https://github.com/aws/karpenter/issues/1327) into a full list of the changes coming in the migration from Karpenter’s `v1alpha5` APIs to `v1beta1`. It describes only the necessary changes and the rationale behind them. For a high-level overview of the API specs, see the [Karpenter v1beta1 Graduation](./v1beta1-api.md) design doc.

### Update Kind/Group Naming

As part of the bump to v1beta1, and to allow the v1alpha5 APIs to exist alongside the v1beta1 APIs while users migrate, the following Kind renames are proposed:

1. `Provisioner` → `NodePool`
2. `Machine` → `NodeClaim`
3. `AWSNodeTemplate` → `EC2NodeClass`

We see the renames as an opportunity to better align our API groups and kinds with upstream concepts and to reduce confusion with other Kubernetes API concepts. Specifically, the word `Provisioner` (on its own) has become overloaded in Kubernetes, [particularly in the area of storage provisioning](https://kubernetes.io/docs/concepts/storage/storage-classes/#the-storageclass-resource). We want to move away from this naming entirely, while also prefixing all of our kinds that apply to nodes with `Node` for better alignment and clarity across the project.

This gives the following naming to API types within the Karpenter project:

1. `karpenter.sh/NodePool`
2. `karpenter.sh/NodeClaim`
3. `karpenter.k8s.aws/EC2NodeClass`

### Remove Validation/Mutating Webhooks in favor of CEL (Common Expression Language)

The Karpenter maintainer team has seen an increase in the number of issues related to its webhooks ([#4415](https://github.com/aws/karpenter/issues/4415), [#3598](https://github.com/aws/karpenter/issues/3598), [#2902](https://github.com/aws/karpenter/issues/2902), [#4154](https://github.com/aws/karpenter/issues/4154), [#4106](https://github.com/aws/karpenter/issues/4016), [#3224](https://github.com/aws/karpenter/issues/3224), [#1729](https://github.com/aws/karpenter/issues/1729), ...), which leads us to believe that we should look for ways to remove these webhooks from the project.

Kubernetes 1.23 introduced `CustomResourceValidationExpressions` in alpha, and the feature graduated to beta in 1.25. It allows CRD validation expressions to be written directly in the CRD OpenAPI spec, without any need for validating webhooks to perform custom validation. EKS supports CEL starting with Kubernetes version 1.25.

Karpenter v1beta1 will introduce CEL into the CRD OpenAPI spec while maintaining the webhooks until support for EKS versions <= 1.24 is dropped. At that point, we will drop support for the webhooks and rely solely on CEL for validation.

### Label Changes

#### `karpenter.sh/do-not-evict` → `karpenter.sh/do-not-disrupt`

Karpenter validates disruption across NodeClaims and determines which NodeClaims/Nodes it is allowed to disrupt as part of its disruption flow. While eviction is part of the termination process, it’s more accurate to say that the `karpenter.sh/do-not-evict` annotation prevents Karpenter from disrupting the NodeClaim/Node rather than preventing the eviction itself.
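To illustrate the pod-level opt-out under v1beta1, a Pod might carry the new annotation as follows (a sketch; the pod name and image are placeholders, and the v1alpha5 equivalent was `karpenter.sh/do-not-evict: "true"`):

```
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker                        # hypothetical pod
  annotations:
    karpenter.sh/do-not-disrupt: "true"     # blocks Karpenter from voluntarily disrupting the node while this pod runs
spec:
  containers:
    - name: worker
      image: busybox                        # placeholder image
      command: ["sleep", "3600"]
```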
#### `karpenter.sh/do-not-consolidate` → `karpenter.sh/do-not-disrupt`

Karpenter currently surfaces the `karpenter.sh/do-not-consolidate` annotation to block consolidation actions against individual nodes without requiring changes to the owning Provisioner. We have found this useful for users with one-off reasons to block consolidation, including debugging failures on nodes.

While this feature is useful for consolidation, it should be expanded to all disruption mechanisms, so that we have both pod-level and node-level controls to block disruption using the `karpenter.sh/do-not-disrupt` annotation.

### `NodePool` Changes

#### `spec` → `spec.template`

Currently, fields that control node properties, such as `Labels`, `Taints`, `StartupTaints`, `Requirements`, `KubeletConfiguration`, and `ProviderRef`, are top-level members of `provisioner.spec`. We can draw a clean line between:

1. Behavior-based fields that dictate how Karpenter should act on nodes
2. Configuration-based fields that dictate how NodeClaims/Nodes should look

With this split, behavior-based fields live at the top level of the `NodePool` `spec`, and configuration-based fields live within `spec.template`.

This interface is also very similar to the Deployment/StatefulSet/Job relationship, where a top-level object spawns templatized versions of lower-level objects. In our case, the top-level object is the `NodePool` and the lower-level object is the `NodeClaim` (with the `Node` joining the cluster as a side effect of the `NodeClaim`).

```
spec:
  weight: ...
  limits: ...
  template:
    metadata:
      labels: ...
      annotations: ...
    spec:
      taints: ...
      startupTaints: ...
      requirements: ...
      providerRef: ...
  disruption:
    expireAfter: ...
    consolidateAfter: ...
    consolidationPolicy: ...
```
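For contrast, the same fields on a v1alpha5 `Provisioner` all sit at a single level of `spec` (a rough sketch with values elided; field names reflect the v1alpha5 API as we understand it):

```
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # node configuration and Karpenter behavior fields are mixed at one level
  labels: ...
  taints: ...
  startupTaints: ...
  requirements: ...
  kubeletConfiguration: ...
  providerRef: ...
  weight: ...
  limits: ...
  ttlSecondsAfterEmpty: ...
  ttlSecondsUntilExpired: ...
```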
#### `spec.ttl...` → `spec.disruption...`

Karpenter plans to expand the amount of control that it gives users over both the aggressiveness of disruption and when disruption can take place. As part of these upcoming changes, more fields within the `NodePool` API will pertain to disruption configuration.

We can better delineate the fields that specifically pertain to this configuration from the other fields in the `spec` (global behavior-based fields, provisioning-specific fields, static node configuration fields) by moving them inside a `disruption` block. This makes it clearer to users which configuration options specifically pertain to scale-down when they are configuring their `NodePool` CRs.

#### `spec.ttlSecondsAfterEmpty` → `spec.disruption.consolidationPolicy`

Currently, Karpenter has two mutually exclusive ways to deprovision nodes based on emptiness: `ttlSecondsAfterEmpty` and `consolidation`. We have generally seen users of `ttlSecondsAfterEmpty` configure the field in one of two ways:

1. `ttlSecondsAfterEmpty=0` → Users want to delete nodes as soon as Karpenter sees that they are empty
2. `ttlSecondsAfterEmpty >> 0` → Users want to delete empty nodes but want to reduce the node churn that results from high pod churn on a larger cluster

We anticipate that both of these scenarios can be captured through the consolidation disruption mechanism; however, we understand that there are use cases where a user may want to reduce the aggressiveness of Karpenter’s disruption and only disrupt empty nodes. In this case, a user can set the `consolidationPolicy` to `WhenEmpty`, which tells the consolidation disruption mechanism to deprovision only empty nodes. Alternatively, a `consolidationPolicy` of `WhenUnderutilized` allows consolidation to deprovision both empty and underutilized nodes.

If `consolidationPolicy` is not set, Karpenter will implicitly default to `WhenUnderutilized`.

#### `spec.ttlSecondsAfterEmpty` → `spec.disruption.consolidateAfter`

While `consolidationPolicy` offers one mechanism for users to control the aggressiveness of disruption, users that enable a `consolidationPolicy` of `WhenEmpty` or `WhenUnderutilized` may still want to dictate how quickly nodes are deemed consolidatable. This is particularly true on large clusters with a large amount of pod churn. To support this, Karpenter will surface a `consolidateAfter` field: a per-node TTL that defines how long after a node first becomes eligible for consolidation Karpenter may begin disrupting it.

#### `spec.ttlSecondsUntilExpired` → `spec.disruption.expireAfter`

Karpenter will change the `ttlSecondsUntilExpired` field to `expireAfter` to align with the `consolidateAfter` field in the `disruption` block.
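A rough mapping of the old fields onto the new block, assuming a user who previously relied on `ttlSecondsAfterEmpty` and `ttlSecondsUntilExpired` (values are illustrative):

```
# Before: v1alpha5 Provisioner
spec:
  ttlSecondsAfterEmpty: 30          # delete empty nodes after 30 seconds
  ttlSecondsUntilExpired: 2592000   # expire nodes after 30 days
---
# After: v1beta1 NodePool
spec:
  disruption:
    consolidationPolicy: WhenEmpty  # only consider empty nodes for consolidation
    consolidateAfter: 30s           # replaces ttlSecondsAfterEmpty: 30
    expireAfter: 30d                # replaces ttlSecondsUntilExpired: 2592000
```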
#### Remove `spec.provider`

We’ve recommended that customers use `spec.providerRef` instead of `spec.provider` since Q2 2022, and documentation for `spec.provider` has been removed since Q3 2022. We will take the opportunity to remove the field entirely to minimize code bugs/complexity and user confusion.

### `EC2NodeClass` Changes

#### Update `spec.amiSelector`

The alpha `amiSelector` has two primary limitations that restrict users’ ability to specify the AMIs that they want Karpenter to use:

1. Users can only specify requirements that are “ANDed” together, meaning that if a user has an orthogonal set of tags that they want to match their images against, they have to specify the images by `aws::ids` directly, since there is no way with the current tag-selection logic to express those values
2. Users want more flexibility, such as specifying a name/owner combination for images. Users have generally been asking Karpenter to adhere more closely to the EC2 APIs in our amiSelector design so that they can use the built-in filtering for AMIs, instead of having to use custom tagging to achieve the same outcome
   1. To support *some* of these use cases, Karpenter has effectively begun creating “system tags” (`aws::ids`, `aws::owners`, `aws::name`). These are special-cased versions of the standard user custom tags that allow users to achieve the scenarios described in #2; however, they are not easily discoverable or well understood, and if we are beginning to support special cases like this, it makes sense to start structuring these fields.

```
amiSelectorTerms:
- name: foo
  id: abc-123
  owner: amazon
  tags:
    key: value
# Selector Terms are ORed
- name: foo
  id: abc-123
  owner: self
  tags:
    key: value
```

#### Update `spec.subnetSelector`

`subnetSelectorTerms` should have parity with `amiSelectorTerms` in its design to improve ease of use. As a result, we should design `subnetSelectorTerms` in the same spirit as `amiSelectorTerms`, so that you can specify multiple selectors through `tags` and `ids` that are ORed together to produce the final set of subnets you want to use.

```
subnetSelectorTerms:
- id: abc-123
  tags:
    key: value
# Selector Terms are ORed
- id: abc-123
  tags:
    key: value
```

#### Update `spec.securityGroupSelector`

The same logic that applies to `subnetSelectorTerms` applies to `securityGroupSelectorTerms`. We should have parity with `amiSelectorTerms` to improve the ease of use of this selector.

```
securityGroupSelectorTerms:
- id: abc-123
  tags:
    use: private-subnet
# Selector Terms are ORed
- name: custom-security-group-b # not the same as the "Name" tag
  tags:
    use: private-subnet
- tags:
    use: private-subnet
    Name: custom-security-group-c # not the same as the "name" field
```

#### Remove `spec.launchTemplate`

Direct launch template support is problematic for many reasons, outlined in the design [Unmanaged LaunchTemplate Support for Karpenter](./unmanaged-launch-template-removal.md). Customers continue to run into issues when using launch templates directly. Rather than continue to maintain these sharp edges and give users a half-baked experience of Karpenter, we should remove this field; we can always add it back later if there is enough demand from users.

#### `spec.instanceProfile` → `spec.role`

Currently, Karpenter uses the `instanceProfile` referenced in the `AWSNodeTemplate` to determine the profile that the EC2 node should launch with. Instance profiles are IAM entities that are specific to EC2 and do not have much tooling built around them (including console support); users are generally more familiar with the concept of IAM roles. As a result, we can support a `role` in the new `EC2NodeClass` and have Karpenter provision the instance profile ad hoc, with the specified `role` attached to it.
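A sketch of the change (the profile and role names below are hypothetical placeholders):

```
# Before: v1alpha5 AWSNodeTemplate
spec:
  instanceProfile: KarpenterNodeInstanceProfile-my-cluster   # user-managed instance profile
---
# After: v1beta1 EC2NodeClass
spec:
  role: KarpenterNodeRole-my-cluster   # Karpenter creates and manages the instance profile for this role
```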
#### Remove tag-based AMI Requirements

[Tag-based AMI requirements](https://karpenter.sh/docs/concepts/node-templates/#ami-selection) allow users to tag their AMIs with EC2 tags to express “In” requirements on the images they select. This lets a user specify that a given AMI should be used *only* for a given instance type, instance size, etc. The downside of this feature is that there is no way to represent “NotIn”-based requirements in the current state, which means that there is no way to *exclude* an instance type, size, etc. from using a different AMI.

#### Example

Take the following example with AMI “a” and AMI “b”:

1. AMI “a”
   1. Tagged with `node.kubernetes.io/instance-type: c5.large`
2. AMI “b”
   1. No tags

If Karpenter were to launch a “c5.xlarge” in this example, it would be guaranteed to get AMI “b”, since AMI “a” does not satisfy the compatibility requirement for that instance type; however, if Karpenter were to launch a “c5.large”, that instance type satisfies both AMI “a” and AMI “b”, meaning that the AMI chosen could fluctuate based on the creation dates of the selected AMIs.

This functionality of Karpenter hasn’t been surfaced widely at this point, and the **current state of the feature is effectively unusable and not well-tested**. We should remove this feature and consider adding a `requirements` key as part of the `spec.amiSelector` logic in the future if users need this kind of requirement-based logic.

### `karpenter-global-settings` Changes

#### Deprecate `defaultInstanceProfile` in `karpenter-global-settings`

The instance profile, subnet selector, and security group selector are all required to launch nodes. Currently, the instance profile is set in the global settings, but `subnetSelector` and `securityGroupSelector` are not. This is awkward and [doesn't provide a consistent experience for users](https://github.com/aws/karpenter/issues/2973). We should align all of this configuration at the `EC2NodeClass` and `Provisioner` level to streamline the user experience.

#### Deprecate `tags` from `karpenter-global-settings` in favor of `nodeClass.spec.tags`

Having `tags` inside `karpenter-global-settings` makes it difficult to detect drift when these tag values change. Since the primary reason this field exists inside `karpenter-global-settings` is ease of use, and there is a simple workaround for customers (setting consistent tags inside each `EC2NodeClass`), it makes sense to remove it from `karpenter-global-settings`.

#### Remove `aws.enablePodENI` from `karpenter-global-settings`

This value no longer has any meaning now that our initialization logic does not rely on it. It can be removed from `karpenter-global-settings` without impacting users.

#### Deprecate `aws.enableENILimitedPodDensity` in `karpenter-global-settings`

Static pod density can be configured through `nodePool.spec.template.spec.kubeletConfiguration.maxPods`, so there is no need for this setting to be configured at a global level anymore.
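Taken together, the settings leaving `karpenter-global-settings` land on the resources themselves. A sketch of where they might live under v1beta1 (names and values are illustrative, and unrelated required fields are omitted):

```
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  role: karpenter-node-role      # replaces the global defaultInstanceProfile
  tags:
    team: my-team                # replaces the global tags setting
  # amiFamily, subnetSelectorTerms, securityGroupSelectorTerms omitted for brevity
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClass:
        name: default
        kind: EC2NodeClass
        apiVersion: karpenter.k8s.aws/v1beta1
      kubeletConfiguration:
        maxPods: 110             # static pod density, replacing aws.enableENILimitedPodDensity
```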