Doc refining.
nikmohan committed Jan 12, 2024
1 parent b15b381 commit abfa71f
Showing 1 changed file with 10 additions and 6 deletions.
16 changes: 10 additions & 6 deletions website/content/en/preview/concepts/disruption.md
@@ -113,17 +113,21 @@ Events:
Normal Unconsolidatable 33s (x3 over 30m) karpenter can't replace with a cheaper node
```
{{% alert title="Warning" color="warning" %}}
Using preferred anti-affinity and topology spreads can reduce the effectiveness of consolidation. At node launch, Karpenter attempts to satisfy affinity and topology spread preferences. In order to reduce node churn, consolidation must also attempt to satisfy these constraints to avoid immediately consolidating nodes after they launch. This means that consolidation may not disrupt nodes in order to avoid violating preferences, even if kube-scheduler can fit the host pods elsewhere. Karpenter reports these pods via logging to bring awareness to the possible issues they can cause (e.g. `pod default/inflate-anti-self-55894c5d8b-522jd has a preferred Anti-Affinity which can prevent consolidation`).
{{% /alert %}}
{{% alert title="Note" color="primary" %}}
#### Spot consolidation
For spot nodes, Karpenter has deletion consolidation enabled by default. If you would like to enable replacement with spot consolidation, you need to enable the feature through the [`SpotToSpotConsolidation` feature flag]({{<ref "../reference/settings#features-gates" >}}).
Cheaper spot instance types are selected with the [`price-capacity-optimized` strategy](https://aws.amazon.com/blogs/compute/introducing-price-capacity-optimized-allocation-strategy-for-ec2-spot-instances/) and often the cheapest spot instance type is not launched due to the likelihood of interruption; therefore, Karpenter uses the number of available instance type options cheaper than the currently launched spot instance as a heuristic for evaluating whether it should launch a replacement for the current spot node.
We refer to the number of instances that Karpenter has within its launch decision as a launch's "instance type flexibility." When Karpenter is considering performing a spot-to-spot consolidation replacement, performing a replacement from a single node to another node, it will check whether replacing the instance type will lead to enough instance type flexibility in the subsequent launch request. That is -- can Karpenter find enough cheaper options than the current instance type option that Karpenter can: 1) Be assured that it won't continually consolidate down to the cheapest spot instance which might have very poor availability and 2) Be assured that the launch with the new instance types will have enough flexibility that an instance type with good enough availability comparable to the current instance type will be chosen.
We refer to the number of instances that Karpenter has within its launch decision as a launch's "instance type flexibility." When Karpenter is considering performing a spot-to-spot consolidation replacement, performing a replacement from a single node to another node, it will check whether replacing the instance type will lead to enough instance type flexibility in the subsequent launch request. That is -- can Karpenter find enough cheaper options than the current instance type option that Karpenter can:
1) Be assured that it won't continually consolidate down to the cheapest spot instance which might have very poor availability and
2) Be assured that the launch with the new instance types will have enough flexibility that an instance type with good enough availability comparable to the current instance type will be chosen.
Karpenter requires a minimum instance type flexibility of 15 instance types when performing single-node spot-to-spot consolidations (1 node to 1 node). It does not apply the same flexibility requirement to multi-node spot-to-spot consolidations (many nodes to 1 node), since consolidating several nodes into one without requiring flexibility won't lead to "race to the bottom" scenarios.
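The flexibility check described above can be sketched roughly as follows. This is an illustrative sketch only, not Karpenter's actual implementation: the `instanceOption` type and the `cheaperOptions` and `canReplaceSpotNode` functions are hypothetical names, and the only value taken from the text is the minimum of 15 instance types for single-node replacements.

```go
package main

// minSpotToSpotFlexibility mirrors the figure of 15 instance types quoted in the text.
const minSpotToSpotFlexibility = 15

// instanceOption is a hypothetical, simplified stand-in for a spot instance type offering.
type instanceOption struct {
	Name  string
	Price float64
}

// cheaperOptions returns the candidates priced strictly below currentPrice.
func cheaperOptions(candidates []instanceOption, currentPrice float64) []instanceOption {
	var out []instanceOption
	for _, c := range candidates {
		if c.Price < currentPrice {
			out = append(out, c)
		}
	}
	return out
}

// canReplaceSpotNode reports whether a spot-to-spot replacement would be allowed
// under the rule described above (assumed logic, simplified):
//   - there must be at least one cheaper option at all;
//   - a single-node replacement (1 node to 1 node) needs at least
//     minSpotToSpotFlexibility cheaper options;
//   - a multi-node replacement (many nodes to 1 node) has no such minimum.
func canReplaceSpotNode(candidates []instanceOption, currentPrice float64, nodesBeingReplaced int) bool {
	cheaper := cheaperOptions(candidates, currentPrice)
	if len(cheaper) == 0 {
		return false
	}
	if nodesBeingReplaced > 1 {
		return true
	}
	return len(cheaper) >= minSpotToSpotFlexibility
}
```

With 10 cheaper options, this sketch rejects a 1-to-1 replacement but accepts a many-to-1 replacement; with 20 cheaper options it accepts both.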
{{% alert title="Warning" color="warning" %}}
Using preferred anti-affinity and topology spreads can reduce the effectiveness of consolidation. At node launch, Karpenter attempts to satisfy affinity and topology spread preferences. In order to reduce node churn, consolidation must also attempt to satisfy these constraints to avoid immediately consolidating nodes after they launch. This means that consolidation may not disrupt nodes in order to avoid violating preferences, even if kube-scheduler can fit the host pods elsewhere. Karpenter reports these pods via logging to bring awareness to the possible issues they can cause (e.g. `pod default/inflate-anti-self-55894c5d8b-522jd has a preferred Anti-Affinity which can prevent consolidation`).
{{% /alert %}}
### Drift
Drift handles changes to the NodePool/EC2NodeClass. For Drift, values in the NodePool/EC2NodeClass are reflected in the NodeClaimTemplateSpec/EC2NodeClassSpec in the same way that they’re set. A NodeClaim will be detected as drifted if the values in its owning NodePool/EC2NodeClass do not match the values in the NodeClaim. Similar to the upstream `deployment.spec.template` relationship to pods, Karpenter will annotate the owning NodePool and EC2NodeClass with a hash of the NodeClaimTemplateSpec to check for drift. Some special cases will be discovered either from Karpenter or through the CloudProvider interface, triggered by NodeClaim/Instance/NodePool/EC2NodeClass changes.
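As a rough illustration of hash-based drift detection, the sketch below hashes a template spec and compares the hash recorded on the owner with the one recorded on the NodeClaim. Everything here is an assumption made for illustration: the `templateSpec` fields, the `example.com/template-hash` annotation key, and the use of JSON plus SHA-256 are hypothetical and are not Karpenter's actual types, annotation names, or hashing method.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// templateSpec is a hypothetical, heavily simplified stand-in for a NodeClaimTemplateSpec.
type templateSpec struct {
	InstanceFamilies []string          `json:"instanceFamilies"`
	Labels           map[string]string `json:"labels"`
}

// hashAnnotation is a made-up annotation key used only in this sketch.
const hashAnnotation = "example.com/template-hash"

// hashSpec serializes the spec to JSON and returns its SHA-256 digest as hex.
// encoding/json writes map keys in sorted order, so equal specs hash equally.
func hashSpec(spec templateSpec) (string, error) {
	raw, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

// isDrifted reports whether the hash recorded on the owner (for example a NodePool)
// differs from the hash recorded on the NodeClaim when it was created.
func isDrifted(ownerAnnotations, claimAnnotations map[string]string) bool {
	return ownerAnnotations[hashAnnotation] != claimAnnotations[hashAnnotation]
}
```

In this sketch, changing any field of the owner's template changes its hash, so a NodeClaim that still carries the old hash is reported as drifted.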