Add consolidationPolicy: Underweight #1829

Open
koreyGambill opened this issue Nov 19, 2024 · 6 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@koreyGambill

Description

What problem are you trying to solve?
We've created fallback on-demand NodePools with a lower scheduling weight than our spot instance NodePools (AWS). When spot instances are hard to find, Karpenter schedules onto our fallback on-demand EC2 instances, but it never consolidates back to spot instances, so it ends up being really expensive. I would love an official setting that allows Karpenter to consolidate based on weighted preferences rather than just utilization.
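
For readers following along, the setup described above might look roughly like the pair of NodePools below. This is only a sketch, not the author's actual configuration: the NodePool names, the weight values, the `default` EC2NodeClass, and the `consolidateAfter` value are assumptions, and it assumes the `karpenter.sh/v1` API. The `WhenEmpty` policy reflects the "Empty" policy the author mentions later in the thread.

```yaml
# Hypothetical preferred pool: spot capacity, higher weight so it is tried first.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-preferred          # assumed name
spec:
  weight: 100                   # assumed value; higher weight = higher priority
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default           # assumed EC2NodeClass
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m        # assumed value
---
# Hypothetical fallback pool: on-demand capacity, lower weight.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: on-demand-fallback      # assumed name
spec:
  weight: 10                    # assumed value; lower than the spot pool
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default           # assumed EC2NodeClass
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m        # assumed value
```

With both pools set to `WhenEmpty`, nothing in this configuration moves pods off a non-empty on-demand node, which is the gap this request is about.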

In this feature, if all the pods on a low-weight node are compatible with a higher-weight node, Karpenter should work to create the higher-weight node and re-schedule the pods. For us, it would help reduce costs, but in general it makes sense that users would care about using higher weighted nodes. I would expect this to still obey the consolidateAfter setting.

Something like this could work in the YAML:

disruption:
    # Changed to a list type for the purpose of clearer yaml now that there are 3 options
    consolidationPolicy: 
      # If Underutilized and Underweight are set, Karpenter will re-schedule
      # the node if some pods can be put on a higher weight node, and the 
      # rest could fit on other existing nodes of the same weight
      - Empty
      - Underutilized
      - Underweight  # This would allow consolidating if all pods could be put on a higher weight node

How important is this feature to you?
Low-Medium - I have a workaround (setting the on-demand NodePool to expire after 4 hours), but it has a couple of drawbacks.

  1. We are waiting up to 4 hours to get back to our optimal state
  2. We cannot re-use the NodePool for workloads that need a long lifespan. With this feature it would be possible since Karpenter wouldn't be able to reconsolidate the node if those pods were on it (due to taints/tolerations/affinities).
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
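
The expiry workaround mentioned above could be sketched as follows. This assumes the `karpenter.sh/v1` API, where `expireAfter` sits under `spec.template.spec` (in `v1beta1` it was under `spec.disruption`); the NodePool name, weight, and EC2NodeClass are hypothetical.

```yaml
# Hypothetical on-demand fallback pool whose nodes are replaced after 4 hours,
# giving Karpenter a chance to reschedule their pods onto the spot pool.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: on-demand-fallback      # assumed name
spec:
  weight: 10                    # assumed value; lower than the spot pool
  template:
    spec:
      expireAfter: 4h           # the 4-hour expiry from the workaround
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default           # assumed EC2NodeClass
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
```

Because expiry applies to every node in the pool, long-lived workloads cannot share it, which is the second drawback listed above.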
@koreyGambill koreyGambill added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 19, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 19, 2024
@jonathan-innis
Member

In general, this is the way that Karpenter should perform. Can you walk through the exact scenario that you are seeing? When Karpenter performs its simulations, it's going to consider the highest weight NodePool first (which should be your spot NodePool). Once it finds a scheduling decision, so long as the newer instance type is cheaper, it will consolidate it. All of this should be true in your scenario where you are moving back from on-demand to spot.

@jonathan-innis
Member

/triage needs-information

@k8s-ci-robot k8s-ci-robot added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 10, 2024
@koreyGambill
Author

The behavior we observed was that the lower weighted nodes (which were the on-demand nodes) would never be consolidated. Over time, we end up with more lower weighted nodes than higher weighted nodes since they don't experience spot terminations. So most of our nodes are actually the on-demand nodes which have a lower weight.

I wouldn't expect the consolidation to work as is, though. Our consolidationPolicy is set to "Empty". It would seem like a bug to me, then, if Karpenter were consolidating those nodes for budgetary reasons. We cannot use the consolidationPolicy "Underutilized" because it results in a massive turnover of nodes, which costs a ton in AWS Config charges.

I'm proposing a third reason to consolidate: a higher weighted node being available to schedule on. That seems distinctly different to me from "Underutilized", which should consolidate if the total CPU/memory requested could be reduced by shuffling pods and shutting down nodes.


This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 25, 2024
@mariuskimmina

mariuskimmina commented Dec 25, 2024

@koreyGambill I feel we have been running into a similar situation. What seems to work for us at the moment is to use WhenEmpty on the NodePool that has only spot instances and WhenUnderutilized on the NodePool that serves as the on-demand fallback. This way we avoid disruption if the workloads are running on spot but allow it if they are scheduled on on-demand.

That said, we have also been feeling that a consolidation policy is missing for the case of "I have both spot and on-demand instances in this NodePool and I only ever want to consolidate on-demand nodes back to spot instances".
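
The per-pool split described in this comment might look like the fragments below. This is a sketch under assumptions: the NodePool names and `consolidateAfter` values are hypothetical, the rest of each spec is omitted, and it assumes the `karpenter.sh/v1` API, where the utilization-based policy is spelled `WhenEmptyOrUnderutilized` (the `WhenUnderutilized` name used above is the `v1beta1` spelling).

```yaml
# Spot pool: only remove nodes once they are empty, so running workloads are left alone.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-only               # assumed name
spec:
  # template omitted for brevity
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 5m        # assumed value
---
# On-demand fallback pool: allow utilization-based consolidation,
# so pods can be moved off on-demand nodes.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: on-demand-fallback      # assumed name
spec:
  # template omitted for brevity
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m        # assumed value
```

This only works when spot and on-demand capacity live in separate NodePools; it does not cover the single mixed pool case raised in the comment.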

@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2024

github-actions bot commented Jan 9, 2025

This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 9, 2025
4 participants