WIP: docs: Add RFC for multi-node consolidation partitioning #1547

cnmcavoy · 2024-08-09T19:57:53Z

Fixes #853

Description
Add RFC for multi-node consolidation partitioning

How was this change tested?
N/A

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Signed-off-by: Cameron McAvoy <[email protected]>

k8s-ci-robot · 2024-08-09T19:58:00Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cnmcavoy
Once this PR has been reviewed and has the lgtm label, please assign njtran for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coveralls · 2024-08-09T20:10:14Z

Pull Request Test Coverage Report for Build 10887856678

Details

0 of 0 changed or added relevant lines in 0 files are covered.
6 unchanged lines in 2 files lost coverage.
Overall coverage decreased (-0.04%) to 80.714%

Files with Coverage Reduction	New Missed Lines	%
pkg/test/expectations/expectations.go	2	94.73%
pkg/controllers/disruption/consolidation.go	4	87.25%

Totals
Change from base Build 10838752554:	-0.04%
Covered Lines:	8387
Relevant Lines:	10391

💛 - Coveralls

github-actions · 2024-08-24T12:01:29Z

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

njtran · 2024-09-16T16:37:00Z

designs/multi-node-consolidation-partitioning.md

+
+Multi-node consolidation should be partitioned into multiple consolidation actions, each of which is responsible for consolidating nodes that are homogenous in terms of cpu architecture and nodepool. This will allow for more successful consolidation actions, and will reduce the number of nodes in the cluster more effectively.
+
+A naive, simple approach would be to have Karpenter create M bins for multi-node consolidation, based on cpu architecture and the nodepool of the node. Multi-node consolidation would sort the bins by the number of nodes contained, and attempt multi-node consolidation on the nodes within each bin, starting with the largest, descending. After any solution is found, return the solution.


I think the issue is more around how we order our nodes in consolidation, right?

We order our nodes by the estimated cost of disrupting the pods on each node, and then find a contiguous group of nodes, starting from index 0, that are compatible. If the candidates are ordered such that each alternating node has a different architecture, you'll never be able to get multi-node consolidation. How were you thinking this proposal would affect that problem?
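The ordering concern described above can be illustrated with a minimal sketch. The `Candidate` type, the `compatiblePrefix` function, and the use of "same architecture" as the compatibility test are all hypothetical simplifications for illustration; none of them are Karpenter's actual types or logic.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a hypothetical stand-in for a consolidation candidate;
// it is not Karpenter's actual type.
type Candidate struct {
	Name           string
	Arch           string
	DisruptionCost float64
}

// compatiblePrefix sorts candidates by ascending disruption cost and returns
// the longest run starting at index 0 whose members share the first
// candidate's architecture. "Same architecture" is an assumed placeholder
// for the real compatibility checks.
func compatiblePrefix(candidates []Candidate) []Candidate {
	sorted := append([]Candidate(nil), candidates...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].DisruptionCost < sorted[j].DisruptionCost
	})
	end := 0
	for end < len(sorted) && sorted[end].Arch == sorted[0].Arch {
		end++
	}
	return sorted[:end]
}

func main() {
	// Architectures alternate in cost order, so the prefix stops after one node.
	alternating := []Candidate{
		{"a", "amd64", 1}, {"b", "arm64", 2}, {"c", "amd64", 3}, {"d", "arm64", 4},
	}
	fmt.Println(len(compatiblePrefix(alternating)))
}
```

With costs that interleave the two architectures, the prefix never grows past a single node, which is the failure mode the comment describes.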

cnmcavoy · 2024-09-19

Does the estimated disruption cost ordering need to be changed? The disruption cost is independent of whether a node is compatible with another node for consolidation.

What I am proposing is to bin the nodes together with the nodes they are compatible with. This part is hand-wavy and a bit arbitrary; Karpenter has to decide where the boundaries are for which candidates should be consolidated together. If Karpenter picks bad boundaries (the current situation), then there are too many or too few bins, and consolidation fails. This is also where I am most interested in input: what are the boundaries for each bin such that its nodes are roughly compatible? Just CPU architecture? CPU architecture and node pool? Maybe the node's operating system should be a boundary.
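A rough sketch of the binning idea, with the boundary left pluggable since that is the open question. Everything here is hypothetical: the `Candidate` type, the `partition` function, and the two example key functions are illustrations, not Karpenter code.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a hypothetical stand-in for a consolidation candidate.
type Candidate struct {
	Name     string
	Arch     string
	NodePool string
}

// partition groups candidates by the string returned from key and returns
// the bins ordered largest first. Ties are broken by key so the result is
// deterministic.
func partition(candidates []Candidate, key func(Candidate) string) [][]Candidate {
	bins := map[string][]Candidate{}
	for _, c := range candidates {
		k := key(c)
		bins[k] = append(bins[k], c)
	}
	keys := make([]string, 0, len(bins))
	for k := range bins {
		keys = append(keys, k)
	}
	sort.Slice(keys, func(i, j int) bool {
		if len(bins[keys[i]]) != len(bins[keys[j]]) {
			return len(bins[keys[i]]) > len(bins[keys[j]])
		}
		return keys[i] < keys[j]
	})
	out := make([][]Candidate, 0, len(keys))
	for _, k := range keys {
		out = append(out, bins[k])
	}
	return out
}

// Two candidate boundaries from the discussion: architecture only, or
// architecture plus node pool.
func byArch(c Candidate) string        { return c.Arch }
func byArchAndPool(c Candidate) string { return c.Arch + "/" + c.NodePool }

func main() {
	nodes := []Candidate{
		{"n1", "amd64", "default"}, {"n2", "arm64", "default"},
		{"n3", "amd64", "default"}, {"n4", "amd64", "batch"},
	}
	fmt.Println(len(partition(nodes, byArch)), len(partition(nodes, byArchAndPool)))
}
```

Swapping the key function changes how many bins there are and how large each one is, which is the "too many or too few bins" trade-off in concrete form. An operating-system boundary would be one more field in the key.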

njtran · 2024-10-07

> What I am proposing is to bin the nodes together with the nodes they are compatible with.

The problem is more about compatibility between pods, right? I can see what you're saying: right now, every node is in its own bucket (no pre-defined ordering). We could create arbitrary buckets/groupings and use them as a heuristic for grouping nodes in our consolidation ordering. The closest generalization for partitioning would be to group all nodes from the same NodePool together, but I think even then you'd still hit the same multi-arch issue within a NodePool, so you probably want the grouping to be separate from the NodePool.

If we just make it on CPU architecture, you probably solve most of the issues here, but I need to think more about how this might work in practice.

github-actions · 2024-10-04T12:01:54Z

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

github-actions · 2024-10-22T12:02:18Z

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

docs: Add RFC for multi-node consolidation partitioning

4bbc57d

Signed-off-by: Cameron McAvoy <[email protected]>

k8s-ci-robot added the do-not-merge/work-in-progress and cncf-cla: yes labels Aug 9, 2024

k8s-ci-robot requested review from jackfrancis and tallaxes August 9, 2024 19:58

k8s-ci-robot added the size/S label Aug 9, 2024

github-actions bot added the lifecycle/stale label Aug 24, 2024

github-actions bot added the lifecycle/closed label Sep 8, 2024

github-actions bot closed this Sep 8, 2024

njtran reopened this Sep 16, 2024

njtran reviewed Sep 16, 2024

github-actions bot removed the lifecycle/closed and lifecycle/stale labels Sep 17, 2024

github-actions bot added lifecycle/stale and removed lifecycle/stale labels Oct 4, 2024

github-actions bot added the lifecycle/stale label Oct 22, 2024


njtran Sep 16, 2024

cnmcavoy Sep 19, 2024

njtran Oct 7, 2024
