SHIP-0039: Build Scheduler Options #205
Conversation
Provisional proposal to add API options to control where build pods are scheduled. These will rely on core Kubernetes features related to pod scheduling. This proposal was partially motivated by the multi-arch feature discussions, where it was revealed that we currently have no means of controlling where build pods are scheduled. While these features may support a future proof of concept for multi-arch builds, orchestrating multi-arch builds end to end is out of scope.
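As an illustration of the kind of core Kubernetes primitive this proposal builds on, a Build exposing a pod-level `nodeSelector` might look roughly like the sketch below. This is a hypothetical example, not the final API: the field placement and names are illustrative, and the SHIP itself defines the concrete types.

```yaml
apiVersion: shipwright.io/v1beta1
kind: Build
metadata:
  name: sample-build
spec:
  source:
    git:
      url: https://github.com/shipwright-io/sample-go
  strategy:
    name: buildah
    kind: ClusterBuildStrategy
  # Hypothetical scheduler option: constrain the build pod to amd64 nodes
  # using the well-known kubernetes.io/arch node label.
  nodeSelector:
    kubernetes.io/arch: amd64
```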
- Removed the goal to set pod scheduling options at the cluster level. This adds significant scope, since the build controller would need to support this configuration via a file or a large set of environment variables. This goal is not ruled out completely - I'm recommending it as a follow-up feature so the community can iterate and deliver quickly.
- Added support for custom schedulers - this feature has been in Kubernetes for quite some time, though its adoption is somewhat limited due to the challenges of writing a custom scheduler.

Signed-off-by: Adam Kaplan <[email protected]>
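For context, custom scheduler support in Kubernetes works through the core pod-level `schedulerName` field, which the build controller would set on the generated build pod. The scheduler name below is an assumption for illustration; it refers to whatever alternative scheduler an operator has deployed in the cluster.

```yaml
# Pod spec fragment: schedulerName hands this pod to a non-default
# scheduler, provided one with that name is running in the cluster.
apiVersion: v1
kind: Pod
metadata:
  name: example-build-pod
spec:
  schedulerName: my-custom-scheduler  # assumed custom scheduler name
  containers:
  - name: step
    image: registry.example.com/builder:latest  # placeholder image
```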
Affinity and anti-affinity form an incredibly rich and complex API in Kubernetes. Most of the other scheduler tuning APIs in this SHIP are simpler to understand and can be wired directly to the underlying APIs (Tekton today, potentially another implementation in the future). Exposing the entire affinity rule API would make it challenging for developers to use. We should instead strive for a simpler API that makes it easy for developers to express their intent. This abstraction can then be translated to the underlying scheduler APIs. Such an effort is not trivial and warrants a separate feature/SHIP.

Signed-off-by: Adam Kaplan <[email protected]>
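For a sense of that complexity, even a modest node-affinity rule in the raw Kubernetes API requires several levels of nesting (the label values here are arbitrary examples):

```yaml
affinity:
  nodeAffinity:
    # Hard requirement: only schedule on nodes with one of these architectures.
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/arch
          operator: In
          values:
          - arm64
          - amd64
    # Soft preference: prefer a particular zone if available.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-1a
```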
These changes complete all sections of the SHIP needed to reach the "Implementable" state. This includes some cleanup in the top summary, concrete implementations of the new API fields with their respective types, and justification for implementing our own Tolerations API.

As part of thinking through risks, mitigations, and drawbacks, I recommend that the Shipwright community publish a hardening guide alongside this feature's implementation. Exposing node selectors and tolerating taints can put system availability at risk, especially if builds are scheduled on control plane nodes. Kubernetes tries to prevent this by default through node labels and taints. There are some scenarios (ex: single node or small clusters) where scheduling builds on the control plane is desirable or necessary; publishing a hardening guide preserves flexibility while enabling informed choices for security.

The drawbacks and alternatives also call out potential future concerns related to multi-arch support in Shipwright. This is under active discussion, and SHIP-0039 may add additional complexity to any future design.

Signed-off-by: Adam Kaplan <[email protected]>
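To illustrate the risk: control plane nodes are typically protected by a well-known taint, and a build pod carrying a matching toleration becomes schedulable there. A minimal fragment:

```yaml
# Control plane nodes commonly carry this taint by default:
#   node-role.kubernetes.io/control-plane:NoSchedule
# A toleration like the following (however it is surfaced through a
# future Build API) would allow build pods onto those nodes:
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoSchedule
```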
@SaschaSchwarze0 @qu1queee updated proposal to reach the `Implementable` state.
/approve
> ## Open Questions [optional]
>
> - Should this be enabled always? Should we consider an alpha -> beta lifecycle for this feature? (ex: off by default -> on by default)
We have never done that on a per-feature basis so far. I know Tekton is doing it, and I am not really in support of that approach: it groups arbitrary independent features into one bucket (called alpha) that can only be enabled or disabled as a whole. I prefer the way Knative does it (https://knative.dev/docs/serving/configuration/feature-flags), on a per-feature basis - though we have never done that in Shipwright so far either.
Agree - I prefer Knative-style individual feature flags to "all or nothing." Is this worth a separate SHIP, or should we just discuss and implement it as a feature within builds?
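For reference, Knative keys each feature independently in a ConfigMap. A hypothetical Shipwright equivalent might look like the sketch below; the ConfigMap name, namespace, and feature keys are all invented for illustration, since no such mechanism exists in Shipwright today.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: shipwright-features   # hypothetical ConfigMap name
  namespace: shipwright-build # assumed controller namespace
data:
  # Each feature is toggled individually, Knative-style,
  # rather than through a single alpha/beta gate.
  build-scheduler-options: "enabled"
  some-other-feature: "disabled"
```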
> ### Implementation Notes
>
> #### API Updates
Wouldn't it also be valid for the build strategy author to define those? The author knows which platforms are supported by the build tool(s) used in the strategy. A first-class "supported architectures" property on the strategy would be very interesting here, also thinking forward towards multi-arch.
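A sketch of what such a first-class property might look like on a strategy; `supportedArchitectures` is a hypothetical field name, and the controller would presumably translate it into node scheduling constraints:

```yaml
apiVersion: shipwright.io/v1beta1
kind: ClusterBuildStrategy
metadata:
  name: buildah
spec:
  # Hypothetical field: architectures the strategy's build tools support.
  supportedArchitectures:
  - amd64
  - arm64
  steps:
  - name: build
    image: quay.io/containers/buildah:latest
```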
I think this is worth considering in the SHIP that (should) come out of our discussion in shipwright-io/build#1119. I'm less inclined to add `nodeSelector` and related APIs to the build strategy. As you identified (and I noted as a drawback), these scheduler APIs leak the internals of the cluster. Adding them to the build strategy could make it harder to share strategies across clusters.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SaschaSchwarze0

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
/lgtm
@ramessesii2: changing LGTM is restricted to collaborators. In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/lgtm