SHIP-0039: Build Scheduler Options #205
Conversation
Provisional proposal to add API options to control where build pods are scheduled. These will rely on core Kubernetes features related to pod scheduling. This proposal was partially motivated by the multi-arch feature discussions, where it was revealed that we currently have no means of controlling where build pods are scheduled. While these features may support a future proof of concept for multi-arch builds, orchestrating multi-arch builds end to end is out of scope.
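As an illustration of the kind of core Kubernetes primitive this proposal builds on, a Build exposing a pod-level `nodeSelector` might look roughly like the sketch below. This is a hypothetical example, not the final API: the field placement and names are illustrative, and the SHIP itself defines the concrete types.

```yaml
apiVersion: shipwright.io/v1beta1
kind: Build
metadata:
  name: sample-build
spec:
  source:
    git:
      url: https://github.com/shipwright-io/sample-go
  strategy:
    name: buildah
    kind: ClusterBuildStrategy
  # Hypothetical scheduler option: constrain the build pod to amd64 nodes
  # using the well-known kubernetes.io/arch node label.
  nodeSelector:
    kubernetes.io/arch: amd64
```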
- Removed the goal to set pod scheduling options at the cluster level. This adds significant scope, since the build controller would need to support this configuration via a file or a large set of environment variables. This goal is not ruled out completely - I'm recommending it as a follow-up feature so the community can iterate and deliver quickly.
- Added support for custom schedulers - this feature has been in Kubernetes for quite some time, though its adoption is somewhat limited due to the challenges of writing a custom scheduler.

Signed-off-by: Adam Kaplan <[email protected]>
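For context, custom scheduler support in Kubernetes works through the core pod-level `schedulerName` field, which the build controller would set on the generated build pod. The scheduler name below is an assumption for illustration; it refers to whatever alternative scheduler an operator has deployed in the cluster.

```yaml
# Pod spec fragment: schedulerName hands this pod to a non-default
# scheduler, provided one with that name is running in the cluster.
apiVersion: v1
kind: Pod
metadata:
  name: example-build-pod
spec:
  schedulerName: my-custom-scheduler  # assumed custom scheduler name
  containers:
  - name: step
    image: registry.example.com/builder:latest  # placeholder image
```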
Affinity and anti-affinity form an incredibly rich and complex API in Kubernetes. Most of the other scheduler tuning APIs in this SHIP are simpler to understand and can be wired directly to the underlying APIs (Tekton today, potentially another implementation in the future). Exposing the entire affinity rule API would make it challenging for developers to use. We should instead strive for a simpler API that makes it easy for developers to express their intent. This abstraction can then be translated to the underlying scheduler APIs. Such an effort is not trivial and warrants a separate feature/SHIP.

Signed-off-by: Adam Kaplan <[email protected]>
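For a sense of that complexity, even a modest node-affinity rule in the raw Kubernetes API requires several levels of nesting (the label values here are arbitrary examples):

```yaml
affinity:
  nodeAffinity:
    # Hard requirement: only schedule on nodes with one of these architectures.
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/arch
          operator: In
          values:
          - arm64
          - amd64
    # Soft preference: prefer a particular zone if available.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
          - us-east-1a
```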
These changes complete all sections of the SHIP needed to reach the "Implementable" state. This includes some cleanup in the top summary, concrete implementations of the new API fields with their respective types, and justification for implementing our own Tolerations API.

As part of thinking through risks, mitigations, and drawbacks, I recommend that the Shipwright community publish a hardening guide alongside this feature's implementation. Exposing node selectors and tolerating taints can put system availability at risk, especially if builds are scheduled on control plane nodes. Kubernetes tries to prevent this by default through node labels and taints. There are some scenarios (ex: single node or small clusters) where scheduling builds on the control plane is desirable or necessary; publishing a hardening guide preserves flexibility while enabling informed choices for security.

The drawbacks and alternatives also call out potential future concerns related to multi-arch support in Shipwright. This is under active discussion, and SHIP-0039 may add additional complexity to any future design.

Signed-off-by: Adam Kaplan <[email protected]>
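To illustrate the risk: control plane nodes are typically protected by a well-known taint, and a build pod carrying a matching toleration becomes schedulable there. A minimal fragment:

```yaml
# Control plane nodes commonly carry this taint by default:
#   node-role.kubernetes.io/control-plane:NoSchedule
# A toleration like the following (however it is surfaced through a
# future Build API) would allow build pods onto those nodes:
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoSchedule
```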
@SaschaSchwarze0 @qu1queee updated proposal to reach the `Implementable` state.
/approve
> ## Open Questions [optional]
>
> - Should this be enabled always? Should we consider an alpha -> beta lifecycle for this feature? (ex: off by default -> on by default)
We have never done that on a per-feature basis so far. I know Tekton is doing it, and I am not really in support of that approach: it groups arbitrary independent features into one bucket (called alpha) that can only be enabled or disabled as a whole. I prefer the way Knative does it (https://knative.dev/docs/serving/configuration/feature-flags), on a per-feature basis - though we have never done that in Shipwright so far either.
Agree - I prefer Knative-style individual feature flags to "all or nothing." Is this worth a separate SHIP, or should we just discuss and implement it as a feature within builds?
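For reference, Knative keys each feature independently in a ConfigMap. A hypothetical Shipwright equivalent might look like the sketch below; the ConfigMap name, namespace, and feature keys are all invented for illustration, since no such mechanism exists in Shipwright today.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: shipwright-features   # hypothetical ConfigMap name
  namespace: shipwright-build # assumed controller namespace
data:
  # Each feature is toggled individually, Knative-style,
  # rather than through a single alpha/beta gate.
  build-scheduler-options: "enabled"
  some-other-feature: "disabled"
```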
> ### Implementation Notes
>
> #### API Updates
Wouldn't it also be valid for the build strategy author to define those? The author knows which platforms are supported by the build tool(s) used in the strategy. A first-class "supported architectures" property on the strategy would be very interesting here, also thinking forward towards multi-arch.
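A sketch of what such a first-class property might look like on a strategy; `supportedArchitectures` is a hypothetical field name, and the controller would presumably translate it into node scheduling constraints:

```yaml
apiVersion: shipwright.io/v1beta1
kind: ClusterBuildStrategy
metadata:
  name: buildah
spec:
  # Hypothetical field: architectures the strategy's build tools support.
  supportedArchitectures:
  - amd64
  - arm64
  steps:
  - name: build
    image: quay.io/containers/buildah:latest
```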
I think this is worth considering in the SHIP that (should) come out of our discussion in shipwright-io/build#1119. I'm less inclined to add `nodeSelector` and related APIs to the build strategy. As you identified (and I noted as a drawback), these scheduler APIs leak the internals of the cluster. Adding them to the build strategy could make it harder to share strategies across clusters.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: SaschaSchwarze0

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
/lgtm
@ramessesii2: changing LGTM is restricted to collaborators. In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/lgtm