-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preemption admission check controller. KEP update. #1178
base: main
Are you sure you want to change the base?
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
/assign @alculquicondor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
||
At every run the controller will get the list of workloads pending preemption. | ||
Since for some of these workloads is not necessary to issue eviction at that given point (eg. Having a pending check that uses AfterCheckPassedOrOnDemand policy) their quota reservation will be ignored. | ||
Fore every other preemption pending workloads, it will check if it can fit without evicting other workloads, case in which the preemption admission check condition will be set to `Ready`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fore every other preemption pending workloads, it will check if it can fit without evicting other workloads, case in which the preemption admission check condition will be set to `Ready`. | |
For every other preemption pending workloads, it will check if it can fit without evicting other workloads, case in which the preemption admission check condition will be set to `Ready`. |
Preemption can run before other checks have finished their transition to Ready. By the time every check has transitioned to Ready, we still need to check if we need more preemptions.
So, in a sense, there is no concept of "preemption ready", we need to continue checking until the workload is fully admitted. And once it fits, and the checks are all "ready", we can directly transition to admitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The quota reserved by a workload is considered in use by the scheduler, if another workload gets is reservation after the fist one gets to "preemption ready" , that second workload will need to trigger it's own set of evictions and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(fore -> for done)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The quota reserved by a workload is considered in use by the scheduler, if another workload gets is reservation after the fist one gets to "preemption ready" , that second workload will need to trigger it's own set of evictions and so on.
Yes, that I understand.
Let me state my thinking again. There are scenarios where preemption needs to run before other admission checks have completed, because other AdmissionChecks can request preemptions.
In that case, we will issue preemptions, but even when completed, we might still need preemptions later on, when all the AdmissionChecks actually complete.
That's why I say that there might not be a concept of "preemptions are ready".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The preemption is finished when the workload can fit without the need of any additional evictions. Pass this point its quota is locked and can start execution, or wait more. The quote remains locked until this workload finishes or gets evicted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes me think that the existing QuotaReserved
condition will have different semantics. And the idea of QuotaReserved
could actually be used to signal that preemption is "ready".
But maybe that's a bit of over-engineering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... we could a yet another Quota specific condition like QuotaLocked
, to only be set when all the evictions are done, then change the Admission logic from QuoataReserved && AdmissionChecks==Ready
to QuoataLocked && AdmissionChecks==Ready
.
The preemption controller can then manage workloads that have QuoataReserved but not QuoataLocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would QuotaReserved be equivalent to "preemption ready" then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other way around, will have less impact on the code, The scheduler creates the admission struct and sets the QuotaReserved, the preemption controller will set the new condition when no evictions are needed. The workload controller will set the Admitted condition when the new condition is true and all Admission checks are ready. (sure we can add a fast path in the scheduler so if at reservation time the wl fits without the need of evictions set both conditions)
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: trasc The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any implications of this update to https://github.com/kubernetes-sigs/kueue/tree/main/keps/582-preempt-based-on-flavor-order ?
No, the changes done by that only have effect before quota reservation. |
I think one important question is whether the preemption controller needs to be an admission check. In some scenarios, the preemption check is running only once all the other admission checks have passed. This has the implication that now no other admission check can do the same, otherwise they will block each other. So that's an argument against it being an admission check. I'll give another pass to the proposal tomorrow and think about it more. |
- Set its admission check to `Ready` | ||
- Add it to the snapshot | ||
- If it cannot fit | ||
- Get an updated list of eviction candidates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please clarify which ones are candidates? Is it only workloads that are fully Admitted? I don't think you want to consider ones that only have ResourceQuota.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The candidates are picked from the list of workloads that hold a quota reservation, just because maybe a admission check of a workload is transitioning slowly to ready dose not mean that it should be able to hold its quota against a higher priority workload.
- If the updated list is empty, meaning the preemption cannot be done. | ||
- Set its admission check to `Retry`, the quota reservation will be lost and the workload placed in the queue waiting for a new QuotaReservation. | ||
|
||
**NOTE** The list of candidates is picked out from the list of workloads holding a QuotaReservation, regardless if they are fully Admitted or not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is QuotaReservation enough? Consider jobs waiting for ProvReq to be fulfilled could be waiting for hours. There is no point in preempting them yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @mwielgus
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ProvRequest Admission check can be configured with AfterCheckPassedOrOnDemand
policy, the evictions will be done only when the provisioning is finished (and only if still needed).
In the future , if we can know that the provisioning request has no chance to succeed for long period of times, the ProvRecAC can set its state to Retry this will fully release the quota.After that time the state can be set to Pending to put wl back in the queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question is more about the candidates for eviction. Obviously, fully Admitted workloads are candidates for eviction.
Other than that, I think it should be only the ones that have checks ready or are requesting preemption themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can be a source of priority inversion. Say we have wl1 (p=1, reserving, one check pending), wl2 (p=2, fully admited), wl3(p=3, needs to evict wl1 or wl2), We should not evict wl2 just because wl1 has some pending admission check.
@trasc: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What type of PR is this?
/kind documentation
What this PR does / why we need it:
Preemption admission check controller. KEP update.
Which issue(s) this PR fixes:
Fixes #1677
Special notes for your reviewer:
Does this PR introduce a user-facing change?