feat: Placement Group Support #5307
Conversation
NB: I still need to import and update the documentation, ensure that the AWSNodeTemplate object is updated as intended (since this case is a bit different from the OG PR), ensure all comments on the old PR are addressed, and do some practical testing. CI is passing.
Regarding https://github.com/aws/karpenter-provider-aws/pull/4553/files#r1372675680 I agree that we shouldn't use multiple selectors for determining which Placement Group to use. To go further, I think we should allow users to define the spec of a Placement Group whose lifecycle would be tied to that of the `EC2NodeClass`. The two configuration options could look like this:

```yaml
# define the placement group inline
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: example-one
spec:
  # ...
  placementGroup:
    spec:
      strategy: cluster # cluster | partition | spread
      spreadLevel: host # host | rack; only if strategy = spread
      partitionCount: 1 # 1-7; only if strategy = partition
---
# refer to a preexisting placement group
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: example-two
spec:
  # ...
  placementGroup:
    selector:
      matchLabels:
        karpenter.sh/placementgroup: "main"
      name: "arbitrary-placementgroup-name"
```
A related comment: #4553 (comment)
```go
	return err
}
if result != nil {
	nodeClass.Status.PlacementGroups = append(nodeClass.Status.PlacementGroups, *result.GroupArn)
}
```
This comment is relevant here, but its instructions for proceeding are unclear
https://github.com/aws/karpenter-provider-aws/pull/4553/files#r1405582171
I believe I have linked all relevant comments from the previous PR. I will proceed with work on the CRDs and documentation once outstanding questions are addressed. Thanks!
```go
type PlacementGroupSelectorTerm struct {
	// Name of the placement group to be selected
	// +optional
	Name string `json:"name,omitempty"`
}
```
I see that it's possible to use tags here as well. I wonder if we should also support tagging initially, similar to the other selectors.
One thing that's nice about tagging and naming is that it works consistently across accounts, so if you have similar set-ups for different accounts, everything just works out-of-the-box.
This is the general philosophy that Karpenter has taken towards selectors in the project, ensuring that we can move Karpenter from one account to another or from one region to another and everything "just works."
The trade-off of opening up this selector to tags is that now multiple placement groups can be returned, so how do you pick between them? There probably needs to be a consistent ordering and the selection between them shouldn't be random, particularly when it comes to the interaction of this feature with the drift feature.
```diff
@@ -63,6 +63,9 @@ type EC2NodeClassStatus struct {
 	// cluster under the AMI selectors.
 	// +optional
 	AMIs []AMI `json:"amis,omitempty"`
+	// PlacementGroups contains the ec2 placement group arns
+	// +optional
+	PlacementGroups []string `json:"placementGroups,omitempty"`
```
Should we resolve the strategy into the status block here? If the placement group type is `partition`, should we consider putting the number of partitions that we are using into the status as well?
```diff
@@ -167,6 +170,14 @@ type AMISelectorTerm struct {
 	Owner string `json:"owner,omitempty"`
 }

+// PlacementGroupSelectorTerm defines the selection logic for ec2 placement groups
+// that are used to launch nodes. If multiple fields are used for selection, the requirements are ANDed
+type PlacementGroupSelectorTerm struct {
```
There's a question here of how we should pass requirements down when using something like `cluster` placement groups OR when using something like `spread` with `rack`.

When using `spread` with `rack`, the EC2 documentation mentions that you can't launch more than 7 instances in a single AZ (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html#:~:text=A%20rack%20spread%20placement%20group%20supports%20a%20maximum%20of%20seven%20running%20instances%20per%20Availability%20Zone), which means that we are restricted in our node limits when using this type of placement group. One option here is that we could constrain our requirements when using this type of placement group so that when we run out of instances to launch in a single AZ, we remove that AZ from our requirements and start launching in other AZs. Another option is that we allow selecting on multiple placement groups, and if we run out of space with one of the `spread` placement groups we just move to the next one.

This is handled differently, though, when we are dealing with `cluster` or `partition` placement groups, where there is no limit on the number of instances that can be launched and it makes less sense to allow multiple placement groups to be selected on.
Based on all of these considerations, I think we should think about writing up a small design that considers the use-cases around placement groups, how they meld with requirements and what the API surface should look like.
Pull Request Test Coverage Report for Build 7180793423

Warning: This coverage report may be inaccurate. We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.

💛 - Coveralls
This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.
Fixes #3324
Description
Allows for the configuration of a Placement Group when defining an `AWSNodeTemplate` resource. Based on existing work by @preflightsiren on PR #4553.

How was this change tested?
Does this change impact docs?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.