-
Notifications
You must be signed in to change notification settings - Fork 960
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add networkInterfaces configuration to launchTemplate #4353
Conversation
✅ Deploy Preview for karpenter-docs-prod ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity. |
I've taken a first pass and then got side tracked by some other issues. I'll pull this down tomorrow and take it for a test drive. |
6cc4b49
to
c12aaaf
Compare
I've tested this out on my side and verified everything is updating properly with the LaunchTemplate, instance, and drift when changing between params within NetworkInterfaces! Nice! Glad we can get some functionality in so that we can build on this base! Remaining items are just to fix the DeviceIndex and Description passing to the NetworkInterfaces and that should also fix the test that is failing. |
great to hear 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
Snapshot successfully published to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
Snapshot successfully published to |
Looks like the network interfaces e2e are still failing:
|
5155663
to
a5558ac
Compare
Sorry about that; I did not manage to run the e2e tests myself because of a different environment setup. Previously I had to hack around it and introduce changes to Makefile and test runner, which soon became outdated with upstream changes. I hope the last commits have fixed it |
No worries! I'll kick it off! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
Snapshot successfully published to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
Snapshot successfully published to |
Ok, looks like the non-Integ e2e tests failed due to some transient infra issue. But the Integ tests now only have 2 failures, looking at those now:
|
9023362
to
4dfdbc1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
Snapshot successfully published to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
Snapshot successfully published to |
3009c77
to
8340b00
Compare
8340b00
to
44d2973
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/karpenter snapshot
Snapshot successfully published to |
Groups: lo.Map(options.SecurityGroups, func(s v1beta1.SecurityGroup, _ int) *string { return aws.String(s.ID) }), | ||
}, | ||
func (p *Provider) generateNetworkInterfaces(options *amifamily.LaunchTemplate) []*ec2.LaunchTemplateInstanceNetworkInterfaceSpecificationRequest { | ||
if len(options.NetworkInterfaces) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking we should drop this behavior now, looking at the quoted issue it looks like the desired long term solution was adding these explicit fields. Thoughts @jonathan-innis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see us marking this behavior as deprecated and then choosing to drop this behavior entirely at v1. We can add this to the v1 laundry list that is getting tracked in this issue: #4993
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider adding an example to the examples/v1beta1
directory as well when we add the interfaces to the API
AssociatePublicIPAddress *bool `json:"associatePublicIPAddress,omitempty"` | ||
|
||
// A description for the network interface. | ||
Description *string `json:"description,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How necessary is the description here? This feels ancillary and probably something that we can wait to add until users ask for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really necessary, its used in tests to grab the correct network interface but that can be done with the device index.
@@ -312,6 +315,18 @@ type BlockDevice struct { | |||
VolumeType *string `json:"volumeType,omitempty"` | |||
} | |||
|
|||
type NetworkInterface struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason that all of these values need to be pointers? I think an empty value is probably equal to not set? Typically, you need pointers when an empty value is still a valid value, but I'm not sure that that's the case for any of these fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll need to double check but I believe the zero value for AssociatePublicIPAddress
would not be an acceptable default. If it's not specified in the EC2NodeClass
we don't want to specify it in the launch template either that way we don't override the subnets behavior.
Groups: lo.Map(options.SecurityGroups, func(s v1beta1.SecurityGroup, _ int) *string { return aws.String(s.ID) }), | ||
}, | ||
func (p *Provider) generateNetworkInterfaces(options *amifamily.LaunchTemplate) []*ec2.LaunchTemplateInstanceNetworkInterfaceSpecificationRequest { | ||
if len(options.NetworkInterfaces) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could see us marking this behavior as deprecated and then choosing to drop this behavior entirely at v1. We can add this to the v1 laundry list that is getting tracked in this issue: #4993
env.EventuallyExpectHealthy(pod) | ||
env.ExpectCreatedNodeCount("==", 1) | ||
instance := env.GetInstance(pod.Spec.NodeName) | ||
for _, interfaceSpec := range interfaces { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we have a function that grabs the interfaces from the instance? Can we leverage this function rather than grabbing the instance and doing things with checking the fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take a look 👍
} | ||
} | ||
}, | ||
Entry("when a single interface is specified", &v1beta1.NetworkInterface{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each of these tests adds time. I like our bias for code coverage, but I think that we might be able to get away with testing the most complex scenario in the E2E test environment, condensing this down to a single test and then testing a bunch of different edge-case scenarios within the functional/unit testing
# optional, configures IMDS for the instance | ||
metadataOptions: | ||
httpEndpoint: enabled | ||
httpProtocolIPv6: disabled | ||
httpPutResponseHopLimit: 2 | ||
httpTokens: required | ||
|
||
# optional, configure network interfaces for the instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing the networkInterfaces
header
This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity. |
Removing |
Can you provide some update on the current thought? |
We're hesitant to expand the |
We might be the odd man out however we do leverage attaching multiple network interfaces to our instances outside of Public IP and EFA, we use a custom CNI that expects a secondary ENI. This PR solves this however we could get around this issue. One thing I am curious about Karpenter's position is that the AWS provider variant of Karpenter essentially replaces ASGs and Launch Templates in AWS for us however there is a feature disparity between ASGs and Karpenter node management. Its true we may not need to match the functionalities however in my opinion the bias should be towards closing the disparity rather than away. This isn't specific to network interface support but rather as a whole how does/ should Karpenter think about what features it should be keeping from the components it is replacing vs dropping. |
@garvinp-stripe are you able to elaborate about your use-case, preferably in an issue where it's easier to track? We're certainly not against bringing Karpenter closer to feature parity with launch templates, but we want to drive those decisions by user need rather than adding features because they are in launch templates. In this case, the problems outlined in the original issue either didn't need the full scope or were solved through other mechanisms. If you have a use-case for this we're more than happy to continue the discussion. |
ok, I can work on that. but I will do a separate PR. |
Sounds good! I'll go ahead and close this PR out then. |
Fixes #2026
Description
a reduced version of #3819,
Add a networkInterfaces struct to the AWSNodeTemplate CRD, and use the parameters to create AWS launch templates.
How was this change tested?
unit and integration tests were created; also tested manually on a test cluster
Does this change impact docs?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.