-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
webhook: automatically insert the rdma resource to pod by DRA claim #3670
Conversation
605d5b7
to
7663fa2
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3670 +/- ##
==========================================
- Coverage 81.68% 79.30% -2.39%
==========================================
Files 50 51 +1
Lines 4391 4494 +103
==========================================
- Hits 3587 3564 -23
- Misses 643 773 +130
+ Partials 161 157 -4
Flags with carried forward coverage won't be shown. Click here to find out more.
|
pkg/podmanager/pod_webhook.go
Outdated
ClientSet *kubernetes.Clientset | ||
} | ||
|
||
func New(enableDra bool, clientSet *kubernetes.Clientset) *PodWebhook { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个 pkg 下 有 NewPodManager , xxxx
因此, 这个 New 要 调整下,随意了点
NewPodWebHook ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
然后 作为 一个 完整的 pkg, 强烈 建议 把 新加逻辑 合入 PodManager interface , 而不是 新来一个 PodWebhook struct, 这样后续 可以最大的实现 相互之间的 代码复用
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个 podManager 以前是作为 get/list pod 的tools, 太多地方引用,比如agent。如果合入进来会对 agent 造成干扰
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
webhook 的逻辑应该只在 controller 里面
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这也是目前其他所有代码位置把manager 和 wehbook 分开的原因
pkg/podmanager/pod_webhook.go
Outdated
var _ webhook.CustomValidator = &PodWebhook{} | ||
|
||
// Default implements admission.CustomDefaulter. | ||
func (pw *PodWebhook) Default(ctx context.Context, obj runtime.Object) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
需要单位测试
pkg/podmanager/pod_webhook.go
Outdated
return "" | ||
} | ||
|
||
func (pw *PodWebhook) injectRdmaResourceToPod(resourceMap map[string]bool, pod *corev1.Pod) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个函数中的逻辑 是 你想的,还是 从其他地方借鉴的 有最佳实践的
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是我自己想的,没找到可借鉴的地方
pkg/podmanager/pod_webhook.go
Outdated
for resource, found := range resourceMap { | ||
if !found { | ||
// inject the resource to the pod.containers[0].resources.requests | ||
pod.Spec.Containers[0].Resources.Requests[corev1.ResourceName(resource)] = k8s_resource.MustParse("1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里缺少 一行重要的 info 级别注入日志
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
在调用函数之前已经log过了,这里不想在函数内部再次 log,对单元测试来说比较麻烦,而且会在 调用函数之后 show pod 的 yaml
没有 E2E ,至少 需要 把 E2E 设计 提到 doc |
3c361de
to
aad9b28
Compare
pkg/podmanager/pod_manager.go
Outdated
} | ||
|
||
// try to find the resource in container resources.requests | ||
if _, ok := c.Resources.Requests[corev1.ResourceName(resource)]; ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如果 pod 之前有 resource ,但是 values 已经不是 最新的了(不正确),这里应该要纠正 ?如果不纠正,需要有个日志
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不应该纠正,无法定义 pod.resource 是否正确,可能是用户有意为之
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如果是 用户改了 parameter 这边的定义呢 ?然后重启pod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
有道理,但这里不太好区分 是用户自己改了 parameter 还是手动改了 pod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里还要思考一下
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
用户是不会手动 改 pod,只会 改 deployment , 但是 判断 pod 和 deployment 的 resource 是否一致 , 又有绝大 代码成本,且控制器类型很多
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不管用户改 deploy 还是 pod,对于webhook 来说都是改 pod
397294b
to
2df3d2a
Compare
553cb0b
to
7f26f0b
Compare
pkg/podmanager/pod_manager.go
Outdated
return nil | ||
} | ||
|
||
func (pw *podManager) getStaticNicsFromSpiderClaimParameter(ctx context.Context, pod *corev1.Pod) ([]spiderpoolv2beta1.StaticNic, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个api 应该是在 spiderDra 之类的 目录下
pkg/podmanager/pod_manager.go
Outdated
return []spiderpoolv2beta1.StaticNic{}, nil | ||
} | ||
|
||
func (pw *podManager) getResourceMapFromStaticNics(ctx context.Context, staticNics []spiderpoolv2beta1.StaticNic) (map[string]bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个api 应该是在 spiderDra 之类的 目录下
pkg/podmanager/pod_manager.go
Outdated
} | ||
|
||
// resourceName return the resourceName for given spiderMultusConfig | ||
func (pw *podManager) resourceName(smc *spiderpoolv2beta1.SpiderMultusConfig) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个api 应该是在 spiderDra 之类的 目录下
pkg/podmanager/pod_manager.go
Outdated
return "" | ||
} | ||
|
||
func InjectRdmaResourceToPod(resourceMap map[string]bool, pod *corev1.Pod) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个 放到 本目录的 utils.go 比较合适
仍然在排查 webhook 流程问题:
|
the server could not find the requested resource ---- is spiderpool parameter CRD not registerred ? |
just go ahead next Monday |
No, all spiderpool CRD has been registered, this webhook is just for pods, the error is extraordinary |
pkg/podmanager/pod_manager.go
Outdated
@@ -238,3 +271,161 @@ func (pm *podManager) GetPodTopController(ctx context.Context, pod *corev1.Pod) | |||
UID: podOwner.UID, | |||
}, nil | |||
} | |||
|
|||
// Default implements admission.CustomDefaulter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest to add some comments above the function to indicate what the webhook will handle , it will be clear as the feature grows
// whebhook is for:
// 1. when dra is enabled, it insert RDMA resource name to the pod
// ....
7f26f0b
to
598b3b1
Compare
@weizhoublue 我建议这个webhook 只对带有某些 label 的 pod 生效(比如 resources.spidernet.io/webhook-resources: true),不然可能会影响其他 pod 甚至是 apiserver 的正常启动, 测试过程也发现了这个问题 |
598b3b1
to
15db81c
Compare
@@ -144,6 +144,40 @@ webhooks: | |||
- spidercoordinators | |||
sideEffects: None | |||
{{- end }} | |||
{{- if .Values.dra.enabled }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个 不需要 用户判断,由 controller 代码动态 patch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
什么情况下patch
5c4bcdc
to
cef2040
Compare
117ee64
to
d16287c
Compare
d16287c
to
f65c809
Compare
eb7a24a
to
31efbc2
Compare
Signed-off-by: cyclinder <[email protected]>
31efbc2
to
914374c
Compare
dup with #4239 |
Thanks for contributing!
What type of PR is this?
What this PR does / why we need it:
webhook: auto inject the rdma resource to pod.
pod.resourceClaims -> spiderclaimparameters.staticNics -> spidermultusconfig -> resourceName -> pod.resource.requests.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer: