add rdma device busId at scheduler #2250
base: main
Conversation
```diff
@@ -545,12 +546,43 @@ func (p *Plugin) preBindObject(ctx context.Context, cycleState *framework.CycleState
 		return nil
 	}
 
+	var deviceAllocs *apiext.DeviceAllocations
```
Can we consider moving this logic to the Reserve() phase?
ok
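Following up on the suggestion above, a minimal sketch of what moving the allocation from PreBind to Reserve could look like. All types here (`cycleState`, `deviceAllocation`) are simplified stand-ins rather than the real `framework.CycleState` and `apiext` types, the hard-coded allocation stands in for the actual device allocator, and a real implementation would also need an `Unreserve` to roll the allocation back on failure:

```go
package main

import (
	"errors"
	"fmt"
)

// cycleState is a hypothetical stand-in for framework.CycleState.
type cycleState map[string]interface{}

// deviceAllocation is a simplified stand-in for the real allocation type.
type deviceAllocation struct {
	Minor int32
	BusID string
}

const allocStateKey = "deviceAllocations"

// Reserve allocates devices for the pod and records the result in the cycle
// state, so that PreBind only has to read it back and patch the annotation.
func Reserve(state cycleState, nodeName string) error {
	// In the real plugin this would call the node device cache / allocator.
	allocs := []deviceAllocation{{Minor: 0, BusID: "0000:3b:00.0"}}
	state[allocStateKey] = allocs
	return nil
}

// PreBind reads the allocation made in Reserve instead of allocating itself.
func PreBind(state cycleState) ([]deviceAllocation, error) {
	v, ok := state[allocStateKey]
	if !ok {
		return nil, errors.New("no device allocation reserved")
	}
	return v.([]deviceAllocation), nil
}

func main() {
	state := cycleState{}
	_ = Reserve(state, "node-1")
	allocs, _ := PreBind(state)
	fmt.Println(allocs[0].BusID)
}
```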
```diff
@@ -89,6 +89,7 @@ type DeviceAllocation struct {
 	Minor     int32                      `json:"minor"`
 	Resources corev1.ResourceList        `json:"resources"`
 	Extension *DeviceAllocationExtension `json:"extension,omitempty"`
+	BusID     string                     `json:"busID,omitempty"`
```
I would prefer to define it like this:

```go
type DeviceAllocation struct {
	Minor     int32                      `json:"minor"`
	Resources corev1.ResourceList        `json:"resources"`
	Extension *DeviceAllocationExtension `json:"extension,omitempty"`
	Topology  *DeviceTopology            `json:"topology,omitempty"`
}

type DeviceTopology struct {
	// BusID is the domain:bus:device.function formatted identifier of a PCI/PCIe device.
	BusID string `json:"busID,omitempty"`
}
```
OK! BusID is one of the attributes of the topology structure, so it can be encapsulated in the topology structure.
Codecov Report
Attention: Patch coverage is

```diff
@@ Coverage Diff @@
##             main    #2250      +/-   ##
==========================================
- Coverage   66.29%   66.02%   -0.28%
==========================================
  Files         453      453
  Lines       53311    53411     +100
==========================================
- Hits        35344    35265      -79
- Misses      15419    15604     +185
+ Partials     2548     2542       -6
```
Signed-off-by: 208824 <[email protected]>
…oordinator-nic into koord-scheduler-rdma
Signed-off-by: 208824 <[email protected]>
Ⅰ. Describe what this PR does
The RDMA device is mounted into the container based on the scheduler's allocation result.
Ⅱ. Does this pull request fix one issue?
The scheduling framework already supports joint allocation of GPUs and RDMA devices, but several semantics, including preference and samePCIE, still need to be validated in practice. To allow RDMA devices to be passed into the container, this PR fixes the missing BDF addresses in the RDMA allocation results produced by the scheduling algorithm.
Ⅲ. Describe how to verify it
In a k8s cluster, prepare one or more servers with RDMA-capable NICs as cluster nodes. Install the new versions of the koordlet and koord-manager components and the revamped multus-cni on each node, then check the node status: the reported resource count should match the actual number of RDMA NICs on the node.
Write a pod.yaml that requests RDMA NIC resources and apply it with kubectl apply -f pod.yaml. In the pod's device-allocated annotation, check whether the RDMA allocation result includes the device's BDF address (the busID field) and verify that it is accurate.
Once the pod is in the Running state, enter the container and run the ifconfig command to check the number of network interfaces. If everything is correct, multiple interfaces such as net1, net2, net3, and so on will be listed.
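When checking the accuracy of the busID values above, a small helper like the following can validate the extended BDF (domain:bus:device.function) format described in the API comment. The regex and the function name are illustrative, not part of the PR:

```go
package main

import (
	"fmt"
	"regexp"
)

// bdfPattern matches the extended BDF form domain:bus:device.function,
// e.g. "0000:3b:00.0": 4 hex digits for the domain, 2 for the bus,
// 2 for the device, and a single octal digit for the function.
var bdfPattern = regexp.MustCompile(`^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$`)

// IsValidBusID reports whether s looks like a well-formed PCI BDF address.
func IsValidBusID(s string) bool {
	return bdfPattern.MatchString(s)
}

func main() {
	for _, s := range []string{"0000:3b:00.0", "3b:00.0", "0000:3b:00.8"} {
		fmt.Printf("%-14s valid=%v\n", s, IsValidBusID(s))
	}
}
```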
Ⅳ. Special notes for reviews
Deploy the components that support RDMA, including koordlet, koord-manager, koord-scheduler, and multus-cni.
Because this is a complete end-to-end passthrough, a multus-CNI plugin that complies with the CNI specification and supports multi-NIC allocation is also required. Assigning the RDMA NIC PF/VF to the Pod, as mentioned here, requires the device ID to be injected into that component; otherwise it will not work. This change will be maintained separately in the multus-cni project or in another PR.
Ⅴ. Checklist
make test