Hi, I want to run all_reduce_perf with P2P over PCIe on H20 GPUs. However, because the H20 is equipped with NVLink, NCCL's all_reduce_perf always transfers data over NVLink. How can I disable NVLink in the test so that P2P goes over PCIe?
I tried disabling NVLink with RMNvLinkEnable=0x0, but then NCCL's all_reduce_perf always falls back to SHM for communication.
To the best of my knowledge, there's no way for NCCL to disable just nvlink. The granularity of control is "P2P" or "no P2P".
What does nvidia-smi topo -m print after you use RMNvLinkEnable? Perhaps the GPUs are simply too far from each other on the PCIe bus? NCCL will typically not attempt P2P if devices are any further from each other than PXB.
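To make the PXB point concrete, here is a rough Python sketch that reads a GPU matrix in the style of nvidia-smi topo -m and lists the GPU pairs whose connection is farther than PXB. The sample matrix is made up for illustration and is not from an H20 system, the helper names are hypothetical, and the nearest-to-farthest ordering is an assumption based on the legend nvidia-smi prints. NCCL's real decision depends on more than this label, so treat the result as a hint only.

```python
# Link labels as printed by `nvidia-smi topo -m`, ordered nearest to farthest.
# Assumption for this sketch: the ordering follows the nvidia-smi legend.
DISTANCE_RANK = {"NV": 0, "PIX": 1, "PXB": 2, "PHB": 3, "NODE": 4, "SYS": 5}


def link_rank(label):
    """Map a matrix cell such as 'NV18', 'PIX' or 'SYS' to its distance rank."""
    key = "NV" if label.startswith("NV") else label
    return DISTANCE_RANK[key]


def parse_gpu_matrix(text):
    """Return {(row_gpu, col_gpu): label} for the GPU-to-GPU cells only."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    # GPU columns come first in the header; NIC and CPU columns are ignored.
    header = [name for name in rows[0] if name.startswith("GPU")]
    links = {}
    for row in rows[1:]:
        if not row[0].startswith("GPU"):
            continue  # skip NIC rows and the legend
        for col, label in zip(header, row[1:]):
            if label != "X":  # 'X' marks the GPU itself
                links[(row[0], col)] = label
    return links


def pairs_beyond(links, limit="PXB"):
    """List each GPU pair once if its link is farther than `limit`."""
    threshold = link_rank(limit)
    return sorted(
        (a, b, label)
        for (a, b), label in links.items()
        if a < b and link_rank(label) > threshold
    )


# Hypothetical topology, made up for illustration only.
SAMPLE = """
        GPU0    GPU1    GPU2    GPU3
GPU0     X      PIX     NODE    SYS
GPU1    PIX      X      NODE    SYS
GPU2    NODE    NODE     X      PXB
GPU3    SYS     SYS     PXB      X
"""

if __name__ == "__main__":
    for a, b, label in pairs_beyond(parse_gpu_matrix(SAMPLE)):
        print(f"{a} <-> {b}: {label} (farther than PXB)")
```

To check a real machine, paste the output of nvidia-smi topo -m captured after setting RMNvLinkEnable in place of SAMPLE. If every pair shows up as farther than PXB, that would be consistent with NCCL falling back to SHM.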