Fixes Infiniband functionality in NetworkManager #25
Open: tessalatedSapote wants to merge 1 commit into clearlinux-pkgs:main from tessalatedSapote:main
tessalatedSapote force-pushed the main branch from 327729f to 31c7ec9 (November 10, 2022 07:24)
clrpackages pushed a commit that referenced this pull request (Nov 22, 2022):
Enable IP-over-Infiniband connections to be managed by NetworkManager
See: clearlinux/distribution#2770, #25 (comment)
github-actions bot pushed a commit to vicamo/clearlinux-pkgs_linux that referenced this pull request (Jun 21, 2023):

[ Upstream commit 37c3b9fa7ccf5caad6d87ba4d42bf00be46be1cf ]

The cited commit adds a completion to remove dependency on rtnl lock. But it causes a deadlock for multiple encapsulations:

    crash> bt ffff8aece8a64000
    PID: 1514557 TASK: ffff8aece8a64000 CPU: 3 COMMAND: "tc"
     #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45
     #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418
     #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898
     #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8
     #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb
     #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core]
     #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core]
     #7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core]
     #8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core]
     #9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core]
    #10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core]
    #11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core]
    #12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8
    #13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower]
    #14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower]
    #15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047
    #16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31
    #17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853
    #18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835
    #19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27
    #20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245
    #21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482
    #22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a
    #23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2
    #24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2
    #25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f
    #26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8
    #27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c

    crash> bt 0xffff8aeb07544000
    PID: 1110766 TASK: ffff8aeb07544000 CPU: 0 COMMAND: "kworker/u20:9"
     #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45
     #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418
     #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88
     #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b
     #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core]
     #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core]
     #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core]
     #7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c
     #8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012
     #9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d
    #10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f

After the first encap is attached, flow will be added to encap entry's flows list. If neigh update is running at this time, the following encaps of the flow can't hold the encap_tbl_lock and sleep. If neigh update thread is waiting for that flow's init_done, deadlock happens.

Fix it by holding lock outside of the for loop. If neigh update is running, prevent encap flows from offloading. Since the lock is held outside of the for loop, concurrent creation of encap entries is not allowed. So remove unnecessary wait_for_completion call for res_ready.

Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update")
Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Fixes issues with using Infiniband in NetworkManager that were previously mentioned, in duplicate, in the distribution GitHub issues and on the forum. More broadly, this enables Infiniband's connected mode for IPoIB and allows it to be configured through NetworkManager. These IPoIB module configs are standard in RHEL and its derivatives, Debian, Arch, and others.
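For anyone who wants to try connected mode once this lands, here is a rough sketch of what a NetworkManager keyfile profile for an IPoIB interface could look like, generated from Python. This is not part of the change in this PR. The interface name `ib0`, the profile id, the helper name `ipoib_keyfile`, and the 65520 MTU are assumptions for illustration; `transport-mode` and `mtu` are the properties of NetworkManager's `infiniband` setting as I understand them, so check them against your NetworkManager version.

```python
import configparser
import io


def ipoib_keyfile(ifname: str = "ib0", mode: str = "connected", mtu: int = 65520) -> str:
    """Return the text of a NetworkManager keyfile for an IPoIB profile.

    Hypothetical helper: the interface name, profile id and MTU defaults
    are illustrative assumptions, not values taken from this PR.
    """
    if mode not in ("datagram", "connected"):
        raise ValueError(f"unsupported IPoIB transport mode: {mode!r}")

    cp = configparser.ConfigParser()
    cp.optionxform = str  # keep option names exactly as written

    cp["connection"] = {
        "id": f"ipoib-{ifname}",
        "type": "infiniband",
        "interface-name": ifname,
    }
    cp["infiniband"] = {
        "transport-mode": mode,
        "mtu": str(mtu),
    }
    cp["ipv4"] = {"method": "auto"}

    buf = io.StringIO()
    # Write "key=value" without spaces around the delimiter.
    cp.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    print(ipoib_keyfile())
```

The output would normally be saved with root-only permissions under NetworkManager's system-connections directory and then loaded with a connection reload; the exact path depends on how the distribution packages NetworkManager.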