using rwsem lock as inode_lock to avoid deadlock #13736

Donny9 · 2024-09-29T13:50:08Z

Summary

Use an rwsem lock as inode_lock to avoid deadlock.
Deadlock example:
When df -h runs on Core A to view mount information, it traverses the inode tree and therefore holds the inode_lock. Because a mount point's inode may belong to rpmsgfs, the traversal fetches statfs information from another core, Core B.
Meanwhile, rcS on Core B needs file information from Core A, which it obtains by fetching stat information through rpmsgfs. When this request arrives at Core A, the rptun ap task that handles it needs the inode_lock, which the nsh task still holds while waiting for Core B's reply, so the two tasks on Core A can deadlock.
Both paths only read the inode tree, so a reader-writer lock lets them proceed concurrently and prevents the deadlock.
Simplify the inode_lock code and remove the return value check.
Support recursive write for the same process in the sem_rw lock.

Impact

Bug fix

Testing

Vela

nuttxpr · 2024-09-29T14:05:00Z

[Experimental Bot, please feedback here]

Let's analyze if this Pull Request (PR) description fulfills the NuttX requirements:

Strengths:

Addresses a Deadlock: The PR tackles a critical issue—a deadlock scenario. This is clearly a significant bug fix, directly enhancing the stability of NuttX.
Provides a Root Cause Explanation: The description explains the deadlock's origin, involving interactions between inode traversal, rpmsgfs, and Core A/B communication. This clarity is valuable for reviewers.
Solution Rationale: Justifies the use of a reader-writer (rwsem) lock as a solution. It correctly points out that the deadlock arises from concurrent read operations, making rwsem a suitable mitigation.
Mentions Additional Changes: Briefly notes code simplification (removing return value checks) and recursive write support. While needing more detail, it shows awareness of broader impacts.

Weaknesses:

Impact Section Lacking: The "Impact" section is overly brief. It correctly identifies the change as a bug fix but needs to elaborate on the potential ramifications as the PR description template requests.
Testing Details Insufficient: "Testing: Vela" is insufficient. The requirements demand:
- Specifics about the testing environment: Which build hosts (OS, CPU, compiler) and targets (architecture, board, configuration) were used?
- Testing logs: Provide "before" and "after" logs to demonstrate the deadlock's presence initially and its resolution after the change.
Code Simplification and Recursion Under-explained: The description mentions simplifying inode_lock code and adding recursive write support but lacks detail. These changes deserve elaboration:
- Code Simplification: What specific code was simplified? Why was the return value check unnecessary?
- Recursive Writes: Why was this support added? Does it introduce any potential risks?

Verdict & Recommendations:

This PR description is a good starting point but does not fully meet NuttX requirements.

Here's how to improve it:

Expand the Impact Section: Address all the points outlined in the template, even if the answer is "NO" (with a brief explanation). For example, "Impact on build (will build process change)? NO – The changes are localized to locking mechanisms and do not affect the build process."
Provide Thorough Testing Information:
- List all build host and target environments used for validation.
- Include relevant "before" and "after" testing logs. These logs are crucial for demonstrating the deadlock and its successful resolution.
Explain Code Simplification: Briefly describe the removed code and why its return value check was redundant.
Justify Recursive Write Support: Explain the rationale behind adding this feature and address any potential downsides or complexities it might introduce.

By addressing these points, you'll create a more robust and informative PR description, increasing the likelihood of a smoother review and acceptance.

Signed-off-by: dongjiuzhu1 <[email protected]>

fs/vfs/fs_pseudofile.c

fs/shm/shmfs.c

xiaoxiang781216 · 2024-10-01T04:02:05Z

fs/inode/fs_inodefind.c

-      return ret;
-    }
-
+  inode_lock();


inode_rlock

In some scenarios, inode_find and inode_remove are used in combination, such as in shm_open and shm_unlink. The remove operation requires a write lock. If we were to switch to a read lock in inode_find, then a lock would need to be added in inode_remove, and additional checks would be required elsewhere.

Let's create a patch that changes rwsem so that an rlock is promoted to a wlock if the same thread already holds the wlock.
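
One way to picture the behaviour discussed in this thread (recursive write by the same owner, and a read request from the current write holder being counted as another write hold) is the following sketch. It is an assumption-laden illustration, not the patch itself: struct rrw_lock and the rrw_* functions are hypothetical names built on pthread primitives, ownership is tracked per thread, and the sketch has no writer preference and does not handle a reader upgrading to a writer.

```c
#include <pthread.h>

/* Hypothetical lock type for illustration; not the NuttX rwsem type. */

struct rrw_lock
{
  pthread_mutex_t guard;    /* Protects the fields below */
  pthread_cond_t  changed;  /* Signalled when the lock may have become free */
  int             readers;  /* Number of active readers */
  int             depth;    /* Nesting depth held by the writer */
  pthread_t       writer;   /* Valid only while depth > 0 */
};

void rrw_init(struct rrw_lock *lk)
{
  pthread_mutex_init(&lk->guard, NULL);
  pthread_cond_init(&lk->changed, NULL);
  lk->readers = 0;
  lk->depth   = 0;
}

/* True if the caller owns the write side.  Call with guard held. */

static int rrw_is_owner(struct rrw_lock *lk)
{
  return lk->depth > 0 && pthread_equal(lk->writer, pthread_self());
}

void rrw_wlock(struct rrw_lock *lk)
{
  pthread_mutex_lock(&lk->guard);
  if (!rrw_is_owner(lk))
    {
      /* Not the owner: wait until there are no readers and no writer. */

      while (lk->readers > 0 || lk->depth > 0)
        {
          pthread_cond_wait(&lk->changed, &lk->guard);
        }

      lk->writer = pthread_self();
    }

  lk->depth++;  /* First hold, or a recursive hold by the owner */
  pthread_mutex_unlock(&lk->guard);
}

void rrw_wunlock(struct rrw_lock *lk)
{
  pthread_mutex_lock(&lk->guard);
  if (--lk->depth == 0)
    {
      pthread_cond_broadcast(&lk->changed);
    }

  pthread_mutex_unlock(&lk->guard);
}

void rrw_rlock(struct rrw_lock *lk)
{
  pthread_mutex_lock(&lk->guard);
  if (rrw_is_owner(lk))
    {
      lk->depth++;  /* Promoted: counted as another write hold */
    }
  else
    {
      while (lk->depth > 0)
        {
          pthread_cond_wait(&lk->changed, &lk->guard);
        }

      lk->readers++;
    }

  pthread_mutex_unlock(&lk->guard);
}

void rrw_runlock(struct rrw_lock *lk)
{
  pthread_mutex_lock(&lk->guard);
  if (rrw_is_owner(lk))
    {
      if (--lk->depth == 0)
        {
          pthread_cond_broadcast(&lk->changed);
        }
    }
  else if (--lk->readers == 0)
    {
      pthread_cond_broadcast(&lk->changed);
    }

  pthread_mutex_unlock(&lk->guard);
}
```

Under this scheme a caller that already holds the write lock, as in the inode_find plus inode_remove combination mentioned above, can take a nested read or write lock without blocking on itself, while unrelated readers still share the lock when no writer is active.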

fs/inode/fs_inodeaddref.c

Example: When executing "df -h" on Core A to view mount information, this process will traverse inode nodes, thereby holding the inode_lock. Since the inode type of the mount point may be rpmsgfs, it will fetch statfs information from another Core B. Meanwhile, rcS on Core B needs to obtain file information from Core A, which will be achieved by fetching stat information through rpmsgfs. When this message arrives at Core A, a deadlock can occur between Core A's rptun ap and nsh task. However, both of these places involve read operations only, thus a reader-writer lock can be utilized to prevent such a deadlock.

Signed-off-by: dongjiuzhu1 <[email protected]>

github-actions bot added the Area: File System, Area: OS Components, and Size: M labels Sep 29, 2024

Donny9 force-pushed the inode_rw branch from 4878d21 to adf7dd3 September 29, 2024 14:00

Donny9 force-pushed the inode_rw branch from adf7dd3 to 6c4d366 September 30, 2024 06:22

sched/semaphore: add sem_rw source file to CMakeLists

2f7383d

Signed-off-by: dongjiuzhu1 <[email protected]>

Donny9 force-pushed the inode_rw branch from 6c4d366 to 661a393 September 30, 2024 09:17

sched/semaphore: support recursive write for same process in sem_rw lock

947ee41

Signed-off-by: dongjiuzhu1 <[email protected]>

Donny9 force-pushed the inode_rw branch from 661a393 to 26ec205 September 30, 2024 13:38

xiaoxiang781216 reviewed Oct 1, 2024


Donny9 added 2 commits October 1, 2024 22:17

fs/inode: remove unnecessary return value for inode_addrefs

792ff4d

Signed-off-by: dongjiuzhu1 <[email protected]>

Donny9 force-pushed the inode_rw branch from 26ec205 to 792ff4d October 1, 2024 14:38

xiaoxiang781216 merged commit b2e69b8 into apache:master Oct 1, 2024
29 checks passed
