helper and corelens module to show unsubmitted pending works. #140

Merged
merged 2 commits into from
Jan 15, 2025


imran-kn
Contributor

Add helper and corelens module to show unsubmitted pending works.

delayed_work(s) get their pending bit set, but are actually submitted to a
workqueue only upon expiration of the corresponding timer(s).
Recently we found some cases where a delayed work submitted to an already
offlined CPU was never executed, because the underlying timers were not
firing in the first place. Since the pending bit was set, this gave the
impression that the work item had been lost by the workqueue subsystem (which
was not the case here).

Add a helper and a corelens module to dump delayed_work(s) whose timer has
not yet expired. This is mainly of interest for offline CPUs, because ideally
we should not see any delayed_work timer lying on an offlined CPU. So by default
the helper and corelens module dump this info for offlined CPUs only.
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Dec 25, 2024
Member

@biger410 biger410 left a comment


The helper looks great. One question about the function name, why is it marked as "UNKNOWN"?

@imran-kn
Contributor Author

imran-kn commented Jan 8, 2025

> The helper looks great. One question about the function name, why is it marked as "UNKNOWN"?

Thanks @biger410 for taking a look. "UNKNOWN" appears as the function name because I had not loaded the module symbols. We try prog.symbol to get the function name, and if prog.symbol raises LookupError, we put "UNKNOWN" in the function name.
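The fallback described in this reply can be sketched as follows. This is a minimal illustration, not the exact code in drgn_tools/workqueue.py: `describe_func` and `FakeProg` are hypothetical names, and `FakeProg` only imitates the one behaviour of `drgn.Program.symbol()` that matters here, namely raising `LookupError` when no symbol covers the address. The `name: address` output format for resolved names is an assumption based on the "UNKNOWN" lines shown in this PR.

```python
from typing import Any, Dict


def describe_func(prog: Any, addr: int) -> str:
    """Return "<name>: <addr>" for a work function, or "UNKNOWN: <addr>".

    prog is expected to behave like drgn.Program: prog.symbol(addr) returns
    an object with a .name attribute, or raises LookupError when the address
    cannot be resolved (for example, module symbols are not loaded).
    """
    try:
        name = prog.symbol(addr).name
    except LookupError:
        name = "UNKNOWN"
    return f"{name}: {addr:#x}"


class _FakeSymbol:
    """Hypothetical stand-in for drgn.Symbol, exposing only .name."""

    def __init__(self, name: str) -> None:
        self.name = name


class FakeProg:
    """Hypothetical stand-in for drgn.Program, for demonstration only."""

    def __init__(self, symbols: Dict[int, str]) -> None:
        self._symbols = symbols

    def symbol(self, addr: int) -> _FakeSymbol:
        if addr not in self._symbols:
            raise LookupError(f"no symbol containing address {addr:#x}")
        return _FakeSymbol(self._symbols[addr])
```

With an empty symbol table, `describe_func(FakeProg({}), 0xffffffffc0327000)` yields `UNKNOWN: 0xffffffffc0327000`, matching the shape of the output in this PR; once module symbols are loaded, the same address would resolve to a real function name.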

drgn_tools/workqueue.py
class UnsubmittedPendingWorkModule(CorelensModule):
    """Show pending but unsubmitted works"""

    name = "unsubmitted_pending_works"
Member

@biger410 biger410 Jan 8, 2025


This corelens module only covers unexpired delayed_work items on offlined CPUs, so how about naming it "offlined_delayed_works"?
Also, please enhance its docstring so that users know it is for delayed_work on an offline CPU, which will never expire until the CPU is online again. Maybe remove "pending" and "unsubmitted" from the string, since users may not know what they mean.

Contributor Author


I have made these changes. Could you please take a look and see whether the docstring looks good now?

… works.

delayed_work(s) get their pending bit set, but are actually submitted to a
workqueue only upon expiration of the corresponding timer(s).
Recently we found some cases where a delayed work submitted to an already
offlined CPU was never executed, because the underlying timers were not
firing in the first place. Since the pending bit was set, this gave the
impression that the work item had been lost by the workqueue subsystem (which
was not the case here).

Add a helper and a corelens module to dump delayed_work(s) whose timer has
not yet expired. This is mainly of interest for offline CPUs, because ideally
we should not see any delayed_work timer lying on an offlined CPU. So by default
the helper and corelens module dump this info for offlined CPUs only, as shown
in the snippet below:

python3 -m drgn_tools.corelens vmcore -d ~/v5.4/ -M unsubmitted_pending_works
CPU: 4 state: offline
timer: ffff8ce6bd7b3a40 tte(jiffies): 289126 work: ffff8ce6bd7b3a20 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b39e0 tte(jiffies): 289125 work: ffff8ce6bd7b39c0 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b3980 tte(jiffies): 289125 work: ffff8ce6bd7b3960 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b3920 tte(jiffies): 289125 work: ffff8ce6bd7b3900 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b38c0 tte(jiffies): 289124 work: ffff8ce6bd7b38a0 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b3860 tte(jiffies): 289124 work: ffff8ce6bd7b3840 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b3800 tte(jiffies): 289124 work: ffff8ce6bd7b37e0 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b37a0 tte(jiffies): 289124 work: ffff8ce6bd7b3780 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b3740 tte(jiffies): 289124 work: ffff8ce6bd7b3720 func: UNKNOWN: 0xffffffffc0327000
timer: ffff8ce6bd7b36e0 tte(jiffies): 289124 work: ffff8ce6bd7b36c0 func: UNKNOWN: 0xffffffffc0327000

Signed-off-by: Imran Khan <[email protected]>
@biger410 biger410 merged commit 38b3777 into main Jan 15, 2025
4 checks passed