helper and corelens module to show unsubmitted pending works. #140
The helper looks great. One question about the function name: why is it marked as "UNKNOWN"?
Thanks @biger410 for taking a look. "UNKNOWN" appears as the function name because I had not loaded the module symbols. We try prog.symbol to get the function name, and if prog.symbol raises LookupError, we put "UNKNOWN" in the function name.
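The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `function_name` and `FakeProgram` are hypothetical names, and `FakeProgram` only stands in for a `drgn.Program` so the snippet runs without a vmcore. The only drgn behavior assumed is that `prog.symbol(address)` returns an object with a `name` attribute and raises `LookupError` when no symbol covers the address.

```python
from types import SimpleNamespace


def function_name(prog, address):
    """Return the symbol name for address, or "UNKNOWN" if lookup fails."""
    try:
        return prog.symbol(address).name
    except LookupError:
        # Typically happens when the owning module's symbols are not loaded.
        return "UNKNOWN"


class FakeProgram:
    """Hypothetical stand-in for drgn.Program, for illustration only."""

    def __init__(self, symbols):
        self._symbols = symbols  # maps address -> object with a .name

    def symbol(self, address):
        if address not in self._symbols:
            raise LookupError(f"no symbol at {address:#x}")
        return self._symbols[address]


# Made-up symbol table: one known address, everything else unknown.
prog = FakeProgram({0xFFFFFFFF81000000: SimpleNamespace(name="known_func")})
print(f"func: {function_name(prog, 0xFFFFFFFF81000000)}")
print(f"func: {function_name(prog, 0xFFFFFFFFC0327000)}: {0xFFFFFFFFC0327000:#x}")
```

With module debuginfo loaded, the second lookup would presumably resolve to the real work function instead of falling back.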
drgn_tools/workqueue.py
class UnsubmittedPendingWorkModule(CorelensModule):
    """Show pending but unsubmitted works"""

    name = "unsubmitted_pending_works"
This corelens module is only for unexpired delayed_work items on an offlined CPU. How about naming it "offlined_delayed_works"?
Also, please enhance its docstring so that users know this is for delayed_work on an offline CPU, which will never expire until the CPU is online again. Maybe remove "pending" and "unsubmitted" from the string, since users may not know what they mean.
I have made these changes. Could you please have a look to see if the docstring looks good now?
… works.

A delayed_work gets its pending bit set when it is queued, but it is actually
submitted to a workqueue only upon expiration of its timer.

Recently we have found some cases where a delayed work submitted to an already
offlined CPU was never executed, because the underlying timers were not firing
in the first place. Since the pending bit was set, this gave the impression
that the work item was lost to the workqueue subsystem (which was not the case
here).

Add a helper and a corelens module to dump delayed_work(s) whose timer has not
yet expired. This is mainly of interest for offline CPUs, because ideally we
should not see any delayed_work timer lying on an offlined CPU. So by default
the helper and corelens module dump this info for offlined CPUs only, as shown
in the snippet below:

    python3 -m drgn_tools.corelens vmcore -d ~/v5.4/ -M unsubmitted_pending_works
    CPU: 4 state: offline
    timer: ffff8ce6bd7b3a40 tte(jiffies): 289126 work: ffff8ce6bd7b3a20 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b39e0 tte(jiffies): 289125 work: ffff8ce6bd7b39c0 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b3980 tte(jiffies): 289125 work: ffff8ce6bd7b3960 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b3920 tte(jiffies): 289125 work: ffff8ce6bd7b3900 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b38c0 tte(jiffies): 289124 work: ffff8ce6bd7b38a0 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b3860 tte(jiffies): 289124 work: ffff8ce6bd7b3840 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b3800 tte(jiffies): 289124 work: ffff8ce6bd7b37e0 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b37a0 tte(jiffies): 289124 work: ffff8ce6bd7b3780 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b3740 tte(jiffies): 289124 work: ffff8ce6bd7b3720 func: UNKNOWN: 0xffffffffc0327000
    timer: ffff8ce6bd7b36e0 tte(jiffies): 289124 work: ffff8ce6bd7b36c0 func: UNKNOWN: 0xffffffffc0327000

Signed-off-by: Imran Khan <[email protected]>
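The "offlined CPUs only by default" selection described in the commit message could look roughly like the sketch below. It is plain Python over already-extracted records, not the drgn-based helper from this PR: `PendingTimer`, `offlined_delayed_works` and `format_entry` are hypothetical names, the `expires` and `jiffies` values are made up, and computing tte as `expires - jiffies` is an assumption about how the "tte(jiffies)" column is derived.

```python
from dataclasses import dataclass


@dataclass
class PendingTimer:
    cpu: int      # CPU whose timer base holds the timer
    timer: int    # address of the timer_list
    work: int     # address of the enclosing delayed_work
    expires: int  # jiffies value at which the timer is due


def offlined_delayed_works(timers, online_cpus, jiffies, only_offline=True):
    """Group not-yet-fired delayed_work timers by CPU.

    With only_offline=True (the default), timers on online CPUs are skipped.
    Returns {cpu: [(PendingTimer, tte), ...]}.
    """
    by_cpu = {}
    for t in timers:
        if only_offline and t.cpu in online_cpus:
            continue
        tte = t.expires - jiffies  # assumed definition of time-to-expiry
        by_cpu.setdefault(t.cpu, []).append((t, tte))
    return by_cpu


def format_entry(t, tte):
    """Render one entry in a layout similar to the output shown above."""
    return f"timer: {t.timer:x} tte(jiffies): {tte} work: {t.work:x}"


# Illustrative input: CPU 4 is offline, CPU 0 is online.
timers = [
    PendingTimer(4, 0xFFFF8CE6BD7B3A40, 0xFFFF8CE6BD7B3A20, 1_289_126),
    PendingTimer(0, 0xFFFF8CE6BD600040, 0xFFFF8CE6BD600020, 1_000_500),
]
result = offlined_delayed_works(timers, online_cpus={0, 1, 2, 3}, jiffies=1_000_000)
for cpu, entries in result.items():
    print(f"CPU: {cpu} state: offline")
    for t, tte in entries:
        print(format_entry(t, tte))
```

Passing `only_offline=False` would list timers on every CPU, which is the non-default behaviour the commit message implies.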