redfishpower: add extra timeout debug information #154

Merged
merged 4 commits into from
Mar 1, 2024

@chu11 chu11 commented Mar 1, 2024

Problem: When powering on a target, it will sometimes enter the "PoweringOn" state but then regress to the "off" state. This is likely due to a hardware problem. It can be hard to diagnose when the only clue is that a power on has "timed out".

Add some additional output to help diagnose this situation.

Some cleanups are added on top.

chu11 added 2 commits March 1, 2024 12:33
Problem: In powerman.dev(5), the ranged scripts are misspelled
as "range".

Correct the typos.
Problem: Some text describing the selection of the status_all
vs status scripts is invalid.

Simply remove the offending text.

@garlick garlick left a comment

LGTM - how do these changes interact with powerman? I.e. how does the user get the useful information?

Comment on lines 374 to 376
/* strp - on, off, unknown
* realstrp - on, off, poweringon, poweringoff, unknown
*/

Update comment for changed variable names?

chu11 added 2 commits March 1, 2024 14:35
Problem: There are times it would be convenient to know the
actual redfish power status returned, rather than the internally
mapped "on", "off", or "unknown" power status.

Update parse_onoff() to optionally return additional redfish power status.
The additional redfish power status returns are "paused", "powering off",
and "powering on". Update variable names in callers of this function
to be consistent.
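
As a rough illustration of the idea in this commit message, the sketch below maps a Redfish `PowerState` string to the coarse "on", "off", or "unknown" status and optionally reports a finer-grained status through an out parameter. This is not the actual `parse_onoff()` from redfishpower: the function name `map_power_state`, its signature, and the choice to map transitional states to "unknown" are all assumptions made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT the real redfishpower parse_onoff().
 * Returns the coarse status ("on", "off", "unknown").  If detail is
 * non-NULL, it also receives a finer-grained status string.
 * Assumption: transitional and paused states map to "unknown". */
static const char *map_power_state(const char *redfish_state,
                                   const char **detail)
{
    const char *status = "unknown";
    const char *extra = NULL;

    if (redfish_state != NULL) {
        if (strcmp(redfish_state, "On") == 0)
            status = "on";
        else if (strcmp(redfish_state, "Off") == 0)
            status = "off";
        else if (strcmp(redfish_state, "PoweringOn") == 0)
            extra = "powering on";
        else if (strcmp(redfish_state, "PoweringOff") == 0)
            extra = "powering off";
        else if (strcmp(redfish_state, "Paused") == 0)
            extra = "paused";
    }

    /* The detail is optional so existing callers can pass NULL. */
    if (detail != NULL)
        *detail = (extra != NULL) ? extra : status;
    return status;
}
```

Making the second result an optional out parameter means callers that only care about on/off/unknown need no changes, while debug paths can ask for the extra detail.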
Problem: When powering on a target, it will sometimes enter the
"PoweringOn" state but regress back to the "off" state.  This is
likely due to a hardware problem.  It can be hard to diagnose when
the only clue is a power on has "timed out".

Add some additional output to help diagnose this situation.
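
One way to picture the extra output described above is a small tracker that remembers whether "powering on" was ever observed while waiting, so that a timeout message can say more than "timed out". The struct, function names, and message wording below are hypothetical and do not reflect the actual redfishpower output.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tracker for one power-on attempt (illustration only). */
struct poweron_trace {
    bool saw_powering_on;   /* true once "powering on" was observed */
    char last_state[32];    /* most recent status seen while polling */
};

static void trace_init(struct poweron_trace *t)
{
    t->saw_powering_on = false;
    snprintf(t->last_state, sizeof(t->last_state), "%s", "unknown");
}

/* Record one polled status, e.g. "off", "powering on", "on". */
static void trace_update(struct poweron_trace *t, const char *state)
{
    if (strcmp(state, "powering on") == 0)
        t->saw_powering_on = true;
    snprintf(t->last_state, sizeof(t->last_state), "%s", state);
}

/* Format a timeout message that distinguishes a regression
 * (powering on, then off again) from a plain timeout. */
static int trace_timeout_message(const struct poweron_trace *t,
                                 char *buf, size_t len)
{
    if (t->saw_powering_on && strcmp(t->last_state, "off") == 0)
        return snprintf(buf, len,
                        "power on timed out: entered powering on, "
                        "then returned to off");
    return snprintf(buf, len, "power on timed out: last status %s",
                    t->last_state);
}
```

A caller would invoke `trace_update()` after each status poll and `trace_timeout_message()` only once the power-on deadline passes.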
@chu11 chu11 force-pushed the redfishpower_extra_debug branch from f4c4fd6 to fe49841 Compare March 1, 2024 22:35

chu11 commented Mar 1, 2024

> LGTM - how do these changes interact with powerman? I.e. how does the user get the useful information?

At the moment they get it just via telemetry output. Not great, but it's something ... vs the nothing before this (I had to run redfishpower by hand in verbose mode and see what was going on).

Once we work on #79 / #85, perhaps some of it can migrate to the user.


chu11 commented Mar 1, 2024

thanks, setting MWP

@mergify mergify bot merged commit e9376d6 into chaos:master Mar 1, 2024
8 checks passed
@chu11 chu11 deleted the redfishpower_extra_debug branch March 1, 2024 22:42