@mkeeter points out to me that it could be quite useful for the MGS metrics subsystem to report a counter metric tracking the number of task crash dumps present on an SP. This would allow us to easily see which SPs on the rack have crash dumps. It would also indicate the time [1] at which a new crash dump appeared, which approximates when the task crash occurred.
This would probably be recorded under a new target, rather than the hardware_component target used by sensor metrics, as it's scoped to the whole SP rather than a particular hardware device known to the SP.
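As a rough illustration of how such a metric could be consumed, here is a minimal sketch in plain Rust. Everything in it is hypothetical: the `SpTarget` and `DumpCountSample` types, their fields, and both helper functions are made up for this example and do not reflect the actual MGS or oximeter APIs. It assumes each sample carries the wall-clock timestamp assigned by MGS and the number of dumps seen on the SP at that time.

```rust
use std::collections::BTreeMap;

/// Hypothetical target identifying a whole SP rather than one hardware
/// component. Field names are assumptions made for this sketch.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct SpTarget {
    pub slot_kind: String, // e.g. "sled" or "switch" (assumed labels)
    pub slot: u32,
}

/// One sample of the hypothetical crash-dump count metric.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DumpCountSample {
    /// Wall-clock time as understood by MGS, in seconds since the Unix epoch.
    pub timestamp_secs: u64,
    /// Number of task crash dumps present on the SP at that time.
    pub dump_count: u64,
}

/// Returns the timestamp of every sample whose count is higher than the
/// previous sample's count. Each one approximates when a task crashed,
/// to within one sampling interval. Decreases are ignored.
pub fn new_dump_times(samples: &[DumpCountSample]) -> Vec<u64> {
    samples
        .windows(2)
        .filter(|w| w[1].dump_count > w[0].dump_count)
        .map(|w| w[1].timestamp_secs)
        .collect()
}

/// Returns the SPs whose most recent sample reports at least one dump.
pub fn sps_with_dumps(
    latest: &BTreeMap<SpTarget, DumpCountSample>,
) -> Vec<&SpTarget> {
    latest
        .iter()
        .filter(|(_, sample)| sample.dump_count > 0)
        .map(|(target, _)| target)
        .collect()
}
```

Keying the samples by a per-SP target is what makes the "which SPs have dumps" query a simple filter over the latest samples. The crash time recovered this way is only as precise as the sampling interval.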
Footnotes
[1] Wall-clock time as understood by MGS.