Adding monitoring on slashing metrics #166
Comments
Monitoring is already available: https://github.com/status-im/infra-hq/blob/3b0e8240fb83fd84d1a19f43f2d0de45b0ded316/ansible/roles/prometheus-slave/files/rules/nimbus.yml#L6. I couldn't find in the VictorOps history whether the alert was correctly created |
What about metrics? Has the count risen during the Holesky fleet slashing? |
Okay, I see how this works, there's a `` label which shows the whole count for the node:
The rest are per-validator, using the beginning of the public key. But we don't need to look at those; it's easier to look at the total. The issue with Which means that alone is not useful. We need to detect an increase in that metric, not just a value above zero. |
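To illustrate the difference between a "value above zero" check and an "increase" check, here is a minimal Python sketch. The sample values, function names, and the window size are all hypothetical, and the increase calculation is a simplification (last value minus first value in the window) that ignores the counter resets and extrapolation a real Prometheus query would handle.

```python
# Hypothetical scrape samples as (minute, counter value); numbers are illustrative only.
samples = [(0, 0), (1, 0), (2, 3), (3, 3), (4, 3), (5, 3)]


def above_zero(samples, now):
    """Naive check: is the latest counter value at or before `now` greater than zero?"""
    latest = [v for t, v in samples if t <= now][-1]
    return latest > 0


def increased(samples, now, window):
    """Simplified increase: last minus first value inside [now - window, now]."""
    in_window = [v for t, v in samples if now - window <= t <= now]
    return len(in_window) >= 2 and in_window[-1] - in_window[0] > 0


# Evaluate both checks once per minute.
naive = [above_zero(samples, now) for now in range(6)]
delta = [increased(samples, now, window=2) for now in range(6)]
print(naive)
print(delta)
```

In this sketch the naive check stays true forever once the counter is non-zero, while the increase check is true only while the jump is still inside the window.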
For example, here we can see that the count just goes up:
But here we can see it's just a spike when we use
If we increase the time range to 15 minutes we can see more spikes: But it doesn't last, so the alert clears quickly. |
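As a rough illustration of why the alert clears quickly, the Python sketch below counts for how many one-minute evaluations an "increase over a window" condition holds after a single jump in a counter. The data, the function name, and the 60-minute comparison window are assumptions made for the example, and the increase is simplified to last value minus first value in the window.

```python
# Hypothetical counter: one sample per minute, a single jump from 0 to 1 at minute 30.
samples = [(t, 0 if t < 30 else 1) for t in range(0, 181)]


def firing_minutes(samples, window):
    """Count evaluation minutes where the simplified increase over `window` is positive."""
    count = 0
    for now, _ in samples:
        in_window = [v for t, v in samples if now - window <= t <= now]
        if len(in_window) >= 2 and in_window[-1] - in_window[0] > 0:
            count += 1
    return count


short = firing_minutes(samples, window=15)
longer = firing_minutes(samples, window=60)
print(short, longer)
```

Under these assumptions the condition holds for roughly as long as the window itself, so a short range yields a short-lived alert.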
What we use right now is:
There are at least two ways we can improve this:
By using Prometheus sends alerts every time it evaluates the check and the check fails, but once it clears the alert disappears:
|
I see no clear way to make such an alert last using Alertmanager configuration: So it seems the only way to make it last longer is doing some metric query wizardry: |
Also, there is another way, which is not great, but we could ask for another metric:
Not sure if this would be accepted by
|
I guess we never came to a conclusion on how to fix this. Seems like a new metric would make this much easier. |
What
validator_monitor_slashed
Why
This issue follows the incident of 2024/01/31 on Holesky, where validators were slashed due to the BN and VC using the same validator keys.