Provide metrics & status on the scionlab website #301

Open
cmeury opened this issue Sep 1, 2020 · 2 comments
Labels
SRE Deployment&operations issues

Comments

Contributor

cmeury commented Sep 1, 2020

Goals

  • Provide insight into activity of the network for passers-by
  • Operational status for more involved users

Ideas

  • Bandwidth usage at individual APs
  • Overall traffic on the network
  • Number of ASes, infrastructure ASes, hosts, user ASes (active/total)
  • Show overall health of the network, i.e. which infrastructure ASes are currently up/down etc. (maybe color-coding a version of the SCIONLab topology figure)
  • Current commit SHA1 for SCIONLab, and date
  • Status of the attachment points
  • Status of the provided services (bwtestserver, echoserver)

Examples

Contributor Author

cmeury commented Sep 8, 2020

Quick research done:

Contributor

AnotherKamila commented Sep 8, 2020

This is not entirely on-topic, because it is about alerts rather than metrics, but a long, long time ago a certain awesome person and I hacked together evilham/prometheus-adlermanager. The interesting part is the whitelisting by labels, which lets us show only selected data to the public instead of exposing potentially sensitive information. Maybe that code could be useful? That way we might be able to create e.g. a "public" Grafana instance that, through an Adlermanager-like proxy, only has access to whitelisted metrics, or something like that.

EDIT: Um actually this was so long ago I can't find the relevant code XD But anyway, the point was that we might want to consider proxying Prometheus to only let the public see whitelisted metrics, and then use Prometheus-compatible tools to display/query/whatever.
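
To make the proxying idea a bit more concrete, here is a rough sketch of label-based filtering. It is not the Adlermanager code (which, as said, could not be found), just an illustration under assumptions: the `visibility="public"` label and the function name `filter_public_series` are made up for this example, and the input is assumed to be shaped like the `result` list of a Prometheus HTTP API instant-query response.

```python
from typing import Dict, List

# Hypothetical allowlist: only series carrying this label/value pair are public.
PUBLIC_LABELS: Dict[str, str] = {"visibility": "public"}


def filter_public_series(result: List[dict], required: Dict[str, str]) -> List[dict]:
    """Return only the series whose labels contain every required key/value pair."""
    return [
        series
        for series in result
        if all(series.get("metric", {}).get(k) == v for k, v in required.items())
    ]


# Sample data shaped like the "result" list of an instant-query response.
sample = [
    {"metric": {"__name__": "up", "job": "ap", "visibility": "public"},
     "value": [1599555600, "1"]},
    {"metric": {"__name__": "up", "job": "internal-db"},
     "value": [1599555600, "1"]},
]

public = filter_public_series(sample, PUBLIC_LABELS)
print([s["metric"]["job"] for s in public])
```

A real proxy would sit between the public dashboard and Prometheus and apply a filter like this to every response. It would probably also need to restrict which queries are accepted at all, since filtering results alone may not be enough to hide sensitive series.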

(Also, but this is completely OT: do we want to deploy Adlermanager as a SCIONLab status page? The code was a result of a 48h hackathon, so it needs some love, but it could be in better shape with a day or two of work.)

@cmeury cmeury added the SRE Deployment&operations issues label Sep 9, 2020