Monitoring and alerting overview
Cintia Del Rio edited this page Jan 28, 2025 · 1 revision
Our main internal monitoring is based on Datadog. It reports when machines are down or running out of disk, memory, or CPU. These alerts are sent by infrastructure email.
Pingdom runs HTTP checks on our public services. Pingdom also creates tickets in the helpdesk.
We also have PagerDuty accounts, which alert whoever is on call. PagerDuty can be triggered by critical helpdesk tickets and by Pingdom alerts.
We also have a dashboard showing the status of our infrastructure, with data coming from Pingdom.
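The alert flow described so far can be summarized as a small routing table. The following is only an illustrative sketch: the `Alert` class, the `route` function, and the destination labels are hypothetical names, not real Datadog, Pingdom, helpdesk, or PagerDuty API calls. It also assumes that every Pingdom alert pages the on-call person, which the actual configuration may restrict.

```python
from dataclasses import dataclass


# Hypothetical model of the alert flow; none of these names come from a
# real Datadog, Pingdom, helpdesk, or PagerDuty API.
@dataclass(frozen=True)
class Alert:
    source: str            # "datadog", "pingdom", or "helpdesk"
    critical: bool = False  # only meaningful for helpdesk tickets here


def route(alert: Alert) -> list[str]:
    """Return the destinations an alert reaches, as described on this page."""
    if alert.source == "datadog":
        # Machine down, low disk, memory, or CPU: sent by infrastructure email.
        return ["infrastructure-email"]
    if alert.source == "pingdom":
        # Failed HTTP check: opens a helpdesk ticket and (assumed) pages on-call.
        return ["helpdesk-ticket", "pagerduty"]
    if alert.source == "helpdesk" and alert.critical:
        # Critical tickets page whoever is on call.
        return ["pagerduty"]
    # Non-critical tickets and unknown sources do not page anyone.
    return []
```

For example, `route(Alert("helpdesk", critical=True))` reaches PagerDuty, while a non-critical helpdesk ticket does not page anyone.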
Read this before updating this wiki.