Prometheus is not limited to recording metrics. One of its core features is the ability to define alerts and send them to an alert management endpoint of our choosing; in this hands-on lab, that endpoint is Alertmanager, Prometheus's own companion project. Once the desired alerting thresholds are defined, we set up routes and receivers in Alertmanager so that notifications reach the right people at the right frequency and with the right information.
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Add a Rules File
Add a rules file configuration to the Prometheus config:
sudo $EDITOR /etc/prometheus/prometheus.yml

rule_files:
  - "rules.yml"
Save and exit.
Create and open the rules.yml file:
sudo $EDITOR /etc/prometheus/rules.yml
- Add an Alert to Track Uptime
Before creating the alert itself, create a recording rule for the desired metric:
groups:
  - name: uptime
    rules:
      - record: job:uptime:average:ft
        expr: avg without (instance) (up{job="forethought"})
Create an alert based on this recording:
groups:
  - name: uptime
    rules:
      - record: job:uptime:average:ft
        expr: avg without (instance) (up{job="forethought"})
      - alert: ForethoughtApplicationDown
        expr: job:uptime:average:ft < .75
        for: 30s
        labels:
          severity: page
          team: devops
Save and exit.
Restart Prometheus:
sudo systemctl restart prometheus
sudo systemctl status prometheus
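To make the rule's behavior concrete, here is a minimal Python sketch of how the expression and the for: 30s clause could be modeled. It is not Prometheus's implementation; the instance names and the 15-second spacing between evaluations are assumptions chosen for illustration.

```python
from statistics import mean

THRESHOLD = 0.75   # from the alert expression: job:uptime:average:ft < .75
FOR_SECONDS = 30   # from the rule's "for: 30s"


def uptime_average(up_by_instance):
    """Model of avg without (instance) (up{job="forethought"})."""
    return mean(up_by_instance.values())


def alert_state(samples):
    """Return 'inactive', 'pending', or 'firing' after the last evaluation.

    samples: list of (timestamp_in_seconds, {instance: up_value}) in time order.
    """
    active_since = None
    state = "inactive"
    for timestamp, up_by_instance in samples:
        if uptime_average(up_by_instance) < THRESHOLD:
            # Remember when the condition first became true.
            if active_since is None:
                active_since = timestamp
            held = timestamp - active_since
            state = "firing" if held >= FOR_SECONDS else "pending"
        else:
            # Any evaluation above the threshold resets the timer.
            active_since = None
            state = "inactive"
    return state


if __name__ == "__main__":
    # Hypothetical instance names; 1 means the scrape succeeded, 0 means it failed.
    one_down = {"app1": 1, "app2": 1, "app3": 1, "app4": 0}
    two_down = {"app1": 1, "app2": 1, "app3": 0, "app4": 0}
    print(alert_state([(0, one_down), (15, one_down), (30, one_down)]))  # inactive
    print(alert_state([(0, two_down), (15, two_down)]))                  # pending
    print(alert_state([(0, two_down), (15, two_down), (30, two_down)]))  # firing
```

With four instances, a single failure yields an average of exactly 0.75, which does not satisfy < .75, so under this model the alert only becomes pending once two instances are down. On a host with Prometheus installed, promtool check rules /etc/prometheus/rules.yml can be used to validate the rules file itself.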
- Configure Alertmanager to Use an SMTP Smarthost
Open the Alertmanager configuration file:
sudo $EDITOR /etc/alertmanager/alertmanager.yml
Define the global settings:
global:
  resolve_timeout: 5m
  smtp_smarthost: 'localhost:25'
  smtp_from: 'prometheus'
- Set up Alertmanager Routing
Set up the backup route:
route:
  receiver: 'email_backup'
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
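The three timing settings in this route interact, and a small simulation may help show how. The sketch below is a simplified model, not Alertmanager's code: it assumes a single alert that keeps firing, a group whose membership never changes, and a hypothetical function name. The group is first flushed after group_wait, then re-evaluated every group_interval, and a notification is re-sent only once repeat_interval has passed since the last one.

```python
def notification_times(group_wait, group_interval, repeat_interval, horizon):
    """Return the seconds at which a notification would be sent.

    Simplified model: one unchanged, continuously firing alert group,
    observed from time 0 up to and including `horizon` seconds.
    """
    times = []
    last_sent = None
    tick = group_wait  # first flush happens after group_wait
    while tick <= horizon:
        if last_sent is None or tick - last_sent >= repeat_interval:
            times.append(tick)
            last_sent = tick
        tick += group_interval  # later flushes happen every group_interval
    return times


if __name__ == "__main__":
    # Values from the route above: 10s, 10s, and 1m (60s).
    print(notification_times(10, 10, 60, horizon=150))  # [10, 70, 130]
```

A repeat_interval of 1m is convenient for seeing repeated emails quickly in a lab; production configurations commonly use much longer values.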
Set up the route for critical alerts:
route:
  receiver: 'email_backup'
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
  routes:
    - match:
        severity: 'critical'
      group_by: ['team']
      receiver: 'email_pager'
Set up the route for team alerts:
route:
  receiver: 'email_backup'
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
  routes:
    - match:
        severity: 'critical'
      group_by: ['team']
      receiver: 'email_pager'
      routes:
        - match:
            team: devops
          receiver: 'email_devops'
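The routing tree above can be read as a depth-first search for the most specific matching route. The following Python sketch models that lookup under simplifying assumptions: only exact match conditions are supported, the first matching child wins (as when continue is left at its default), and the dictionary layout and function names are hypothetical rather than Alertmanager's own.

```python
# Route tree mirroring the configuration above (timing and grouping omitted).
ROUTE_TREE = {
    "receiver": "email_backup",
    "routes": [
        {
            "match": {"severity": "critical"},
            "receiver": "email_pager",
            "routes": [
                {"match": {"team": "devops"}, "receiver": "email_devops"},
            ],
        },
    ],
}


def matches(route, labels):
    """True if every label in the route's match block equals the alert's label."""
    return all(labels.get(key) == value
               for key, value in route.get("match", {}).items())


def pick_receiver(route, labels):
    """Return the receiver of the deepest route whose conditions all match."""
    for child in route.get("routes", []):
        if matches(child, labels):
            return pick_receiver(child, labels)
    # No child matched, so this route handles the alert.
    return route["receiver"]


if __name__ == "__main__":
    print(pick_receiver(ROUTE_TREE, {"severity": "critical", "team": "devops"}))  # email_devops
    print(pick_receiver(ROUTE_TREE, {"severity": "critical", "team": "web"}))     # email_pager
    print(pick_receiver(ROUTE_TREE, {"severity": "page", "team": "devops"}))      # email_backup
```

Under this model, an alert labeled severity: page, as in the ForethoughtApplicationDown rule, does not match the severity: 'critical' route and is therefore delivered to email_backup; it would reach email_devops only if its severity label were critical. On a host with Alertmanager installed, amtool config routes test can be used to check how the real configuration routes a given label set.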
- Create the Needed Receivers
Create the receivers:
receivers:
  - name: 'email_backup'
    email_configs:
      - to: 'alerts@forethoughtapp.io'
  - name: 'email_pager'
    email_configs:
      - to: 'oncall@forethoughtapp.io'
  - name: 'email_devops'
    email_configs:
      - to: 'devops@forethoughtapp.io'
Save and exit.
Restart Alertmanager:
sudo systemctl restart alertmanager
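Every receiver named in the routing tree must also be defined under receivers, otherwise Alertmanager rejects the configuration. As a rough illustration of that consistency check, the Python sketch below walks a route tree and reports any receiver names that have no definition. The data structures and function names are hypothetical stand-ins for the parsed YAML, not part of Alertmanager.

```python
# Stand-ins for the parsed route tree and receivers list from alertmanager.yml.
ROUTE = {
    "receiver": "email_backup",
    "routes": [
        {
            "receiver": "email_pager",
            "routes": [{"receiver": "email_devops"}],
        },
    ],
}
RECEIVERS = [
    {"name": "email_backup"},
    {"name": "email_pager"},
    {"name": "email_devops"},
]


def referenced_receivers(route):
    """Collect every receiver name used anywhere in the route tree."""
    names = set()
    if "receiver" in route:
        names.add(route["receiver"])
    for child in route.get("routes", []):
        names |= referenced_receivers(child)
    return names


def undefined_receivers(route, receivers):
    """Return the sorted receiver names that routes use but nothing defines."""
    defined = {receiver["name"] for receiver in receivers}
    return sorted(referenced_receivers(route) - defined)


if __name__ == "__main__":
    print(undefined_receivers(ROUTE, RECEIVERS))      # []
    print(undefined_receivers(ROUTE, RECEIVERS[:2]))  # ['email_devops']
```

On a host with Alertmanager installed, amtool check-config /etc/alertmanager/alertmanager.yml performs the real validation, and sudo systemctl status alertmanager shows whether the service came back up after the restart.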