Using Alertmanager with Prometheus

30 minutes
  • 5 Learning Objectives

About this Hands-on Lab

Prometheus does more than record metrics. One of its core capabilities is defining alerting rules and routing the resulting alerts to any alert management endpoint we choose, which in this Hands-On Lab is Prometheus's own companion project, Alertmanager.

Once we have defined our desired alerting thresholds, we need to set up routes and receivers in Alertmanager, ensuring notifications reach the right people at the correct frequency and with the right information.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Add a Rules File
  1. Add a rules file configuration to the Prometheus config:

    $ sudo $EDITOR /etc/prometheus/prometheus.yml

    rule_files:
      - "rules.yml"

    Save and exit.

  2. Create and open the rules.yml file:

    $ sudo $EDITOR /etc/prometheus/rules.yml

Add an alert to track uptime
  1. Before creating the alert itself, create a recording rule for the desired metric:

    groups:
      - name: uptime
        rules:
          - record: job:uptime:average:ft
            expr: avg without (instance) (up{job="forethought"})
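
    To see what this recording rule produces, here is a quick illustration; the instance labels shown are hypothetical and the lab's real targets may differ:

    # Assuming the "forethought" job scrapes the two application servers:
    #   up{job="forethought", instance="app1:80"} = 1   (healthy)
    #   up{job="forethought", instance="app2:80"} = 0   (down)
    # Averaging away the instance label yields:
    #   job:uptime:average:ft = 0.5
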
  2. Create the alert that fires when the application has gone down, based on this recording rule:

    groups:
      - name: uptime
        rules:
          - record: job:uptime:average:ft
            expr: avg without (instance) (up{job="forethought"})
          - alert: ForethoughtApplicationDown
            expr: job:uptime:average:ft < .75
            for: 1m
            labels:
              severity: critical
              team: devops

    Save and exit.
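
    Before restarting Prometheus, you can optionally validate the new rules. Assuming the promtool utility that ships with Prometheus is installed on the monitoring server:

    $ promtool check rules /etc/prometheus/rules.yml
    $ promtool check config /etc/prometheus/prometheus.yml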

  3. Restart Prometheus:

    $ sudo systemctl restart prometheus
    $ sudo systemctl status prometheus
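
    To confirm the rule group loaded, check the Rules page of the Prometheus web UI or query the rules API (assuming Prometheus is listening on its default port, 9090):

    $ curl -s localhost:9090/api/v1/rules

    The response should list the uptime group containing both the recording rule and the ForethoughtApplicationDown alert.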

Configure Alertmanager to use an SMTP smarthost
  1. Open the Alertmanager configuration file:

    $ sudo $EDITOR /etc/alertmanager/alertmanager.yml

  2. Define the global settings:

    global:
      resolve_timeout: 5m
      smtp_smarthost: 'localhost:25'
      smtp_from: 'prometheus'
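
    Note that smtp_smarthost assumes a mail transfer agent is listening on port 25 of the localhost. If you want to verify that on the monitoring server, one quick check is:

    $ ss -ltn | grep ':25'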

Set up Alertmanager routing
  1. Set up the backup route:

    route:
      receiver: 'email_backup'
      group_by: ['alertname']
      group_wait: 10s
      group_interval: 10s
      repeat_interval: 1m

  2. Set up the route for critical alerts:

    route:
      receiver: 'email_backup'
      group_by: ['alertname']
      group_wait: 10s
      group_interval: 10s
      repeat_interval: 1m
      routes:
        - match:
            severity: 'critical'
          group_by: ['team']
          receiver: 'email_pager'
  3. Set up the route for team alerts:

    route:
      receiver: 'email_backup'
      group_by: ['alertname']
      group_wait: 10s
      group_interval: 10s
      repeat_interval: 1m
      routes:
        - match:
            severity: 'critical'
          group_by: ['team']
          receiver: 'email_pager'
          routes:
            - match:
                team: devops
              receiver: 'email_devops'

Create the needed receivers
  1. Create the receivers:

    receivers:
      - name: 'email_backup'
        email_configs:
          - to: 'alerts@forethoughtapp.io'
      - name: 'email_pager'
        email_configs:
          - to: 'oncall@forethoughtapp.io'
      - name: 'email_devops'
        email_configs:
          - to: 'devops@forethoughtapp.io'

    Save and exit.
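
    If the amtool utility that ships with Alertmanager is available, you can also validate the finished file before restarting the service:

    $ amtool check-config /etc/alertmanager/alertmanager.yml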

  2. Restart Alertmanager:

    $ sudo systemctl restart alertmanager
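
    Optionally, confirm the routing tree behaves as expected. If amtool is installed, it can simulate which receiver an alert carrying our rule's labels would reach; given the configuration above, it should resolve to email_devops:

    $ amtool config routes test --config.file=/etc/alertmanager/alertmanager.yml severity=critical team=devops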

Additional Resources

Now that your team has Prometheus set up and monitoring your systems, you're tasked with configuring alerting so your monitoring stack does more than just track data: it also alerts the right people when something goes wrong.

Before you set up your Alertmanager routes, craft a test alert that triggers whenever more than 25% of the forethought endpoints are down. Give it the following labels:

  • severity: critical
  • team: devops

And ensure the for time is set to 1 minute. Save the rule to a file called rules.yml in /etc/prometheus.

Once the alert is set up, you need to create a series of Alertmanager routes that satisfy the following requirements:

  • Set the following global settings:
    • An SMTP smarthost on the localhost
    • Set the email sender name to prometheus
  • An overall backup route that groups by alert name and uses the email_backup receiver
    • This route should be set up to use the default group_by, group_wait, and repeat_interval times
  • A route that matches any tickets with a severity of critical
    • Send these to the email_pager
    • Group by team
  • A route that matches for the team called devops
    • Send these to the email_devops receiver
  • Three email receivers:
    • email_backup that sends an email to alerts@forethoughtapp.io
    • email_pager that sends an email to oncall@forethoughtapp.io
    • email_devops that sends an email to devops@forethoughtapp.io

Three servers are provided for this Hands-On Lab: one monitoring server and two application servers, both of which are already being monitored by Prometheus. The monitoring server already has Prometheus, Alertmanager, and Grafana installed, with endpoints set up. You can use systemctl to manage any of these services.

To test the alert, stop the application container on one of the application servers by running sudo docker stop ft-app.
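
After the container stops, the job:uptime:average:ft value for the forethought job drops to 0.5 (one of two instances down), which is below the 0.75 threshold. Once the for duration has elapsed, the alert should move from pending to firing; you can watch this on the Alerts page of the Prometheus web UI or, assuming the default port, via the API:

    $ curl -s localhost:9090/api/v1/alerts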

