Configuring Prometheus Alertmanager for High Availability

1 hour
  • 2 Learning Objectives

About this Hands-on Lab

Prometheus Alertmanager handles the alerts that Prometheus fires. However, a single Alertmanager instance is a single point of failure: if it goes down, no alerts are delivered. To provide resilience, you can run Alertmanager as a multi-instance cluster. In this hands-on lab, you will make an existing single-instance Alertmanager setup highly available by adding a second instance.

Learning Objectives

Successfully complete this lab by achieving the following learning objectives:

Configure the Two Alertmanager Instances to Form a Cluster
  1. Log in to both the Prometheus Server and Alertmanager 2 server.

  2. On both servers, edit the Alertmanager unit file:

    sudo vi /etc/systemd/system/alertmanager.service
  3. Locate the ExecStart directive. On each server, append the --cluster.peer flag, set to the other server’s private IP address and the cluster port (9094).

  4. On the Prometheus Server:

    ExecStart=/usr/local/bin/alertmanager \
      --config.file /etc/alertmanager/alertmanager.yml \
      --storage.path /var/lib/alertmanager/ \
      --cluster.peer=10.0.1.102:9094
  5. On the Alertmanager 2 server:

    ExecStart=/usr/local/bin/alertmanager \
      --config.file /etc/alertmanager/alertmanager.yml \
      --storage.path /var/lib/alertmanager/ \
      --cluster.peer=10.0.1.101:9094
  6. On both servers, reload the unit file:

    sudo systemctl daemon-reload
  7. On the Prometheus Server, restart Alertmanager:

    sudo systemctl restart alertmanager
  8. On the Alertmanager 2 server, enable and start Alertmanager:

    sudo systemctl enable alertmanager
    sudo systemctl start alertmanager
  9. Test your cluster setup by creating a silence on one instance and verifying it appears on the other instance. Access both instances in a browser: http://<PUBLIC_IP>:9093

  10. On one instance, click Silences and create a new silence.

  11. Click Silences on the other instance and verify the silence you created appears.
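
The browser check in steps 9 to 11 can also be approximated from the command line. The sketch below is not part of the lab. It assumes the Alertmanager v2 HTTP API (/api/v2/silences) is reachable on port 9093 of each instance, and silence_ids is a hypothetical helper name. It extracts silence IDs with plain text matching rather than a real JSON parser, so treat it as a quick sanity check only.

```shell
#!/bin/sh
# silence_ids (hypothetical helper): read the JSON returned by
# GET /api/v2/silences on stdin and print one silence ID per line, sorted.
# Uses grep/sed text matching, not a JSON parser, so it is only a rough check.
silence_ids() {
  grep -Eo '"id"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/.*"\([^"]*\)"$/\1/' |
    sort
}

# Example usage (assumes both private IPs from the lab are reachable):
#   curl -s http://10.0.1.101:9093/api/v2/silences | silence_ids > /tmp/am1.ids
#   curl -s http://10.0.1.102:9093/api/v2/silences | silence_ids > /tmp/am2.ids
#   diff /tmp/am1.ids /tmp/am2.ids && echo "silences match"
```

If the two ID lists are identical, the silence created on one instance has been replicated to the other.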

Configure Prometheus to Use Your Multi-Instance Alertmanager Setup
  1. On the Prometheus Server, edit the Prometheus configuration file:

    sudo vi /etc/prometheus/prometheus.yml
  2. Add the new Alertmanager (10.0.1.102:9093) to the list of Alertmanager targets:

    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          - localhost:9093
          - 10.0.1.102:9093
  3. Restart Prometheus to reload the config:

    sudo systemctl restart prometheus
  4. Access the Prometheus server in a browser: http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090

  5. Click Status > Runtime & Build Information.

  6. Verify both of your Alertmanagers appear under the Alertmanagers section.
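
Before restarting Prometheus, it can help to confirm that the edited file lists both Alertmanager targets. The following is a rough sketch, not part of the lab: missing_targets is a hypothetical helper that does a plain text search for each expected "- host:port" list entry, so it does not validate YAML structure. For a real syntax check, promtool check config /etc/prometheus/prometheus.yml (promtool ships with Prometheus) is the better tool.

```shell
#!/bin/sh
# missing_targets (hypothetical helper): print every expected target that
# does not appear as a "- host:port" list entry in the given config file.
# Usage: missing_targets CONFIG_FILE TARGET...
missing_targets() {
  config_file=$1
  shift
  for target in "$@"; do
    # -F: match the literal string; -q: quiet, only set the exit status
    grep -qF -- "- $target" "$config_file" || printf '%s\n' "$target"
  done
}

# Example usage with the targets from this lab:
#   missing_targets /etc/prometheus/prometheus.yml localhost:9093 10.0.1.102:9093
# No output means both targets were found.
```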

Additional Resources

Your company, LimeDrop, is using Alertmanager to handle Prometheus alerts. Recently, there was an outage in part of the data center. The single Alertmanager instance ran on one of the affected servers, and the outage went undetected for a few hours because no one received an alert. Your task is to build a multi-instance configuration by adding a second Alertmanager instance.

The Prometheus server has the first Alertmanager instance running on it. A server called Alertmanager 2 has already been set up. Alertmanager 2 will run the second Alertmanager instance. Alertmanager is installed there, but is not running.

Configure both Alertmanager instances to run as a two-instance cluster. Then, configure the Prometheus server to send alerts to both instances.
