
Configuring Prometheus Alertmanager for High Availability

Prometheus Alertmanager handles the alerts that Prometheus sends. However, a single Alertmanager instance is a single point of failure: if it goes down, alert notifications stop. Alertmanager can instead run as a multi-instance cluster to provide failure resilience. In this hands-on lab, you will make an existing single-instance Alertmanager setup highly available by adding a second instance.


Path Info

Level: Intermediate
Duration: 1h 0m
Published: Jun 12, 2020


Table of Contents

  1. Challenge

    Configure the Two Alertmanager Instances to Form a Cluster

    1. Log in to both the Prometheus Server and Alertmanager 2 server.

    2. On both servers, edit the Alertmanager unit file:

      sudo vi /etc/systemd/system/alertmanager.service
      
    3. Locate the ExecStart line. On each server, add the other server's private IP address and cluster port (9094) using the --cluster.peer flag.

    4. On the Prometheus Server:

      ExecStart=/usr/local/bin/alertmanager \
        --config.file /etc/alertmanager/alertmanager.yml \
        --storage.path /var/lib/alertmanager/ \
        --cluster.peer=10.0.1.102:9094
      
    5. On the Alertmanager 2 server:

      ExecStart=/usr/local/bin/alertmanager \
        --config.file /etc/alertmanager/alertmanager.yml \
        --storage.path /var/lib/alertmanager/ \
        --cluster.peer=10.0.1.101:9094
      
    6. On both servers, reload the unit file:

      sudo systemctl daemon-reload
      
    7. On the Prometheus Server, restart Alertmanager:

      sudo systemctl restart alertmanager
      
    8. On the Alertmanager 2 server, enable and start Alertmanager:

      sudo systemctl enable alertmanager
      
      sudo systemctl start alertmanager
      
    9. Test your cluster setup by creating a silence on one instance and verifying that it appears on the other. Open each instance in a browser, using that server's public IP: http://<PUBLIC_IP>:9093

    10. On one instance, click Silences and create a new silence.

    11. Click Silences on the other instance and verify the silence you created appears.
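
The browser check above can also be approximated from the command line. The following is a minimal sketch, not part of the original lab: it assumes curl is installed on the server and that each Alertmanager serves its v2 API on port 9093. The function names am_url, count_peers, and check_cluster are hypothetical helpers, and the peer count relies on a rough text match rather than a real JSON parser.

```shell
# Hypothetical helpers for checking Alertmanager cluster membership.
# The IPs used in the examples are the lab's private addresses.

# Print the base URL of an Alertmanager instance (web/API port 9093).
am_url() {
  printf 'http://%s:9093\n' "$1"
}

# Count the peers listed in an /api/v2/status response read from stdin.
# This counts "address" keys as a rough text match; it is not a JSON parser.
count_peers() {
  grep -o '"address"' | wc -l | tr -d ' '
}

# Fetch the cluster status from one instance and print its peer count.
check_cluster() {
  curl -s "$(am_url "$1")/api/v2/status" | count_peers
}

# Example (run on either lab server):
#   check_cluster 10.0.1.101
#   check_cluster 10.0.1.102
```

In a healthy two-node cluster, each call would be expected to print 2, since the peer list normally includes the instance being queried. A lower number suggests the --cluster.peer address is wrong or that port 9094 is not reachable between the servers.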

  2. Challenge

    Configure Prometheus to Use Your Multi-Instance Alertmanager Setup

    1. On the Prometheus Server, edit the Prometheus configuration file:

      sudo vi /etc/prometheus/prometheus.yml
      
    2. Add the new Alertmanager (10.0.1.102:9093) to the list of Alertmanager targets:

      alerting:
        alertmanagers:
        - static_configs:
          - targets:
            - localhost:9093
            - 10.0.1.102:9093
      
    3. Restart Prometheus to reload the config:

      sudo systemctl restart prometheus
      
    4. Access the Prometheus server in a browser: http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090

    5. Click Status > Runtime & Build Information.

    6. Verify both of your Alertmanagers appear under the Alertmanagers section.
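
The same verification can be sketched from the command line using the Prometheus HTTP API, which exposes discovered Alertmanagers at /api/v1/alertmanagers. This is an illustrative sketch under stated assumptions: curl is available on the Prometheus Server, Prometheus listens on localhost:9090, and the response lists activeAlertmanagers before droppedAlertmanagers on a single line. The function names are hypothetical, and the count uses a rough text match rather than a JSON parser.

```shell
# Hypothetical helpers for checking which Alertmanagers Prometheus is using.

# Count active Alertmanagers in an /api/v1/alertmanagers response on stdin.
# Everything from "droppedAlertmanagers" onward is discarded first, so only
# "url" keys in the active list are counted.
count_active_alertmanagers() {
  sed 's/"droppedAlertmanagers".*//' | grep -o '"url"' | wc -l | tr -d ' '
}

# Query the local Prometheus server and print the active Alertmanager count.
check_prometheus_alertmanagers() {
  curl -s "http://localhost:9090/api/v1/alertmanagers" | count_active_alertmanagers
}

# Example (run on the Prometheus Server after the restart):
#   check_prometheus_alertmanagers
```

With both targets from prometheus.yml picked up, the call would be expected to print 2. If it prints 1, recheck the targets list and confirm that 10.0.1.102:9093 is reachable from the Prometheus Server.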

The Cloud Content team comprises subject matter experts hyper focused on services offered by the leading cloud vendors (AWS, GCP, and Azure), as well as cloud-related technologies such as Linux and DevOps. The team is thrilled to share their knowledge to help you build modern tech solutions from the ground up, secure and optimize your environments, and so much more!
