Prometheus Alertmanager is a great way to handle your Prometheus alerts. However, a lone Alertmanager instance is a single point of failure. Luckily, you can configure Alertmanager to run in a multi-instance cluster to provide failure resilience. In this hands-on lab, you will make an existing single-instance Alertmanager setup highly available by adding a second instance.
Learning Objectives
Successfully complete this lab by achieving the following learning objectives:
- Configure the Two Alertmanager Instances to Form a Cluster
Log in to both the Prometheus Server and the Alertmanager 2 server.
On both servers, edit the Alertmanager unit file:
sudo vi /etc/systemd/system/alertmanager.service
Locate the ExecStart line. On each server, add the other server’s private IP address using the --cluster.peer flag.
On the Prometheus Server:
ExecStart=/usr/local/bin/alertmanager --config.file /etc/alertmanager/alertmanager.yml --storage.path /var/lib/alertmanager/ --cluster.peer=10.0.1.102:9094
On the Alertmanager 2 server:
ExecStart=/usr/local/bin/alertmanager --config.file /etc/alertmanager/alertmanager.yml --storage.path /var/lib/alertmanager/ --cluster.peer=10.0.1.101:9094
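The two ExecStart lines differ only in the peer address, so they can be produced from one template. The following is a minimal sketch, not part of Alertmanager itself: the function name alertmanager_exec_start is hypothetical, and the paths and the 9094 cluster port are simply the ones used in this lab.

```shell
# Hypothetical helper (not an Alertmanager tool): print the ExecStart line
# used in this lab, given the OTHER server's private IP address.
alertmanager_exec_start() {
  peer_ip="$1"
  printf 'ExecStart=/usr/local/bin/alertmanager --config.file /etc/alertmanager/alertmanager.yml --storage.path /var/lib/alertmanager/ --cluster.peer=%s:9094\n' "$peer_ip"
}

# Example: the line for the Prometheus Server, whose peer is Alertmanager 2.
alertmanager_exec_start 10.0.1.102
```

Each server points at the other one, so the Alertmanager 2 server would call the helper with 10.0.1.101 instead.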
On both servers, reload the unit file:
sudo systemctl daemon-reload
On the Prometheus Server, restart Alertmanager:
sudo systemctl restart alertmanager
On the Alertmanager 2 server, enable and start Alertmanager:
sudo systemctl enable alertmanager
sudo systemctl start alertmanager
Test your cluster setup by creating a silence on one instance and verifying it appears on the other instance. Access both instances in a browser:
http://<PUBLIC_IP>:9093
On one instance, click Silences and create a new silence.
Click Silences on the other instance and verify the silence you created appears.
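The cluster membership can also be checked roughly from the command line. This is a minimal sketch, assuming curl is installed and that the instance serves its v2 HTTP API status at /api/v2/status. Both helper names are hypothetical, and the grep-based counting is a crude stand-in for a real JSON parser.

```shell
# Hypothetical helper: count the cluster peers listed in an Alertmanager
# /api/v2/status JSON document read from stdin.
# Crude parsing: counts "address" keys, which appear once per peer entry.
count_cluster_peers() {
  grep -o '"address"' | wc -l | tr -d ' '
}

# Hypothetical wrapper: fetch the status from one instance and count its peers.
# Assumes curl is available; $1 is a base URL such as http://localhost:9093
cluster_peer_count() {
  curl -s "$1/api/v2/status" | count_cluster_peers
}
```

Running cluster_peer_count http://localhost:9093 on either server and getting 2 would suggest that both instances have joined the cluster, since the peer list normally includes the local instance.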
- Configure Prometheus to Use Your Multi-Instance Alertmanager Setup
On the Prometheus Server, edit the Prometheus configuration file:
sudo vi /etc/prometheus/prometheus.yml
Add the new Alertmanager (10.0.1.102:9093) to the list of Alertmanager targets:
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093
      - 10.0.1.102:9093
Restart Prometheus to reload the config:
sudo systemctl restart prometheus
Access the Prometheus server in a browser:
http://<PROMETHEUS_SERVER_PUBLIC_IP>:9090
Click Status > Runtime & Build Information.
Verify both of your Alertmanagers appear under the Alertmanagers section.
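The same information is available from the Prometheus HTTP API at /api/v1/alertmanagers, which reports active and dropped Alertmanagers. This is a rough sketch, assuming curl is installed and that the activeAlertmanagers list precedes droppedAlertmanagers in the response. The helper names are hypothetical, and the text-based parsing is only an approximation of proper JSON handling.

```shell
# Hypothetical helper: count active Alertmanagers in a Prometheus
# /api/v1/alertmanagers JSON response read from stdin.
# Crude parsing: flatten to one line, drop everything from the
# "droppedAlertmanagers" key onward, then count the remaining "url" keys.
count_active_alertmanagers() {
  tr -d '\n' | sed 's/"droppedAlertmanagers".*//' | grep -o '"url"' | wc -l | tr -d ' '
}

# Hypothetical wrapper: query a Prometheus server and count its active
# Alertmanagers. $1 is a base URL such as http://localhost:9090
active_alertmanager_count() {
  curl -s "$1/api/v1/alertmanagers" | count_active_alertmanagers
}
```

After this lab's changes, active_alertmanager_count http://localhost:9090 on the Prometheus Server should report 2 if both targets were picked up.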