ZookeeperTooManyLeaders #
Zookeeper cluster has too many nodes marked as leader
Alert Rule
alert: ZookeeperTooManyLeaders
annotations:
  description: |-
    Zookeeper cluster has too many nodes marked as leader
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/dabealu-zookeeper-exporter/zookeepertoomanyleaders/
  summary: Zookeeper Too Many Leaders (instance {{ $labels.instance }})
expr: sum(zk_server_leader) > 1
for: 0m
labels:
  severity: critical
Meaning #
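This alert fires when `sum(zk_server_leader)` is greater than 1, that is, when more than one scraped Zookeeper node reports itself as the leader. A healthy Zookeeper cluster has exactly one leader at a time. Because the expression sums over every scraped instance, it also fires when one Prometheus server scrapes several independent clusters that each have their own leader, so first confirm that the leaders belong to the same cluster. Multiple leaders within one cluster can cause inconsistencies and errors.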
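To make the alert condition concrete, here is a minimal Python sketch of what `sum(zk_server_leader) > 1` computes. The sample values, instance names, and the `cluster` label are hypothetical and only illustrate the aggregation. The expression has no `by` clause, so leaders from separately scraped clusters are added together.

```python
from collections import defaultdict

# Hypothetical samples of zk_server_leader as (labels, value) pairs.
# Instance names and the "cluster" label are illustrative assumptions.
samples = [
    ({"instance": "zk-a-1:9141", "cluster": "a"}, 1),
    ({"instance": "zk-a-2:9141", "cluster": "a"}, 0),
    ({"instance": "zk-a-3:9141", "cluster": "a"}, 0),
    ({"instance": "zk-b-1:9141", "cluster": "b"}, 1),
    ({"instance": "zk-b-2:9141", "cluster": "b"}, 0),
    ({"instance": "zk-b-3:9141", "cluster": "b"}, 0),
]


def total_leaders(samples):
    """Mimic sum(zk_server_leader): one total across all series."""
    return sum(value for _, value in samples)


def leaders_by(samples, label):
    """Mimic sum by (<label>) (zk_server_leader): one total per label value."""
    totals = defaultdict(int)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)


# The alert condition is true even though each cluster has a single leader.
alert_fires = total_leaders(samples) > 1
per_cluster = leaders_by(samples, "cluster")
```

In this made-up data the alert condition holds while each cluster has exactly one leader. Grouping by a label that identifies the cluster separates that case from a real multi-leader condition, provided such a label exists in your setup.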
Impact #
Having multiple leader nodes in a Zookeeper cluster can lead to:
- Inconsistent data across the cluster
- Errors in client applications that rely on Zookeeper
- Unstable cluster behavior
- Potential data loss or corruption
Diagnosis #
To diagnose the issue, follow these steps:
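- Check the status of every node, for example with `zkServer.sh status` or the `srvr` four-letter command (which must be allowed by `4lw.commands.whitelist` on recent Zookeeper versions), or by inspecting `zk_server_leader` per instance in Zookeeper Exporter.
- Confirm which nodes report themselves as the leader and that they all belong to the same cluster.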
- Check the Zookeeper server logs for any errors or warnings related to leadership elections or node failures.
- Check the network connectivity between the Zookeeper nodes, in particular on the quorum and leader-election ports configured in the `server.N` entries of `zoo.cfg` (commonly 2888 and 3888).
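The status check in the steps above can be scripted. The following is a minimal sketch, assuming the `srvr` four-letter command is enabled on the client port of each node. The helper names (`four_letter_word`, `parse_mode`, `find_leaders`) and the host names in the usage comment are hypothetical.

```python
import socket


def four_letter_word(host, port, command="srvr", timeout=3.0):
    """Send a Zookeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(reply):
    """Return the value of the 'Mode:' line, or None if there is none."""
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def find_leaders(nodes, query=four_letter_word):
    """Return the (host, port) pairs whose reply reports 'Mode: leader'."""
    leaders = []
    for host, port in nodes:
        try:
            reply = query(host, port)
        except OSError as exc:  # covers refused connections and timeouts
            print(f"{host}:{port} unreachable: {exc}")
            continue
        if parse_mode(reply) == "leader":
            leaders.append((host, port))
    return leaders


# Usage with hypothetical host names:
#   find_leaders([("zk-1", 2181), ("zk-2", 2181), ("zk-3", 2181)])
```

A result with more than one entry for nodes of the same cluster matches the alert. A node that is unreachable, or whose reply has no `Mode:` line (for example because the command is not whitelisted), is not counted and should be checked by hand.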
Mitigation #
To mitigate the issue, follow these steps:
- Identify the cause of the multiple leader nodes, such as a node failure or network partition.
- If a node failure is suspected, try to restart the failed node or replace it with a new one.
- If a network partition is suspected, try to resolve the network connectivity issue.
- If the issue persists, consider restarting the entire Zookeeper cluster to force a clean leader election. Clients cannot use Zookeeper until a quorum has re-formed, so treat this as a last resort.
- Monitor the cluster closely to ensure that the issue is resolved and the cluster is stable.
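For the final monitoring step, one option is to require several consecutive healthy readings before treating the incident as resolved. The sketch below is an assumption about how that could look, not part of the alert rule. `count_leaders` is a hypothetical callable that returns the current number of leaders, for example from a Prometheus query or from per-node status checks.

```python
import time


def wait_for_single_leader(count_leaders, required_ok=5, interval=10.0,
                           max_checks=60, sleep=time.sleep):
    """Poll until exactly one leader is seen `required_ok` times in a row.

    count_leaders: hypothetical callable returning the current leader count.
    Returns True once the streak is reached, False if max_checks runs out.
    """
    streak = 0
    for check in range(max_checks):
        # Any reading other than exactly one leader resets the streak.
        streak = streak + 1 if count_leaders() == 1 else 0
        if streak >= required_ok:
            return True
        if check < max_checks - 1:
            sleep(interval)
    return False
```

Requiring a streak rather than a single good reading avoids declaring success during a brief window in which the cluster looks healthy while an election is still in progress.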
Remember to always exercise caution when making changes to a Zookeeper cluster, as it can affect the stability of the entire system.