ElasticsearchHealthyNodes
Missing node in Elasticsearch cluster
Alert Rule
alert: ElasticsearchHealthyNodes
annotations:
  description: |-
    Missing node in Elasticsearch cluster
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/prometheus-community-elasticsearch-exporter/elasticsearchhealthynodes/
  summary: Elasticsearch Healthy Nodes (instance {{ $labels.instance }})
expr: elasticsearch_cluster_health_number_of_nodes < 3
for: 0m
labels:
  severity: critical
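Meaning
The ElasticsearchHealthyNodes alert is triggered when elasticsearch_cluster_health_number_of_nodes, the node count reported by the cluster health API, falls below 3. Because the rule uses for: 0m, it fires on the first evaluation that sees the lower count. A low count means at least one node has left the cluster or cannot be reached, which can affect data replication, indexing, and search. The threshold of 3 is hard-coded, so the rule assumes a cluster that normally runs at least three nodes.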
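To make the alert condition concrete, here is a minimal Python sketch that applies the same `< 3` comparison to the result of a Prometheus instant query. The `/api/v1/query` endpoint and its response shape are part of the Prometheus HTTP API; the function names and the base URL are assumptions for illustration and are not part of this runbook.

```python
import json
import urllib.parse
import urllib.request

THRESHOLD = 3  # mirrors the `< 3` in the alert expression
METRIC = "elasticsearch_cluster_health_number_of_nodes"


def clusters_below_threshold(response, threshold=THRESHOLD):
    """Return (labels, node_count) for each series whose value is below threshold."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    firing = []
    for series in response["data"]["result"]:
        # An instant-vector sample is [unix_timestamp, "value-as-string"].
        node_count = float(series["value"][1])
        if node_count < threshold:
            firing.append((series["metric"], node_count))
    return firing


def query_prometheus(base_url, expr=METRIC):
    """Run an instant query. base_url is an assumption, e.g. http://localhost:9090."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Calling `clusters_below_threshold(query_prometheus("http://localhost:9090"))` would list the label sets that currently satisfy the alert expression, assuming Prometheus is reachable at that address without authentication.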
Impact
This alert is rated critical because a missing node can lead to:
- Data loss or inconsistencies
- Search functionality degradation
- Increased latency or timeouts
- Potential cluster instability or even complete cluster failure
Diagnosis
To diagnose the issue, follow these steps:
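- Check the cluster health with the Elasticsearch cluster health API (GET _cluster/health) or a tool like Kibana.
- Confirm that number_of_nodes in the response is indeed less than 3, and note the cluster status and the number of unassigned shards.
- Identify which nodes are missing (for example with GET _cat/nodes) and check their logs for errors such as crashes, out-of-memory conditions, or full disks.
- Check the network connectivity between nodes and ensure that they can reach each other on the transport port (9300 by default).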
- Review the Elasticsearch configuration and cluster settings to ensure they are correct and up-to-date.
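The first steps above can be sketched in Python as follows. This is a rough illustration, not a definitive tool: it assumes an Elasticsearch endpoint reachable without authentication or TLS, and `expected_names` is a hypothetical list of node names you maintain yourself. The `_cluster/health` and `_cat/nodes` endpoints are standard Elasticsearch APIs; the function names are made up for this example.

```python
import json
import urllib.request


def get_json(base_url, path):
    """GET a JSON document from Elasticsearch (no auth or TLS handling here)."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=10) as resp:
        return json.load(resp)


def summarize(health, cat_nodes, expected_names, threshold=3):
    """Combine cluster health and the node listing into a small report."""
    present = {node["name"] for node in cat_nodes}
    return {
        "status": health["status"],
        "number_of_nodes": health["number_of_nodes"],
        "below_threshold": health["number_of_nodes"] < threshold,
        "unassigned_shards": health["unassigned_shards"],
        # Nodes we expected to see that the cluster does not currently list.
        "missing_nodes": sorted(set(expected_names) - present),
    }


def diagnose(base_url, expected_names):
    """Fetch both documents and summarize them. base_url is an assumption."""
    health = get_json(base_url, "/_cluster/health")
    cat_nodes = get_json(base_url, "/_cat/nodes?format=json&h=name,ip,node.role,master")
    return summarize(health, cat_nodes, expected_names)
```

For example, `diagnose("http://localhost:9200", ["es-1", "es-2", "es-3"])` would report which of the three hypothetical nodes is absent, which narrows down where to look for logs.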
Mitigation
To mitigate this issue, follow these steps:
- Identify the unhealthy nodes and investigate the root cause of their issues.
- Bring the unhealthy nodes back online or replace them if necessary.
- Ensure that the cluster is properly configured, in particular the discovery settings (for example discovery.seed_hosts), so that restarted or replaced nodes can rejoin.
- Monitor the cluster health and node status to ensure that the issue does not recur.
- Consider increasing the number of nodes in the cluster to improve resilience and redundancy.
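After bringing nodes back, one way to confirm recovery is to let the cluster health API block until enough nodes have joined. The sketch below uses the API's `wait_for_nodes` and `timeout` parameters; the function names, the base URL, and the 60-second default are assumptions for illustration. Recent Elasticsearch versions answer a timed-out wait with HTTP 408 and the usual JSON body, so the sketch reads the body in that case; treat that handling as an assumption to verify against your version.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def recovery_url(base_url, min_nodes=3, timeout="60s"):
    """Build a cluster health URL that waits for at least min_nodes nodes."""
    params = urllib.parse.urlencode(
        {"wait_for_nodes": f">={min_nodes}", "timeout": timeout}
    )
    return f"{base_url}/_cluster/health?{params}"


def recovered(health, min_nodes=3):
    """True if the wait did not time out and enough nodes are present."""
    return not health.get("timed_out", False) and health["number_of_nodes"] >= min_nodes


def wait_for_nodes(base_url, min_nodes=3, timeout="60s", socket_timeout=90):
    """Block until the cluster reports min_nodes nodes or the wait times out."""
    url = recovery_url(base_url, min_nodes, timeout)
    try:
        # socket_timeout should exceed the server-side timeout above.
        with urllib.request.urlopen(url, timeout=socket_timeout) as resp:
            health = json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code != 408:
            raise
        health = json.load(err)  # a timed-out wait still carries a health body
    return recovered(health, min_nodes), health
```

A call such as `wait_for_nodes("http://localhost:9200")` returns a boolean plus the raw health document, so the caller can also inspect the cluster status and unassigned shards before closing the incident.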
For more detailed instructions and troubleshooting steps, refer to the runbook located at https://github.com/srerun/prometheus-alerts/blob/main/content/runbooks/prometheus-community-elasticsearch-exporter/ElasticsearchHealthyNodes.md.