ElasticsearchHighQueryLatency #
The query latency on Elasticsearch cluster is higher than the threshold.
Alert Rule
alert: ElasticsearchHighQueryLatency
annotations:
description: |-
The query latency on Elasticsearch cluster is higher than the threshold.
VALUE = {{ $value }}
LABELS = {{ $labels }}
runbook: https://srerun.github.io/prometheus-alerts/runbooks/prometheus-community-elasticsearch-exporter/elasticsearchhighquerylatency/
summary: Elasticsearch High Query Latency (instance {{ $labels.instance }})
expr: elasticsearch_indices_search_fetch_time_seconds / elasticsearch_indices_search_fetch_total
> 1
for: 5m
labels:
severity: warning
Here is a runbook for the ElasticsearchHighQueryLatency alert:
Meaning #
The ElasticsearchHighQueryLatency alert is triggered when the average query latency on an Elasticsearch cluster exceeds 1 second. This is calculated by dividing the total search fetch time (in seconds) by the total number of search fetch operations.
Impact #
High query latency on Elasticsearch can have a significant impact on the performance and responsiveness of dependent applications and services. This can lead to:
- Slow query response times, affecting user experience
- Increased load on the Elasticsearch cluster, potentially leading to additional latency or even node failures
- Potential impact on business-critical operations, such as search, logging, and analytics
Diagnosis #
To diagnose the root cause of high query latency, follow these steps:
- Check the Elasticsearch cluster health: Verify that the cluster is healthy and all nodes are available.
- Investigate query patterns: Analyze query logs to identify any unusual or heavy query patterns that may be contributing to the latency.
- Check indexing and shard configuration: Verify that indexing and shard configurations are optimal for the workload.
- Review cluster resource utilization: Monitor CPU, memory, and disk usage to identify any resource bottlenecks.
- Check for any ongoing maintenance or upgrades: Verify that there are no ongoing maintenance or upgrades that may be affecting query performance.
Mitigation #
To mitigate high query latency, follow these steps:
- Optimize indexing and shard configuration: Review and adjust indexing and shard configurations to improve query performance.
- Adjust query patterns: Optimize query patterns to reduce the load on the Elasticsearch cluster.
- Add more resources: Consider adding more resources (e.g., nodes, CPU, memory) to the Elasticsearch cluster to improve query performance.
- Implement query caching: Implement query caching to reduce the load on the Elasticsearch cluster.
- Consider upgrading Elasticsearch: If the cluster is running an older version of Elasticsearch, consider upgrading to a newer version that may have performance improvements.
Remember to investigate and address the root cause of the high query latency to prevent similar issues in the future.