NginxLatencyHigh #
Nginx p99 latency is higher than 3 seconds
Alert Rule
alert: NginxLatencyHigh
annotations:
  description: |-
    Nginx p99 latency is higher than 3 seconds
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/knyar-nginx-exporter/nginxlatencyhigh/
  summary: Nginx latency high (instance {{ $labels.instance }})
expr: histogram_quantile(0.99, sum(rate(nginx_http_request_duration_seconds_bucket[2m])) by (host, node, le)) > 3
for: 2m
labels:
  severity: warning
Here is a runbook for the Prometheus alert rule NginxLatencyHigh:
Meaning #
The NginxLatencyHigh alert fires when the 99th percentile of Nginx request latency, estimated from request rates over a 2-minute window and grouped by host and node, stays above 3 seconds for 2 minutes. In other words, roughly 1% or more of requests are taking longer than 3 seconds, which degrades the responsiveness and overall user experience of the application.
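To make the p99 estimate concrete, here is a minimal Python sketch of the linear interpolation that `histogram_quantile` performs over cumulative `le` buckets. It is a simplified illustration, not the Prometheus implementation, and the bucket boundaries and counts are made-up values rather than the exporter's actual configuration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs that
    includes one pair with upper_bound == math.inf, mirroring the `le` label.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the +Inf bucket, so the best
                # available answer is the highest finite bound.
                return prev_bound
            # Interpolate linearly between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical per-second request rates per bucket (assumed values).
example = [(0.5, 90), (1, 96), (2.5, 98), (5, 100), (math.inf, 100)]
p99 = histogram_quantile(0.99, example)
print(p99, p99 > 3)
```

With these assumed buckets the estimate lands between the 2.5 s and 5 s bounds, so the `> 3` condition would hold. Because the result is interpolated, its precision depends on how finely the buckets are spaced around the 3-second threshold.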
Impact #
High Nginx latency can have several negative impacts:
- Slow response times can lead to frustrated users and decreased engagement.
- Increased latency can cause requests to time out, resulting in errors and failed transactions.
- Slow requests hold connections open longer, which increases the load on Nginx and its upstream services and can lead to resource exhaustion and further performance degradation.
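The link between latency and load can be illustrated with Little's law, which says that the average number of in-flight requests equals the arrival rate multiplied by the average time each request spends in the system. The sketch below uses hypothetical traffic numbers and mean latency (not the p99 used by the alert) purely to show the order of magnitude.

```python
def in_flight_requests(requests_per_second, mean_latency_seconds):
    """Little's law: average concurrency = arrival rate * mean time in system."""
    return requests_per_second * mean_latency_seconds


# Hypothetical steady traffic of 500 requests per second (assumed value).
normal = in_flight_requests(500, 0.2)    # healthy mean latency
degraded = in_flight_requests(500, 3.0)  # mean latency near the alert threshold
print(normal, degraded)
```

At the same request rate, the number of connections held open grows in proportion to latency, which is why a latency problem can turn into connection or worker exhaustion even when traffic has not increased.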
Diagnosis #
To diagnose the root cause of high Nginx latency, follow these steps:
- Check the Nginx error logs for any errors or warnings that may indicate the source of the latency.
- Review the Nginx configuration for performance-related settings such as worker processes, keepalive timeouts, and buffer sizes.
- Investigate the performance of upstream services and databases to ensure they are not contributing to the latency.
- Check the system metrics (e.g., CPU, memory, disk usage) to ensure that the Nginx server or underlying infrastructure is not resource-constrained.
- Analyze the request patterns and traffic trends to identify any unusual or anomalous behavior.
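One way to separate time spent in Nginx from time spent in upstream services is to compare the `$request_time` and `$upstream_response_time` variables in the access log. The sketch below assumes a custom `log_format` that appends `rt=$request_time urt="$upstream_response_time"` to each line; that format, the sample lines, and the 90% cutoff are assumptions for illustration only.

```python
import re

# Matches the assumed log suffix: rt=<seconds> urt="<seconds or ->"
LINE_RE = re.compile(r'rt=(?P<rt>[\d.]+) urt="(?P<urt>[^"]*)"')


def split_latency(line):
    """Return (request_time, upstream_time) in seconds, or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    request_time = float(match.group("rt"))
    # The upstream field is "-" when no upstream was contacted and may hold
    # several comma- or colon-separated values when more than one was tried.
    parts = [p.strip() for p in re.split(r"[,:]", match.group("urt"))]
    upstream_time = sum(float(p) for p in parts if p not in ("", "-"))
    return request_time, upstream_time


def summarize(lines, threshold=3.0):
    """Count slow requests and how many of them were dominated by upstream time."""
    timings = [t for t in map(split_latency, lines) if t is not None]
    slow = [t for t in timings if t[0] > threshold]
    upstream_bound = [t for t in slow if t[1] >= 0.9 * t[0]]
    return {"slow": len(slow), "upstream_bound": len(upstream_bound)}


# Hypothetical log lines in the assumed format.
sample = [
    'GET /api/orders 200 rt=4.210 urt="4.195"',
    'GET /static/app.js 200 rt=3.500 urt="-"',
    'GET /api/users 200 rt=0.120 urt="0.110"',
]
print(summarize(sample))
```

If most slow requests are upstream-bound, the investigation should focus on the backing services; if not, Nginx itself, slow clients, or host resources are more likely candidates.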
Mitigation #
To mitigate high Nginx latency, follow these steps:
- Optimize Nginx configuration: Review the Nginx configuration and tune performance-related parameters such as worker processes (worker_processes), keepalive timeouts (keepalive_timeout), and buffer sizes.
- Scale Nginx horizontally: If the Nginx server is resource-constrained, consider scaling out to additional instances to distribute the load.
- Optimize upstream services: Verify that upstream services and databases are optimized for performance and can handle the request load.
- Implement caching: Consider implementing caching mechanisms to reduce the load on Nginx and upstream services.
- Analyze and optimize request patterns: Identify and optimize request patterns to reduce latency and improve overall performance.
Remember to investigate and address the root cause of the high latency to prevent similar issues in the future.