IstioLatency99Percentile #
The slowest 1% of Istio requests take longer than 1000ms.
Alert Rule

```yaml
alert: IstioLatency99Percentile
annotations:
  description: |-
    Istio 1% slowest requests are longer than 1000ms.
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/istio-internal/istiolatency99percentile/
  summary: Istio latency 99 percentile (instance {{ $labels.instance }})
expr: |
  histogram_quantile(0.99,
    sum(rate(istio_request_duration_milliseconds_bucket[1m]))
    by (destination_canonical_service, destination_workload_namespace,
        source_canonical_service, source_workload_namespace, le)) > 1000
for: 1m
labels:
  severity: warning
```
Meaning #
The IstioLatency99Percentile alert is triggered when the 99th percentile of Istio request duration exceeds 1000ms for a given source/destination service and namespace pair. This indicates that the slowest 1% of requests are experiencing high latency, which can impact the performance and responsiveness of the application.
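To make the threshold concrete, the sketch below mimics (in simplified form, as an illustration rather than Prometheus's exact implementation) how `histogram_quantile()` estimates the p99 from cumulative bucket counts: find the bucket that contains the 99th-percentile rank and interpolate linearly inside it. The bucket bounds and counts are made-up example values.

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count), sorted by bound;
    the last entry is the +Inf bucket and holds the total count.
    """
    total = buckets[-1][1]
    rank = q * total
    prev_le, prev_count = 0.0, 0.0  # lower bound of the first bucket taken as 0
    for le, count in buckets:
        if count >= rank:
            if le == float("inf"):
                return prev_le  # rank falls in the open-ended bucket
            # linear interpolation within the bucket containing the rank
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count

# Example: cumulative counts for buckets with bounds 100ms, 500ms, 1000ms, +Inf.
# Rank 0.99 * 1000 = 990 falls inside the 500-1000ms bucket, so the estimated
# p99 lands between 500 and 1000 -- close enough to the 1000ms threshold to
# show why a small shift in the tail can fire this alert.
buckets = [(100, 800), (500, 950), (1000, 998), (float("inf"), 1000)]
p99 = histogram_quantile(0.99, buckets)
```

Note that the result is an interpolated estimate, so its accuracy depends on how finely the `le` buckets are spaced around the 1000ms threshold.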
Impact #
- Slow request processing can lead to a poor user experience, decreased system throughput, and increased error rates.
- High latency can also cause cascading failures, as dependent services may timeout or become unavailable.
- Prolonged latency issues can result in revenue loss, customer dissatisfaction, and damage to the organization’s reputation.
Diagnosis #
To diagnose the root cause of high latency, follow these steps:
- Check request traffic patterns: Analyze the request rate and distribution to identify any sudden changes or spikes.
- Investigate service dependencies: Verify that dependent services are operating within expected latency bounds.
- Review Istio configuration: Ensure that Istio is properly configured, and that service mesh metrics are being collected correctly.
- Examine pod and container logs: Review logs for errors, warnings, or other indicators of issues that may be contributing to high latency.
- Check for resource constraints: Verify that pods have sufficient resources (CPU, memory, etc.) to handle incoming requests.
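The traffic and dependency checks above can be run directly in Prometheus. These are illustrative queries assuming the standard Istio metric names and labels (`istio_requests_total`, `istio_request_duration_milliseconds_bucket`, `destination_canonical_service`, `response_code`); adjust label names to match your mesh configuration.

```promql
# Request rate per destination service -- look for sudden spikes:
sum(rate(istio_requests_total[5m])) by (destination_canonical_service)

# Compare p50 to p99 to tell broad slowness from a pure tail-latency problem:
histogram_quantile(0.50, sum(rate(istio_request_duration_milliseconds_bucket[5m]))
  by (destination_canonical_service, le))
histogram_quantile(0.99, sum(rate(istio_request_duration_milliseconds_bucket[5m]))
  by (destination_canonical_service, le))

# 5xx error rate on the affected service -- latency and errors often rise together:
sum(rate(istio_requests_total{response_code=~"5.."}[5m])) by (destination_canonical_service)
```

If p50 is also elevated, the whole service is slow (look at resources and dependencies); if only p99 is elevated, focus on outliers such as slow pods, GC pauses, or retries.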
Mitigation #
To mitigate high latency, take the following steps:
- Scale out services: Temporarily increase the number of replicas for the affected service to handle the increased request volume.
- Optimize service configuration: Review and optimize service configuration, such as timeouts, retries, and circuit breakers.
- Improve resource allocation: Ensure that pods have sufficient resources to handle incoming requests.
- Apply caching or content compression: Implement caching or content compression to reduce the load on services and improve response times.
- Investigate root cause: Identify and address the underlying cause of high latency, which may require fixing application code, database queries, or other system components.
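The timeout, retry, and circuit-breaker tuning described above is typically done through Istio traffic-policy resources. The following is a minimal sketch, not a drop-in fix: the host, namespace, and threshold values are placeholders to adapt to the affected service.

```yaml
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: example-service-latency-policy   # placeholder name
  namespace: example-namespace           # placeholder namespace
spec:
  host: example-service                  # the affected destination service
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 100     # cap queuing before requests fail fast
        maxRequestsPerConnection: 10
    outlierDetection:                    # circuit breaker: eject slow/erroring pods
      consecutive5xxErrors: 5
      interval: 30s
      baseEjectionTime: 60s
      maxEjectionPercent: 50
```

Pair this with a `VirtualService` timeout and retry policy where appropriate, and scale replicas (for example via `kubectl scale deployment`) as a stopgap while the root cause is investigated.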