IstioLowTotalRequestRate #
Global request rate in the service mesh is unusually low.
Alert Rule
alert: IstioLowTotalRequestRate
annotations:
  description: |-
    Global request rate in the service mesh is unusually low.
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/istio-internal/istiolowtotalrequestrate/
  summary: Istio low total request rate (instance {{ $labels.instance }})
expr: sum(rate(istio_requests_total{reporter="destination"}[5m])) < 100
for: 2m
labels:
  severity: warning
Meaning #
The IstioLowTotalRequestRate alert fires when the total request rate across the service mesh, as reported by destination-side proxies (reporter="destination"), stays below 100 requests per second for at least 2 minutes. The rate is a per-second average computed over a 5-minute window. This means far less traffic than expected is flowing through the mesh, which may point to a problem with the application, the infrastructure, or the Istio configuration.
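To make the threshold concrete, here is a minimal Python sketch of how a per-second rate is derived from counter samples and compared with the alert threshold. It is a simplification of PromQL's rate(): it handles counter resets but omits Prometheus's extrapolation to the window boundaries. The function name and the sample values are hypothetical and exist only for illustration.

```python
from typing import Sequence, Tuple

THRESHOLD_RPS = 100.0  # taken from the alert expression: ... < 100


def per_second_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Approximate rate() over (timestamp_seconds, counter_value) samples.

    Simplified on purpose: counter resets are handled, but the
    extrapolation that Prometheus applies at window edges is not.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A counter that goes down was reset, so it restarted from zero.
        increase += (cur - prev) if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return increase / elapsed


# Made-up mesh-wide counter: 27,000 requests counted over 300 seconds.
window = [(0.0, 1_000_000.0), (150.0, 1_013_500.0), (300.0, 1_027_000.0)]
rate = per_second_rate(window)
print(rate, rate < THRESHOLD_RPS)  # 90.0 True
```

In this example 27,000 requests in 5 minutes is 90 requests per second, which is below the threshold. The alert would only fire if that condition held for the full 2 minutes set by `for`.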
Impact #
A low total request rate is usually a symptom of a problem elsewhere in the system. It can accompany or lead to:
- Reduced application performance or availability
- Impact on business KPIs, such as revenue or user engagement
- Increased latency or errors for users
- Potential security risks if the decrease in traffic is due to a misconfigured firewall or network policy
Diagnosis #
To diagnose the cause of the low total request rate, follow these steps:
- Check the Istio dashboards (for example, the Istio Mesh Dashboard in Grafana) for errors or warnings related to request processing.
- Verify that the application and its dependencies are functioning correctly.
- Review the networking and firewall configurations to ensure they are not blocking traffic.
- Check the load balancer and ingress gateway configurations to ensure they are functioning correctly.
- Analyze the request traffic patterns to identify any unusual changes or trends.
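One way to approach the traffic-pattern analysis is to break the alert expression down by a label and to compare it with the same window one day earlier. The Python sketch below only builds the PromQL strings and the Prometheus instant-query URL; it does not send any request. The helper names and the Prometheus address are hypothetical. The labels shown in the usage lines (destination_workload_namespace, response_code) are standard Istio metric labels, but confirm that your telemetry configuration has not removed or renamed them.

```python
from urllib.parse import urlencode

# Same selector as the alert expression.
BASE_SELECTOR = 'istio_requests_total{reporter="destination"}'


def breakdown_query(label: str, window: str = "5m") -> str:
    """Per-second request rate split by one label, e.g. namespace."""
    return f"sum by ({label}) (rate({BASE_SELECTOR}[{window}]))"


def day_over_day_query(window: str = "5m") -> str:
    """Ratio of the current mesh-wide rate to the rate 24 hours ago."""
    current = f"sum(rate({BASE_SELECTOR}[{window}]))"
    previous = f"sum(rate({BASE_SELECTOR}[{window}] offset 1d))"
    return f"{current} / {previous}"


def instant_query_url(prometheus_url: str, query: str) -> str:
    """URL for the Prometheus HTTP API instant-query endpoint."""
    params = urlencode({"query": query})
    return f"{prometheus_url.rstrip('/')}/api/v1/query?{params}"


# Hypothetical Prometheus address; replace with your own.
PROMETHEUS_URL = "http://prometheus.example:9090"

for q in (
    breakdown_query("destination_workload_namespace"),
    breakdown_query("response_code"),
    day_over_day_query(),
):
    print(instant_query_url(PROMETHEUS_URL, q))
```

A namespace whose rate has dropped to near zero narrows the search to specific workloads, while a day-over-day ratio close to 1 suggests the low rate may simply be normal for this time of day.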
Mitigation #
To mitigate the impact of the low total request rate, follow these steps:
- Investigate and resolve any errors or warnings related to request processing in Istio.
- Verify that the application and its dependencies are properly configured and scaled.
- Adjust the networking and firewall configurations to allow traffic to flow properly.
- Review and adjust the load balancer and ingress gateway configurations as needed.
- If traffic dropped because workloads or gateways are unavailable or saturated, consider scaling them up or out to restore capacity.
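After applying a fix, it can help to confirm that the mesh-wide rate is back above the alert threshold. The Python sketch below parses the JSON body returned by a Prometheus instant query for the alert's sum(rate(...)) expression and compares the value with the threshold. The function names and the sample response are hypothetical; the response shape follows the Prometheus HTTP API instant-query format.

```python
import json

THRESHOLD_RPS = 100.0  # taken from the alert expression: ... < 100


def current_rate(response_body: str) -> float:
    """Extract the single value from an instant-query response body."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    result = payload["data"]["result"]
    if not result:
        # An empty vector means no matching series reported any data.
        return 0.0
    # Each instant-vector sample is [unix_timestamp, "value-as-string"].
    return float(result[0]["value"][1])


def has_recovered(rate: float, threshold: float = THRESHOLD_RPS) -> bool:
    """True once the rate is no longer below the alert threshold."""
    return rate >= threshold


# Made-up response body for illustration.
sample = (
    '{"status":"success","data":{"resultType":"vector",'
    '"result":[{"metric":{},"value":[1700000000,"142.5"]}]}}'
)
rate = current_rate(sample)
print(rate, has_recovered(rate))  # 142.5 True
```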
Additional resources:
- For more information on troubleshooting Istio, refer to the Istio documentation: https://istio.io/latest/docs/ops/troubleshooting/
- For more information on Prometheus alerting, refer to the Prometheus documentation: https://prometheus.io/docs/alerting/latest/