KubernetesApiServerLatency #
Kubernetes API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.
Alert Rule #

```yaml
alert: KubernetesApiServerLatency
annotations:
  description: |-
    Kubernetes API server has a 99th percentile latency of {{ $value }} seconds for {{ $labels.verb }} {{ $labels.resource }}.
      VALUE = {{ $value }}
      LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/kubestate-exporter/kubernetesapiserverlatency/
  summary: Kubernetes API server latency (instance {{ $labels.instance }})
expr: histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket{verb!~"(?:CONNECT|WATCHLIST|WATCH|PROXY)"}[10m])) WITHOUT (subresource)) > 1
for: 2m
labels:
  severity: warning
```
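To catch syntax mistakes before Prometheus loads the rule, you can validate the file with `promtool`. A minimal sketch, assuming the alert is saved inside a standard rule-group file (the file name here is illustrative):

```bash
# Validate rule syntax offline; promtool ships with Prometheus.
# Assumes the alert above is wrapped in the usual rule-group structure:
#   groups:
#     - name: kubernetes-apiserver
#       rules:
#         - alert: KubernetesApiServerLatency
#           ...
promtool check rules kubernetes-apiserver.rules.yml
```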
Meaning #
The KubernetesApiServerLatency alert fires when the 99th percentile latency of Kubernetes API server requests, measured over a 10-minute window and excluding long-running verbs (CONNECT, WATCH, WATCHLIST, PROXY), exceeds 1 second for at least 2 minutes. This means that a significant share of API requests are taking longer than expected to process, which can degrade the performance and responsiveness of the entire cluster.
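Since the alert's expression only aggregates away the `subresource` label, a variant of the same query grouped by verb and resource shows which call paths are slow. A sketch using the Prometheus HTTP API; the server URL is a placeholder for your own Prometheus endpoint, and `jq` is assumed to be installed:

```bash
# Break the p99 latency down by verb and resource to find the slow call paths.
PROM_URL="http://prometheus.example.com:9090"   # placeholder: your Prometheus
QUERY='histogram_quantile(0.99, sum by (verb, resource, le) (rate(apiserver_request_duration_seconds_bucket{verb!~"(?:CONNECT|WATCHLIST|WATCH|PROXY)"}[10m])))'

curl -sG "${PROM_URL}/api/v1/query" --data-urlencode "query=${QUERY}" \
  | jq -r '.data.result[] | "\(.metric.verb) \(.metric.resource): \(.value[1])s"'
```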
Impact #
A high latency in the Kubernetes API server can have several impacts on the cluster and its users:
- Increased response times for kubectl commands and other API requests
- Delays in pod scheduling and deployment
- Impact on the overall performance and responsiveness of the cluster
- Frustration and decreased productivity for cluster users
Diagnosis #
To diagnose the root cause of the high latency, follow these steps:
- Check the API server logs for errors or unusual patterns
- Verify that the API server is not overloaded or under-provisioned
- Check the latency of specific API requests by timing a raw call, e.g. `kubectl get --raw /apis/<group>/<version>/<resource>`
- Investigate any recent changes or deployments that may have introduced the latency
- Use tools like `kubectl top` or `kubectl describe` to monitor API server resource utilization (see the sketch after this list)
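The commands below sketch these checks in shell form. The `component=kube-apiserver` label selector assumes a kubeadm-style cluster where the API server runs as a static pod in `kube-system`; managed control planes (EKS, GKE, AKS) do not expose these pods:

```bash
# 1. Scan recent API server logs for errors or unusual patterns
kubectl -n kube-system logs -l component=kube-apiserver --tail=200 \
  | grep -iE 'error|timeout|throttl'

# 2. Time a specific raw API request (substitute the slow group/version/resource)
time kubectl get --raw /apis/apps/v1/deployments > /dev/null

# 3. Check CPU/memory of the API server pods (requires metrics-server)
kubectl -n kube-system top pod -l component=kube-apiserver

# 4. Look for restarts, resource pressure, or recent events
kubectl -n kube-system describe pod -l component=kube-apiserver
```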
Mitigation #
To mitigate the high latency, follow these steps:
- Investigate and resolve any underlying issues: Address any errors, overload, or under-provisioning of the API server
- Optimize API server configuration: Adjust API server settings to improve performance, such as raising the in-flight request limits or adjusting the request timeout (see the sketch after this list)
- Reduce load on the API server: Favor client-side caching, such as shared informers with watches instead of polling clients, and audit controllers or CI jobs that issue frequent, expensive LIST requests
- Monitor and alert on API server performance: Set up additional monitoring and alerting to quickly detect and respond to API server performance issues
- Consider scaling or upgrading the API server: If the latency persists, consider scaling or upgrading the API server to improve performance and responsiveness.
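As a starting point for the configuration step above, the flags below are the ones most commonly tuned for request latency. A sketch for a kubeadm-style control plane; the manifest path assumes static pods, these settings are provider-controlled on managed platforms, and the values shown are illustrative, not recommendations:

```bash
# Inspect the current tuning flags on a control-plane node
grep -E 'max-requests-inflight|max-mutating-requests-inflight|request-timeout' \
  /etc/kubernetes/manifests/kube-apiserver.yaml

# Flags commonly adjusted for latency (illustrative values):
#   --max-requests-inflight=800           # concurrent read (non-mutating) requests
#   --max-mutating-requests-inflight=400  # concurrent write (mutating) requests
#   --request-timeout=1m0s                # per-request timeout
```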