KubernetesHpaScaleMaximum #
HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} has hit maximum number of desired pods
Alert Rule

```yaml
alert: KubernetesHpaScaleMaximum
annotations:
  description: |-
    HPA {{ $labels.namespace }}/{{ $labels.horizontalpodautoscaler }} has hit maximum number of desired pods
      VALUE = {{ $value }}
      LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/kubestate-exporter/kuberneteshpascalemaximum/
  summary: Kubernetes HPA scale maximum (instance {{ $labels.instance }})
expr: (kube_horizontalpodautoscaler_status_desired_replicas >= kube_horizontalpodautoscaler_spec_max_replicas)
  and (kube_horizontalpodautoscaler_spec_max_replicas > 1) and (kube_horizontalpodautoscaler_spec_min_replicas
  != kube_horizontalpodautoscaler_spec_max_replicas)
for: 2m
labels:
  severity: info
```
Meaning #
The KubernetesHpaScaleMaximum alert is triggered when a Horizontal Pod Autoscaler (HPA) has reached its maximum allowed number of replicas, and the minimum and maximum replica counts are not equal. This means that the HPA has scaled up to the maximum allowed number of pods, and further scaling up is not possible.
Impact #
When an HPA reaches its maximum scale, it may lead to:
- Increased latency or errors in the application due to insufficient resources
- Inability to handle increased traffic or demand, potentially leading to downtime or lost revenue
- Inefficient resource utilization, as the HPA is unable to scale down when not needed
Diagnosis #
To diagnose the issue, follow these steps:
- Check the HPA configuration: verify that the `maxReplicas` and `minReplicas` values are correctly set in the HPA configuration.
- Check the current pod count: verify the number of pods currently running in the deployment/replicaset.
- Check the HPA status: verify the current status of the HPA, including the `desiredReplicas` value.
- Check the deployment/replicaset resource usage: verify the current resource usage (e.g., CPU, memory) of the deployment/replicaset.
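The checks above can be run with `kubectl`; the HPA name `my-app`, namespace `my-namespace`, and label selector `app=my-app` below are placeholders for your own workload:

```shell
# Inspect the HPA spec and status: minReplicas, maxReplicas, current/desired replicas, events
kubectl describe hpa my-app -n my-namespace

# Current vs. desired replica counts at a glance
kubectl get hpa my-app -n my-namespace

# Pods currently running for the scaled workload
kubectl get pods -n my-namespace -l app=my-app

# Current CPU/memory usage per pod (requires metrics-server)
kubectl top pods -n my-namespace -l app=my-app
```

If `desiredReplicas` equals `maxReplicas` while resource usage remains above the HPA's target, the autoscaler is saturated and the alert is expected to fire.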
Mitigation #
To mitigate the issue, follow these steps:
- Increase the `maxReplicas` value: if the current `maxReplicas` value is too low, consider increasing it to allow further scaling.
- Check for resource bottlenecks: identify and address any resource bottlenecks (e.g., insufficient CPU, memory, or storage) that may be preventing the HPA from scaling further.
- Optimize deployment/replicaset configuration: Review and optimize the deployment/replicaset configuration to ensure efficient resource utilization.
- Consider cluster autoscaling: If the HPA is consistently reaching its maximum scale, consider enabling cluster autoscaling to dynamically adjust the cluster size based on demand.
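If raising the replica ceiling is the appropriate fix, the `maxReplicas` value can be changed in place with `kubectl`; the HPA name `my-app`, namespace `my-namespace`, and new value `15` below are placeholders, not recommendations:

```shell
# Raise the replica ceiling on the HPA (adjust name, namespace, and value)
kubectl patch hpa my-app -n my-namespace --type merge -p '{"spec":{"maxReplicas":15}}'

# Alternatively, edit the HPA manifest interactively
kubectl edit hpa my-app -n my-namespace
```

Prefer updating the HPA's source manifest (e.g. in version control) over a live patch, so the change survives the next deploy.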
For more information, refer to the Kubernetes HPA Scale Maximum Runbook.