ContainerHighCpuUtilization

Container CPU utilization is above 80%

Alert Rule

alert: ContainerHighCpuUtilization
annotations:
  description: |-
    Container CPU utilization is above 80%
      VALUE = {{ $value }}
      LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/google-cadvisor/containerhighcpuutilization/
  summary: Container High CPU utilization (instance {{ $labels.instance }})
expr: (sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod, container)
  / sum(container_spec_cpu_quota{container!=""}/container_spec_cpu_period{container!=""})
  by (pod, container) * 100) > 80
for: 2m
labels:
  severity: warning


Meaning

This alert fires when a container’s CPU usage, measured as a rate over a 5-minute window, stays above 80% of its CPU limit (container_spec_cpu_quota divided by container_spec_cpu_period) for at least 2 minutes. Sustained usage this close to the limit may lead to CPU throttling, slower response times, or an unresponsive container.
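The arithmetic behind the alert expression can be sketched for a single container as follows. This is a simplified illustration rather than how Prometheus evaluates the rule: `rate()` is approximated as the counter increase divided by the window length, and the function name and sample numbers are hypothetical.

```python
def cpu_utilization_percent(usage_seconds_delta, window_seconds, quota_us, period_us):
    """Approximate the alert expression for one container.

    usage_seconds_delta: increase in container_cpu_usage_seconds_total over the window
    window_seconds:      length of the rate window (300 for [5m])
    quota_us, period_us: container_spec_cpu_quota and container_spec_cpu_period
    """
    if window_seconds <= 0 or quota_us <= 0 or period_us <= 0:
        raise ValueError("window, quota, and period must be positive")
    cores_used = usage_seconds_delta / window_seconds  # average cores consumed
    cores_allowed = quota_us / period_us               # CPU limit expressed in cores
    return cores_used / cores_allowed * 100


# Hypothetical sample: 270 CPU-seconds consumed in a 300-second window,
# with a quota of 100000 us per 100000 us period (a limit of one core).
sample = cpu_utilization_percent(270, 300, 100_000, 100_000)
print(f"{sample:.1f}% of limit, above threshold: {sample > 80}")
```

The condition must then hold continuously for the rule's `for: 2m` duration before the alert fires.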

Impact

High CPU utilization can have several consequences, including:

Slower response times: A CPU-bound container takes longer to serve each request, so responses are delayed.
Increased latency: Once the container exhausts its CPU quota within a scheduling period, it is throttled until the next period, which adds latency to in-flight requests.
Resource starvation: On a node with little spare CPU, a busy container can leave less CPU time for other containers or processes, leading to performance issues or failures.
Increased risk of crashes: Prolonged CPU saturation can make the container unresponsive, for example when health checks time out, and may lead to restarts.

Diagnosis

To diagnose the root cause of high CPU utilization, follow these steps:

Identify the affected container: Check the pod and container labels in the alert to determine which container is experiencing high CPU utilization.
Check container logs: Review the container logs for errors, retry loops, or bursts of activity that coincide with the start of the alert.
Check container resource usage: Use tools like kubectl top (for example, kubectl top pod --containers) or docker stats to check the container’s CPU usage, memory usage, and other resource utilization.
Analyze container configuration: Review the container’s CPU request and limit, along with application settings such as worker or thread counts, to confirm they match the workload.
Check system resource usage: Verify that the underlying node has sufficient resources (e.g., CPU, memory) to support the container’s workload.
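When comparing observed usage against the configured limit by hand, the values are usually Kubernetes CPU quantities such as `450m` or `2`. The sketch below converts them to cores and reports usage as a percentage of the limit. The helper names and sample values are hypothetical, and only plain numbers and the `n`, `u`, and `m` suffixes are handled.

```python
from decimal import Decimal

# Scale factors for the sub-unit suffixes used in Kubernetes CPU quantities.
_SUFFIXES = {"n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3")}


def parse_cpu(quantity: str) -> float:
    """Convert a CPU quantity such as '250m' or '2' to a number of cores."""
    quantity = quantity.strip()
    if not quantity:
        raise ValueError("empty CPU quantity")
    scale = _SUFFIXES.get(quantity[-1])
    if scale is None:
        return float(Decimal(quantity))  # plain number of cores
    return float(Decimal(quantity[:-1]) * scale)


def percent_of_limit(usage: str, limit: str) -> float:
    """Express observed CPU usage as a percentage of the configured CPU limit."""
    return parse_cpu(usage) / parse_cpu(limit) * 100


# Hypothetical values: 450m observed usage against a 500m CPU limit.
print(f"{percent_of_limit('450m', '500m'):.0f}% of limit")
```

A result above 80 for a sustained period matches the condition this alert checks.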

Mitigation

To mitigate high CPU utilization, follow these steps:

Optimize container configuration: Tune application settings that drive CPU usage, such as worker or thread counts and overly frequent background jobs, and profile the application to find hot code paths.
Increase container resources: If the workload legitimately needs more CPU, raise the container’s CPU limit (and request) so that normal usage sits comfortably below it.
Implement rate limiting: Implement rate limiting or throttling to prevent excessive requests that can cause high CPU utilization.
Scale container instances: If the container is experiencing high CPU utilization due to high traffic, consider scaling the number of container instances to distribute the workload.
Investigate underlying system issues: If high CPU utilization persists, investigate underlying system issues, such as hardware failures or system configuration problems.
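For the resource and scaling steps above, a rough sizing calculation can help pick a starting point. The sketch below is an assumption-laden illustration: it supposes load is spread evenly across replicas and stays constant, and the function names and numbers are hypothetical. The replica formula uses the same ratio described in the Kubernetes Horizontal Pod Autoscaler documentation.

```python
import math


def required_cpu_limit(usage_cores: float, target_percent: float) -> float:
    """CPU limit (in cores) at which current usage equals target_percent of the limit."""
    if usage_cores < 0 or not 0 < target_percent <= 100:
        raise ValueError("usage must be >= 0 and target must be in (0, 100]")
    return usage_cores / (target_percent / 100)


def required_replicas(current_replicas: int, current_percent: float,
                      target_percent: float) -> int:
    """Replica count that brings per-replica utilization down to target_percent,
    assuming the total load is constant and evenly distributed."""
    if current_replicas < 1 or current_percent < 0 or target_percent <= 0:
        raise ValueError("replicas must be >= 1 and percentages must be positive")
    return max(1, math.ceil(current_replicas * current_percent / target_percent))


# Hypothetical example: 3 replicas, each using 0.9 cores (90% of a 1-core limit),
# with a target of 60% utilization.
print(required_cpu_limit(0.9, 60))    # limit per container if the replica count stays at 3
print(required_replicas(3, 90, 60))   # replicas needed if the limit stays at 1 core
```

Either result is only a first estimate; observe the workload after the change and adjust.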

Remember to take a proactive approach to monitoring and addressing high CPU utilization to prevent performance issues and ensure the reliability of your system.