ProxmoxBackupHostHighCpuLoad #
CPU load is > 80%
Alert Rule
alert: ProxmoxBackupHostHighCpuLoad
annotations:
  description: |-
    CPU load is > 80%
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/pbs-exporter/proxmoxbackuphosthighcpuload/
  summary: Host high CPU load (id {{ $labels.id }})
expr: avg_over_time(pbs_host_cpu_usage[2m]) > 0.8
for: 10m
labels:
  severity: warning
Meaning #
The ProxmoxBackupHostHighCpuLoad alert fires when the 2-minute average of pbs_host_cpu_usage stays above 0.8 (80%) for 10 consecutive minutes, as required by the for: 10m clause. A short spike that clears within that time leaves the alert pending and never fires it. A firing alert therefore indicates sustained high CPU load on the Proxmox backup host, which can degrade performance and disrupt backup operations.
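The interaction between the 2-minute average and the 10-minute hold can be sketched in a few lines of Python. This is a simplified model, not Prometheus itself: the function name alert_states is hypothetical, the rule is assumed to be evaluated at every sample timestamp, and the window is assumed to exclude its oldest boundary.

```python
def alert_states(samples, window_s=120, threshold=0.8, for_s=600):
    """Model the rule on (timestamp_s, cpu_fraction) samples sorted by time.

    Returns one (timestamp, window_average, state) tuple per sample, where
    state is "inactive", "pending", or "firing".
    """
    states = []
    pending_since = None  # time at which the expression first became true
    for i, (t, _) in enumerate(samples):
        # avg_over_time(...[2m]): mean of the samples inside the window
        window = [v for (ts, v) in samples[: i + 1] if t - window_s < ts <= t]
        avg = sum(window) / len(window)
        if avg > threshold:
            if pending_since is None:
                pending_since = t
            # for: 10m -> fire only after the condition has held that long
            state = "firing" if t - pending_since >= for_s else "pending"
        else:
            pending_since = None  # any dip below the threshold resets the hold
            state = "inactive"
        states.append((t, round(avg, 3), state))
    return states


# Illustrative data only: one sample per minute at 90% CPU for 15 minutes.
for row in alert_states([(t, 0.9) for t in range(0, 901, 60)]):
    print(row)
```

In this model a host has to stay above the threshold for the full hold period before the state changes from pending to firing, which is why a single busy backup job lasting a few minutes does not page anyone.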
Impact #
The alert has severity warning, but sustained high CPU load can still cause:
- Slow backup performance
- Increased risk of backup failures
- Potential data loss or corruption
- Inability to meet backup windows, leading to compliance issues
Diagnosis #
To diagnose the issue, follow these steps:
- Check the Proxmox backup host’s resource utilization, including CPU, memory, and disk usage.
- Investigate any recent changes to the backup configuration, such as new backup jobs or increased data volume.
- Review system logs for any error messages or warning signs of hardware failure.
- Verify that the backup host’s resources are sufficient to handle the current workload.
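The first diagnosis step can be cross-checked directly on the host. The sketch below computes the busy CPU fraction from two readings of the aggregate cpu line in /proc/stat, which is available on Linux. The function names are hypothetical, I/O wait is counted as idle time here, and whether the exporter derives pbs_host_cpu_usage the same way is an assumption.

```python
import time


def parse_cpu_line(line):
    """Return (total_jiffies, idle_jiffies) from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Fields: user nice system idle iowait irq softirq steal
    values = [int(x) for x in fields[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return sum(values), idle


def busy_fraction(before, after):
    """Fraction of CPU time spent non-idle between two 'cpu' lines."""
    total0, idle0 = parse_cpu_line(before)
    total1, idle1 = parse_cpu_line(after)
    delta_total = total1 - total0
    if delta_total <= 0:
        return 0.0
    return 1.0 - (idle1 - idle0) / delta_total


def sample_busy_fraction(interval_s=5.0, path="/proc/stat"):
    """Read /proc/stat twice, interval_s apart, and return the busy fraction."""
    with open(path) as f:
        before = f.readline()
    time.sleep(interval_s)
    with open(path) as f:
        after = f.readline()
    return busy_fraction(before, after)
```

Calling sample_busy_fraction() on the backup host returns a value between 0 and 1 that can be compared with the 0.8 threshold in the alert expression.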
Mitigation #
To mitigate the issue, follow these steps:
- Check for any resource-intensive processes or tasks running on the backup host and terminate or reschedule them as necessary.
- Consider adding more resources (e.g., CPU, memory, or disk space) to the backup host to handle the increased workload.
- Optimize backup jobs to reduce the impact on the backup host.
- Consider implementing redundancy or load balancing to distribute the backup workload across multiple hosts.
Additionally, refer to the runbook linked in the alert's runbook annotation for more detailed guidance and best practices for resolving this alert.
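For the first mitigation step, a quick way to find resource-intensive processes is to rank them by CPU share. The sketch below shells out to ps from procps, which is assumed to be installed on the host. The function names are hypothetical. Note that the pcpu column reports CPU time averaged over each process's lifetime, so a short recent burst can be under-represented.

```python
import subprocess


def parse_ps(output, limit=5):
    """Parse 'ps -eo pid,pcpu,comm' output into (pid, cpu_percent, command)."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, pcpu, comm = parts
        rows.append((int(pid), float(pcpu), comm.strip()))
    # Sort here as well, so the result does not depend on ps ordering
    rows.sort(key=lambda r: r[1], reverse=True)
    return rows[:limit]


def top_cpu_processes(limit=5):
    """Return the processes with the highest CPU share on this host."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm", "--sort=-pcpu"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ps(result.stdout, limit)
```

Calling top_cpu_processes() on the backup host lists candidates to reschedule or stop. Check what a process is doing before terminating it, since it may be a running backup, verification, or garbage-collection task.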