ProxmoxCPUAllocationHigh #
It is recommended to keep more of your node’s CPU unallocated for use by PVE and other server applications your Proxmox node runs.
Alert Rule
alert: ProxmoxCPUAllocationHigh
annotations:
  description: It is recommended to keep more of your node's CPU unallocated for use
    by PVE and other server applications your Proxmox node runs
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/proxmox-exporter/proxmoxcpuallocationhigh/
  summary: Proxmox node {{ $labels.node }} has {{ $value }}% of its CPU allocated
    to guests
expr: |
  100 * (proxmox_node_cpus_allocated / proxmox_node_cpus_total) > 90
for: 5m
labels:
  severity: critical
Meaning #
The ProxmoxCPUAllocationHigh alert is triggered when the share of a Proxmox node's CPUs allocated to guests stays above 90% for 5 minutes or more. The alert measures allocation, not utilization: it indicates that most of the node's CPU capacity has been assigned to guests, so little headroom may remain for the Proxmox Virtual Environment (PVE) and other server applications running on the node, even if current CPU usage is low.
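As a worked example of the arithmetic behind the rule's expression, the sketch below mirrors `100 * (proxmox_node_cpus_allocated / proxmox_node_cpus_total) > 90` in Python. The function names and the 32-CPU node are hypothetical, and the sketch ignores the `for: 5m` clause, which additionally requires the condition to hold continuously.

```python
def cpu_allocation_percent(cpus_allocated: int, cpus_total: int) -> float:
    """Mirror the alert expression: 100 * (allocated / total)."""
    if cpus_total <= 0:
        raise ValueError("cpus_total must be positive")
    return 100 * cpus_allocated / cpus_total


def would_fire(cpus_allocated: int, cpus_total: int, threshold: float = 90.0) -> bool:
    """True when the allocation percentage is strictly above the threshold."""
    return cpu_allocation_percent(cpus_allocated, cpus_total) > threshold


if __name__ == "__main__":
    # Hypothetical node with 32 logical CPUs.
    print(cpu_allocation_percent(30, 32))  # 93.75
    print(would_fire(30, 32))              # True
    print(would_fire(28, 32))              # False (87.5%)
```

Note that the comparison is strict, so a node sitting at exactly 90% does not trigger the alert.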
Impact #
If left unaddressed, high CPU allocation can lead to:
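- Performance degradation of virtual machines and server applications when guests use their allocated CPUs at the same time
- Increased latency and response times caused by CPU contention
- Potential node instability if host services are starved of CPU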
- Inability to allocate sufficient resources to new virtual machines or applications
- Negative impact on overall system reliability and availability
Diagnosis #
To diagnose the root cause of the high CPU allocation, follow these steps:
- Log in to the Proxmox web interface and navigate to the node exhibiting high CPU allocation.
- Check the node’s actual resource utilization (CPU, memory, and disk) to see whether the high allocation is already causing bottlenecks or resource contention.
- Examine the list of running virtual machines and their resource allocations to identify potential culprits.
- Verify that the node is not experiencing any hardware or software issues that could be contributing to the high CPU allocation.
- Review system logs for errors or warnings related to resource allocation or node performance.
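The inspection of per-guest allocations can also be scripted. The following is a minimal sketch that assumes you have exported the guest list as JSON, for instance with `pvesh get /cluster/resources --type vm --output-format json`, and that each entry carries `node`, `vmid`, `status`, and `maxcpu` fields. The field names, the helper functions, and the choice to count only running guests are assumptions; check them against your own output and against how your exporter derives `proxmox_node_cpus_allocated`.

```python
from collections import defaultdict


def allocation_by_node(guests):
    """Sum the vCPUs (maxcpu) of running guests for each node."""
    totals = defaultdict(int)
    for guest in guests:
        if guest.get("status") == "running":
            totals[guest["node"]] += int(guest.get("maxcpu", 0))
    return dict(totals)


def largest_guests(guests, node, top=5):
    """Return (vmid, maxcpu) for the running guests with the most vCPUs on a node."""
    running = [
        g for g in guests
        if g.get("node") == node and g.get("status") == "running"
    ]
    running.sort(key=lambda g: int(g.get("maxcpu", 0)), reverse=True)
    return [(g["vmid"], int(g.get("maxcpu", 0))) for g in running[:top]]


if __name__ == "__main__":
    # Hypothetical sample data; in practice load it with json.load().
    sample = [
        {"node": "pve1", "vmid": 101, "status": "running", "maxcpu": 8},
        {"node": "pve1", "vmid": 102, "status": "running", "maxcpu": 4},
        {"node": "pve1", "vmid": 103, "status": "stopped", "maxcpu": 16},
        {"node": "pve2", "vmid": 201, "status": "running", "maxcpu": 2},
    ]
    print(allocation_by_node(sample))      # {'pve1': 12, 'pve2': 2}
    print(largest_guests(sample, "pve1"))  # [(101, 8), (102, 4)]
```

Guests at the top of this list are the first candidates to review for right-sizing or migration.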
Mitigation #
To mitigate the high CPU allocation, consider the following steps:
- Rebalance virtual machine resources: Re-allocate CPU resources from under-utilized virtual machines to those that need them, keeping the node's total allocation below the alert threshold.
- Migrate virtual machines: Migrate virtual machines to other nodes with available resources, if possible, to alleviate the load on the affected node.
- Right-size virtual machines: Review virtual machine configurations to ensure they are correctly sized for their workloads, and adjust as necessary to reduce CPU allocation.
- Implement resource limits: Set resource limits for virtual machines to prevent over-allocation and ensure sufficient resources are available for the node and other applications.
- Upgrade node resources: Consider upgrading the node’s hardware resources (e.g., adding more CPU cores or upgrading to faster processors) to increase available capacity.
- Investigate node optimization: Investigate opportunities to optimize node performance, such as tuning PVE settings or adjusting system configurations.
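Before migrating or right-sizing guests, it helps to know how many vCPUs must be released for the alert to clear. The sketch below computes that number for the rule's 90% threshold; the function name and the example figures are hypothetical, and integer arithmetic is used to avoid floating-point rounding at the boundary (so `threshold_percent` is assumed to be a whole number).

```python
def vcpus_to_free(cpus_allocated: int, cpus_total: int, threshold_percent: int = 90) -> int:
    """Smallest number of vCPUs to release so that allocation is at or below the threshold.

    The alert fires only when allocation is strictly above the threshold,
    so reaching exactly threshold_percent is enough to clear it.
    """
    if cpus_total <= 0:
        raise ValueError("cpus_total must be positive")
    # Largest allocation that still satisfies allocated / total <= threshold.
    max_allowed = (threshold_percent * cpus_total) // 100
    return max(0, cpus_allocated - max_allowed)


if __name__ == "__main__":
    # Hypothetical node with 32 CPUs and 30 vCPUs allocated (93.75%).
    print(vcpus_to_free(30, 32))  # 2 -> 28/32 = 87.5%
    print(vcpus_to_free(28, 32))  # 0, already below the threshold
```

The result is a lower bound for clearing the alert; leaving additional headroom is reasonable if the node also runs other server applications.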