ContainerVolumeUsage #
Container Volume usage is above 80%
Alert Rule
alert: ContainerVolumeUsage
annotations:
  description: |-
    Container Volume usage is above 80%
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/google-cadvisor/containervolumeusage/
  summary: Container Volume usage (instance {{ $labels.instance }})
expr: (1 - (sum(container_fs_inodes_free{name!=""}) BY (instance) / sum(container_fs_inodes_total)
  BY (instance))) * 100 > 80
for: 2m
labels:
  severity: warning
Meaning #
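The ContainerVolumeUsage alert fires when the free inodes on container filesystems, summed per instance, stay below 20% of the total for at least 2 minutes, meaning inode usage is above 80%. The expression is built on container_fs_inodes_free and container_fs_inodes_total, so it measures inodes rather than bytes: a volume can trigger the alert while still showing free disk space. The rule carries severity warning, but it deserves prompt attention because a filesystem with no free inodes cannot create new files, which can lead to container crashes, data loss, and performance degradation.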
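The threshold arithmetic in the alert expression can be sketched for a single pair of inode counts. This is a minimal illustration only: the function name inode_usage_pct and the sample numbers are assumptions made for this example and are not part of the alert rule, which performs the same calculation in PromQL over sums per instance.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mirrors (1 - free / total) * 100 from the alert
# expression for one pair of inode counts.
inode_usage_pct() {
  local free="$1" total="$2"
  awk -v f="$free" -v t="$total" 'BEGIN {
    if (t <= 0) { print "total must be > 0" > "/dev/stderr"; exit 1 }
    printf "%.1f\n", (1 - f / t) * 100
  }'
}

# Example (illustrative numbers): 150000 free inodes out of 1000000 total
# inode_usage_pct 150000 1000000
```

With these sample numbers the result is above 80, so an instance in that state would satisfy the alert condition once it has held for the 2-minute `for` window.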
Impact #
If left unaddressed, high container volume usage can cause:
- Container crashes and failures
- Data loss and corruption
- Performance degradation and slow response times
- Increased risk of security breaches due to overflowing logs and temporary files
Diagnosis #
To diagnose the issue, follow these steps:
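- Check the container’s filesystem usage using commands like
docker exec -it <container_name> df -h
or
kubectl exec -it <pod_name> -c <container_name> -- df -h
Because the alert expression counts inodes rather than bytes, also run the same commands with df -i.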
- Verify that the container is not writing excessive amounts of data to its filesystem
- Check for any unnecessary files or directories that can be safely removed
- Investigate whether the container’s volume is properly configured and sized for its workload
- Review the container’s logging configuration to ensure that logs are not overflowing and filling up the filesystem
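Since the alert expression is based on inode metrics, it can help to see which directories hold the most files. The sketch below ranks the immediate subdirectories of a path by entry count. The function name inodes_by_dir is hypothetical, and the sketch assumes bash, find, sort, and wc are available wherever it runs (inside the container, or on the host against the volume's mount path).

```shell
#!/usr/bin/env bash
# Hypothetical helper: rank the immediate subdirectories of a path by the
# number of filesystem entries they contain. Each count includes the
# subdirectory itself. This is an approximation of inode usage: hard links
# are counted once per name, and file names containing newlines would
# inflate the count.
inodes_by_dir() {
  local root="${1:-.}" d count
  for d in "$root"/*/; do
    [ -d "$d" ] || continue
    # -xdev keeps find on the same filesystem as the starting directory
    count=$(find "$d" -xdev | wc -l | tr -d ' ')
    printf '%d\t%s\n' "$count" "${d%/}"
  done | sort -rn
}

# Example (path is illustrative):
# inodes_by_dir /var/lib/app
```

Directories at the top of the output are the first candidates to inspect for large numbers of small files, such as caches, session files, or unrotated logs.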
Mitigation #
To mitigate the issue, follow these steps:
- Identify and remove any unnecessary files or directories taking up space on the container’s filesystem
- Resize the container’s volume to provide more storage capacity
- Implement log rotation and compression to prevent log files from growing indefinitely
- Configure the container to write data to an external storage volume or a cloud-based object store
- Consider implementing quotas or limits on the container’s filesystem usage to prevent future instances of high volume usage
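As a rough illustration of the log cleanup step, the sketch below compresses older log files and deletes compressed logs past a retention period. The function name rotate_old_logs, the *.log naming pattern, and the default ages are assumptions for this example. It also assumes a find that supports -delete (GNU, BSD, and most BusyBox builds). Only the deletion step frees inodes; compression reduces bytes but keeps one inode per file. Compressing a file that a process still holds open can lose data, so a dedicated tool such as logrotate is usually the safer choice in production.

```shell
#!/usr/bin/env bash
# Hypothetical helper: gzip *.log files older than COMPRESS_DAYS and delete
# *.log.gz files older than DELETE_DAYS. Ages follow find's -mtime semantics.
rotate_old_logs() {
  local dir="$1" compress_days="${2:-1}" delete_days="${3:-14}"
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }

  # Compress stale logs; gzip keeps the original modification time.
  find "$dir" -xdev -type f -name '*.log' -mtime +"$compress_days" \
    -exec gzip -f {} +

  # Remove compressed logs past the retention period (this frees inodes).
  find "$dir" -xdev -type f -name '*.log.gz' -mtime +"$delete_days" -delete
}

# Example (path and ages are illustrative):
# rotate_old_logs /var/log/app 1 14
```

Run it first against a copy of the directory, or replace -delete with -print, to confirm which files would be affected.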
Note: For more detailed steps and recommendations, refer to the runbook provided in the alert annotation.