RabbitmqFileDescriptorsUsage #
A node use more than 90% of file descriptors
Alert Rule
alert: RabbitmqFileDescriptorsUsage
annotations:
description: |-
A node use more than 90% of file descriptors
VALUE = {{ $value }}
LABELS = {{ $labels }}
runbook: https://srerun.github.io/prometheus-alerts/runbooks/rabbitmq-exporter/rabbitmqfiledescriptorsusage/
summary: RabbitMQ file descriptors usage (instance {{ $labels.instance }})
expr: rabbitmq_process_open_fds / rabbitmq_process_max_fds * 100 > 90
for: 2m
labels:
severity: warning
Here is a sample runbook for the RabbitmqFileDescriptorsUsage
alert:
Meaning #
The RabbitmqFileDescriptorsUsage
alert is triggered when a RabbitMQ node is using more than 90% of its available file descriptors. File descriptors are a finite system resource that allows a process to access files, sockets, and other system resources. If a RabbitMQ node exhausts its available file descriptors, it may become unable to accept new connections, leading to performance issues and potential downtime.
Impact #
If left unchecked, high file descriptor usage can cause:
- Connection failures: New connections to RabbitMQ may be refused, leading to application errors and downtime.
- Performance issues: RabbitMQ may experience increased latency, decreased throughput, and reduced overall performance.
- Instability: Prolonged high file descriptor usage can lead to RabbitMQ node crashes or instability.
Diagnosis #
To diagnose the issue, follow these steps:
- Check the RabbitMQ node’s file descriptor usage: Use the
rabbitmqctl
command to check the current file descriptor usage:rabbitmqctl status | grep "file descriptors"
- Identify the cause of high file descriptor usage: Investigate recent changes to the RabbitMQ configuration, plugin installations, or application behavior that may be contributing to the high usage.
- Verify RabbitMQ node configuration: Check the RabbitMQ configuration files (
rabbitmq.config
andadvanced.config
) to ensure that thevm_memory_high_watermark
andvm_memory_limit
settings are properly configured.
Mitigation #
To mitigate the issue, follow these steps:
- Increase the maximum number of file descriptors: Update the RabbitMQ node’s
ulimit
settings to increase the maximum number of file descriptors available. Consult your system administrator for guidance. - Optimize RabbitMQ configuration: Review and optimize the RabbitMQ configuration to reduce file descriptor usage. Consider adjusting settings such as
vm_memory_high_watermark
andvm_memory_limit
. - Identify and fix application issues: Investigate and resolve any application issues that may be contributing to high file descriptor usage, such as inefficient connection management or resource leaks.
- Monitor file descriptor usage: Implement regular monitoring of file descriptor usage to identify potential issues before they become critical.
- Consider upgrading RabbitMQ: If you are running an older version of RabbitMQ, consider upgrading to the latest version, which may include improvements to file descriptor management.