RabbitmqOutOfMemory #
Memory available for RabbitMQ is low (< 10%)
Alert Rule
alert: RabbitmqOutOfMemory
annotations:
  description: |-
    Memory available for RabbitMQ is low (< 10%)
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/kbudde-rabbitmq-exporter/rabbitmqoutofmemory/
  summary: RabbitMQ out of memory (instance {{ $labels.instance }})
expr: rabbitmq_node_mem_used / rabbitmq_node_mem_limit * 100 > 90
for: 2m
labels:
  severity: warning
Meaning #
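The RabbitmqOutOfMemory alert fires when a RabbitMQ node's memory use (rabbitmq_node_mem_used) stays above 90% of its memory limit (rabbitmq_node_mem_limit) for 2 minutes. The limit is the point at which the node raises its memory alarm and blocks publishing connections, so a node this close to it is at risk of throttled publishers, degraded performance, and, if the host itself runs out of memory, crashes and data loss.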
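To make the threshold concrete, the arithmetic in the alert expression can be sketched in a few lines of Python. The function names and the sample byte counts are illustrative assumptions, not part of the exporter or of Prometheus, and the sketch ignores the `for: 2m` hold time that the real rule also requires.

```python
def memory_used_percent(mem_used: float, mem_limit: float) -> float:
    """Mirror the alert expression: mem_used / mem_limit * 100."""
    if mem_limit <= 0:
        raise ValueError("mem_limit must be positive")
    return mem_used / mem_limit * 100


def would_fire(mem_used: float, mem_limit: float, threshold: float = 90.0) -> bool:
    """True when the expression's comparison (> threshold) holds."""
    return memory_used_percent(mem_used, mem_limit) > threshold


MIB = 1024 ** 2

# Hypothetical node with a 4000 MiB memory limit.
print(would_fire(3800 * MIB, 4000 * MIB))  # above the 90% threshold
print(would_fire(2000 * MIB, 4000 * MIB))  # well below the threshold
```

Because the comparison is strict, a node sitting exactly at 90% of its limit does not satisfy the expression.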
Impact #
If left unaddressed, high memory usage can cause:
- RabbitMQ performance degradation
- Increased likelihood of crashes and restarts
- Message loss and duplication
- Queues and exchanges becoming unavailable
- Impact on dependent applications and services
Diagnosis #
To diagnose the issue, follow these steps:
- Check the RabbitMQ web interface or the Prometheus dashboard to identify the node(s) experiencing high memory usage.
- Investigate recent changes to the RabbitMQ configuration, such as queue or exchange settings, that may be contributing to the increased memory usage.
- Verify that the host or container has enough memory for the configured memory limit and is not itself under resource pressure (for example, swapping or other processes competing for memory).
- Review the RabbitMQ logs for errors or warnings related to memory usage.
- Check for any abnormal message throughput or queue growth.
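The first and last checks above can be partly scripted against the RabbitMQ management HTTP API. The following is a minimal sketch, assuming the management plugin is enabled and reachable; the base URL, credentials, and function names are placeholders chosen for illustration. It relies on the `mem_used`, `mem_limit`, and `mem_alarm` fields of `/api/nodes` and the `memory` and `messages` fields of `/api/queues`.

```python
import base64
import json
import urllib.request


def fetch_json(base_url, path, user, password):
    """GET a management API path with basic auth and decode the JSON body."""
    request = urllib.request.Request(base_url.rstrip("/") + path)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def nodes_under_pressure(nodes, threshold_percent=90.0):
    """Return (name, percent_used, mem_alarm) for nodes above the threshold."""
    flagged = []
    for node in nodes:
        limit = node.get("mem_limit", 0)
        if not limit:
            continue  # node is down or did not report a limit
        percent = node.get("mem_used", 0) / limit * 100
        if percent > threshold_percent:
            flagged.append(
                (node["name"], round(percent, 1), node.get("mem_alarm", False))
            )
    return flagged


def top_queues_by_memory(queues, count=5):
    """Return (vhost, name, messages, memory_bytes) for the largest queues."""
    ranked = sorted(queues, key=lambda q: q.get("memory", 0), reverse=True)
    return [
        (q["vhost"], q["name"], q.get("messages", 0), q.get("memory", 0))
        for q in ranked[:count]
    ]


def report(base_url, user, password):
    """Print nodes above the threshold and the queues using the most memory."""
    for entry in nodes_under_pressure(fetch_json(base_url, "/api/nodes", user, password)):
        print("node", entry)
    for entry in top_queues_by_memory(fetch_json(base_url, "/api/queues", user, password)):
        print("queue", entry)
```

A call such as `report("http://localhost:15672", "monitoring", "secret")` (placeholder values) prints the affected nodes first and then the heaviest queues, which usually narrows down where the growth is coming from.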
Mitigation #
To mitigate the issue, follow these steps:
- Reduce message throughput: Temporarily reduce the number of messages being published to RabbitMQ to alleviate memory pressure.
- Increase available memory: Consider increasing the available memory for the RabbitMQ instance, either by adjusting the node’s resource allocation or by adding more nodes to the cluster.
- Optimize RabbitMQ configuration: Review and optimize RabbitMQ configuration settings, such as queue and exchange settings, to reduce memory usage.
- Purge unnecessary data: Remove unnecessary messages, queues, or exchanges to free up memory.
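- Restart RabbitMQ: As a last resort, restart the RabbitMQ instance to clear cached data and reset memory usage. Non-durable queues and transient messages do not survive a restart, so confirm what can be lost before doing this.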
Remember to monitor the situation closely and adjust the mitigation steps as needed to prevent further memory-related issues.
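One way to act on the "optimize RabbitMQ configuration" step is to cap queue length with a policy, so that a slow consumer cannot let a queue grow without bound. The sketch below only builds the request path and JSON body for the management API's policy endpoint (`PUT /api/policies/{vhost}/{name}`). The helper names, the policy name, the pattern, and the length limit are hypothetical; whether `drop-head` or `reject-publish` is the right overflow behaviour depends on the workload.

```python
import json
from urllib.parse import quote

ALLOWED_OVERFLOW = ("drop-head", "reject-publish", "reject-publish-dlx")


def max_length_policy(pattern, max_length, overflow="reject-publish"):
    """Build a policy body that caps the length of queues matching `pattern`."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if overflow not in ALLOWED_OVERFLOW:
        raise ValueError(f"overflow must be one of {ALLOWED_OVERFLOW}")
    return {
        "pattern": pattern,
        "apply-to": "queues",
        "priority": 0,
        "definition": {"max-length": max_length, "overflow": overflow},
    }


def policy_path(vhost, name):
    """Return the management API path; the default vhost "/" becomes %2F."""
    return f"/api/policies/{quote(vhost, safe='')}/{quote(name, safe='')}"


# Hypothetical policy: cap every queue starting with "events." at 100000 messages.
body = max_length_policy(r"^events\.", 100000)
print(policy_path("/", "cap-events"))
print(json.dumps(body, indent=2))
```

The printed path and body can be sent with an authenticated HTTP PUT. Test the policy on a non-production vhost first, since both overflow modes discard or refuse messages once the cap is reached.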