RabbitmqTooManyReadyMessages #
RabbitMQ too many ready messages on {{ $labels.instance }}
Alert Rule
alert: RabbitmqTooManyReadyMessages
annotations:
  description: |-
    RabbitMQ too many ready messages on {{ $labels.instance }}
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/rabbitmq-exporter/rabbitmqtoomanyreadymessages/
  summary: RabbitMQ too many ready messages (instance {{ $labels.instance }})
expr: sum(rabbitmq_queue_messages_ready) BY (queue) > 1000
for: 1m
labels:
  severity: warning
Here is a sample runbook for the Prometheus alert rule RabbitmqTooManyReadyMessages:
Meaning #
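The RabbitmqTooManyReadyMessages alert fires when the number of ready messages in a RabbitMQ queue, summed per queue name across all scraped series, stays above 1000 for at least 1 minute. Ready messages are waiting in the queue and have not yet been delivered to a consumer, so a sustained high count means consumers are not keeping up with producers. Because the expression aggregates BY (queue), the resulting alert carries only the queue label, so the instance label referenced in the annotations will be empty unless the expression is changed to keep it.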
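To make the rule's semantics concrete, the following Python sketch mimics what `sum(rabbitmq_queue_messages_ready) BY (queue) > 1000` computes over a set of samples. Prometheus itself evaluates the real expression; the function name `queues_over_threshold` and the sample data are hypothetical and only illustrate the aggregation.

```python
from collections import defaultdict

THRESHOLD = 1000  # same value as in the alert expression


def queues_over_threshold(samples, threshold=THRESHOLD):
    """Sum ready messages per queue and keep queues strictly above threshold.

    `samples` is an iterable of (labels, value) pairs, where `labels` is a
    dict with at least a "queue" key. All other labels (for example
    "instance") are discarded, as they are by `BY (queue)`.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels["queue"]] += value
    return {queue: total for queue, total in totals.items() if total > threshold}


if __name__ == "__main__":
    # Hypothetical samples: two series for "orders", one for "emails".
    samples = [
        ({"queue": "orders", "instance": "rabbit-a"}, 700),
        ({"queue": "orders", "instance": "rabbit-b"}, 400),
        ({"queue": "emails", "instance": "rabbit-a"}, 1000),
    ]
    print(queues_over_threshold(samples))
```

In this example only `orders` exceeds the threshold (700 + 400), while `emails` sits exactly at 1000 and does not match, because the comparison is strict. The real alert additionally requires the condition to hold for the full `for: 1m` window.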
Impact #
A high number of ready messages in a RabbitMQ queue can have significant impacts on the performance and reliability of the system. Some potential consequences include:
- Increased memory usage on the RabbitMQ node, which can trigger the memory alarm and block publishers, or in the worst case crash the node
- Delays in message processing, which add end-to-end latency and reduce the overall throughput of the system
- Increased risk of message loss when the queue has a length limit or message TTL configured, since excess or expired messages are then dropped or dead-lettered
Diagnosis #
To diagnose the root cause of the RabbitmqTooManyReadyMessages alert, follow these steps:
- Check the RabbitMQ queue metrics to identify the specific queue experiencing the high volume of ready messages.
- Investigate the message production rate and consumption rate to determine if there is an imbalance.
- Review the application logs to identify any errors or issues that may be contributing to the buildup of ready messages.
- Check the RabbitMQ node resource utilization (e.g., CPU, memory, disk) to ensure it is not experiencing any resource constraints.
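The first two diagnosis steps can be sketched as a helper that ranks queues by backlog and compares publish and delivery rates. It assumes input shaped like the RabbitMQ management API's `/api/queues` response (`name`, `messages_ready`, `consumers`, and `message_stats` with `publish_details.rate` and `deliver_get_details.rate`). The function name `summarize_backlog` and the sample payload are hypothetical, and rate fields can be missing for idle queues, so they default to 0 here.

```python
def _rate(queue, key):
    """Return message_stats.<key>_details.rate, or 0.0 if it is not reported."""
    stats = queue.get("message_stats") or {}
    details = stats.get(f"{key}_details") or {}
    return float(details.get("rate", 0.0))


def summarize_backlog(queues, threshold=1000):
    """List queues whose ready count exceeds `threshold`, largest backlog first.

    `queues` is a list of dicts in the assumed /api/queues shape.
    `net_growth` > 0 means messages arrive faster than they are delivered.
    """
    rows = []
    for q in queues:
        ready = q.get("messages_ready", 0)
        if ready <= threshold:
            continue
        publish = _rate(q, "publish")
        deliver = _rate(q, "deliver_get")
        rows.append({
            "queue": q["name"],
            "ready": ready,
            "consumers": q.get("consumers", 0),
            "publish_rate": publish,
            "deliver_rate": deliver,
            "net_growth": publish - deliver,
        })
    return sorted(rows, key=lambda r: r["ready"], reverse=True)


if __name__ == "__main__":
    # Hypothetical payload; in practice this would be the parsed JSON response.
    sample = [
        {"name": "orders", "messages_ready": 5200, "consumers": 2,
         "message_stats": {"publish_details": {"rate": 50.0},
                           "deliver_get_details": {"rate": 20.0}}},
        {"name": "emails", "messages_ready": 12, "consumers": 1},
        {"name": "audit", "messages_ready": 9000, "consumers": 0},
    ]
    for row in summarize_backlog(sample):
        print(row)
```

A queue with zero consumers and a growing backlog usually points at a stopped or crashed consumer, while a positive `net_growth` with active consumers suggests they are simply too slow for the current publish rate.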
Mitigation #
To mitigate the RabbitmqTooManyReadyMessages alert, take the following steps:
- Investigate and resolve any underlying application errors or issues that may be causing the buildup of ready messages.
- Throttle producers or add consumers so that the consumption rate matches or exceeds the production rate.
- Implement message routing or filtering to reduce the volume of messages in the affected queue.
- Consider increasing the RabbitMQ node resources (e.g., adding more nodes, increasing memory) to handle the increased message volume.
- Monitor the queue metrics closely to ensure the issue is resolved and the queue is returning to a normal state.
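When scaling out consumers, a rough capacity estimate helps decide how many to add. The sketch below is back-of-the-envelope arithmetic, not a RabbitMQ feature: it assumes every consumer processes messages at the same steady rate and that throughput scales linearly with consumer count, which real workloads may not do. The function name `consumers_needed` and its parameters are hypothetical.

```python
import math


def consumers_needed(publish_rate, per_consumer_rate, backlog, drain_seconds):
    """Estimate how many consumers keep up with producers and drain the backlog.

    publish_rate      -- incoming messages per second
    per_consumer_rate -- messages per second one consumer can process (assumed constant)
    backlog           -- current number of ready messages
    drain_seconds     -- target time in which the backlog should be cleared
    """
    if per_consumer_rate <= 0 or drain_seconds <= 0:
        raise ValueError("per_consumer_rate and drain_seconds must be positive")
    # Consumers must absorb new traffic plus a share of the backlog each second.
    required_rate = publish_rate + backlog / drain_seconds
    return math.ceil(required_rate / per_consumer_rate)


if __name__ == "__main__":
    # 50 msg/s incoming, 20 msg/s per consumer, 6000 ready, drain within 10 minutes.
    print(consumers_needed(50, 20, 6000, 600))
```

Shortening the drain target raises the estimate, so the result is best read as a starting point to verify against the queue metrics after scaling.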