RedisRejectedConnections #
Some connections to Redis have been rejected
Alert Rule
alert: RedisRejectedConnections
annotations:
  description: |-
    Some connections to Redis have been rejected
      VALUE = {{ $value }}
      LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/oliver006-redis-exporter/redisrejectedconnections/
  summary: Redis rejected connections (instance {{ $labels.instance }})
expr: increase(redis_rejected_connections_total[1m]) > 0
for: 0m
labels:
  severity: critical
Meaning #
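The RedisRejectedConnections alert fires when the redis_rejected_connections_total counter has increased within the last minute, meaning Redis refused one or more incoming connections. Redis counts a rejection when a client tries to connect after the maxclients limit has been reached, so the most common cause is connection exhaustion: a limit that is too low for the workload, clients that leak or do not reuse connections, or a sudden surge in traffic. Configuration problems, or network issues that leave stale connections open, can contribute.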
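To make the alert condition concrete, the following is a minimal sketch of how an increase over a window of counter samples can be computed, including the case where the counter resets after a Redis restart. It is a simplification under stated assumptions: Prometheus also extrapolates the result to the window boundaries, which this sketch omits, and the function names counter_increase and should_fire are hypothetical.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Raw increase of a counter across time-ordered samples.

    Simplified: unlike Prometheus' increase(), no extrapolation to the
    window boundaries is applied.
    """
    if len(samples) < 2:
        return 0.0
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        # A counter that goes down has restarted from zero, so the whole
        # current value counts as new increase.
        total += cur if cur < prev else cur - prev
        prev = cur
    return total


def should_fire(samples: Sequence[float]) -> bool:
    """Mirror the rule's '> 0' comparison; 'for: 0m' means no extra wait."""
    return counter_increase(samples) > 0


if __name__ == "__main__":
    # Hypothetical scrapes of redis_rejected_connections_total in one minute.
    print(should_fire([5, 5, 5]))  # no new rejections in the window
    print(should_fire([5, 5, 7]))  # two new rejections in the window
```

Because the rule uses for: 0m, a single rejected connection inside the window is enough to raise a critical alert.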
Impact #
If left unaddressed, rejected connections to Redis can lead to:
- Increased latency and timeouts for applications relying on Redis
- Data loss or inconsistencies due to failed writes or reads
- Cascading failures in dependent services or applications
- Decreased overall system performance and reliability
Diagnosis #
To diagnose the issue, follow these steps:
- Check the Redis server logs for error messages related to connection rejections
- Verify the current connection count and maximum allowed connections using the redis_info metric
- Investigate network connectivity issues between the Redis instance and connecting clients
- Review recent changes to Redis configuration or deployment
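The connection-count check above can be sketched as a small script that reads the text returned by the Redis INFO command and compares connected clients against the limit. This is an illustrative sketch, not part of the original runbook: the helper names parse_info and client_usage are hypothetical, the sample values are invented, and the maxclients value is assumed to be supplied by the operator (for example from CONFIG GET maxclients; 10000 is the Redis default).

```python
from typing import Dict


def parse_info(text: str) -> Dict[str, str]:
    """Parse the key:value lines of Redis INFO output, skipping section headers."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or "# Section" header
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def client_usage(info_text: str, maxclients: int) -> Dict[str, float]:
    """Summarize connection pressure from INFO output and a known maxclients."""
    fields = parse_info(info_text)
    connected = int(fields["connected_clients"])
    rejected = int(fields.get("rejected_connections", 0))
    return {
        "connected_clients": connected,
        "rejected_connections": rejected,
        "utilization": connected / maxclients,
    }


# Invented sample resembling the Clients and Stats sections of INFO.
SAMPLE_INFO = """\
# Clients
connected_clients:9950
blocked_clients:0

# Stats
total_connections_received:120345
rejected_connections:17
"""

if __name__ == "__main__":
    print(client_usage(SAMPLE_INFO, maxclients=10000))
```

A utilization close to 1.0 together with a growing rejected_connections value points to connection exhaustion rather than a network fault.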
Mitigation #
To mitigate the issue, follow these steps:
- Temporary Fix: Increase the maximum allowed connections limit on the Redis instance to accommodate the current connection demand
- Root Cause Analysis: Identify and address the underlying cause of the connection rejections (e.g., network issues, configuration problems)
- Monitoring and Alerting: Implement additional monitoring and alerting to detect connection rejections and capacity issues proactively
- Long-term Solution: Consider scaling Redis instances, distributing load across multiple instances, or optimizing Redis configuration for better performance and reliability.
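When raising the connection limit as a temporary fix, it can help to estimate how many connections the clients may open at peak. The sketch below assumes every application replica uses a bounded connection pool. The function name required_maxclients, the reserved allowance for administrative and monitoring connections, and the 20% headroom are all hypothetical choices for illustration, not recommendations from the original runbook.

```python
import math


def required_maxclients(
    replicas: int,
    pool_size: int,
    reserved: int = 10,
    headroom: float = 0.2,
) -> int:
    """Estimate a maxclients value that covers all client pools plus headroom.

    replicas  -- number of application instances connecting to Redis
    pool_size -- maximum connections each instance's pool may open
    reserved  -- assumed allowance for admin, monitoring, and replica links
    headroom  -- fractional safety margin on top of the estimated peak
    """
    if replicas < 0 or pool_size < 0 or reserved < 0 or headroom < 0:
        raise ValueError("all arguments must be non-negative")
    peak = replicas * pool_size + reserved
    return math.ceil(peak * (1 + headroom))


if __name__ == "__main__":
    # Hypothetical deployment: 40 replicas, each with a pool of 50 connections.
    print(required_maxclients(replicas=40, pool_size=50))
```

If the estimate exceeds the configured limit, either the limit or the pool sizes need adjusting. Note that the operating system's file descriptor limit for the Redis process must also allow the higher value.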
Remember to update the alert annotations with the root cause and mitigation steps taken to resolve the issue.