FreeRADIUSDown #
The FreeRADIUS server is not reachable. Please check the server immediately.
Alert Rule
alert: FreeRADIUSDown
annotations:
  description: The FreeRADIUS server is not reachable. Please check the server immediately.
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/freeradius-exporter/freeradiusdown/
  summary: FreeRADIUS is down
expr: freeradius_up == 0
for: 1m
labels:
  severity: critical
Meaning #
The FreeRADIUSDown alert fires when freeradius_up has been 0 for at least 1 minute, which indicates that the FreeRADIUS server is not reachable. The alert is critical because an unreachable server cannot authenticate or authorize users, and every service that depends on RADIUS is affected.
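To illustrate how the "for: 1m" clause delays firing, here is a simplified sketch of the pending-then-firing logic. The function name `first_firing_time` and the 15-second evaluation interval in the usage note are assumptions for illustration; this is not Prometheus code, only a model of the behavior described above.

```python
from typing import Iterable, Optional, Tuple


def first_firing_time(
    evaluations: Iterable[Tuple[float, float]],
    hold_seconds: float = 60.0,
) -> Optional[float]:
    """Return the timestamp at which the alert would start firing, or None.

    `evaluations` is a sequence of (timestamp_seconds, freeradius_up_value)
    pairs, one per rule evaluation. The condition must hold at every
    evaluation for `hold_seconds` before the alert fires.
    """
    active_since = None
    for timestamp, value in evaluations:
        if value == 0:
            if active_since is None:
                active_since = timestamp  # condition first seen: alert is pending
            if timestamp - active_since >= hold_seconds:
                return timestamp  # condition held long enough: alert fires
        else:
            active_since = None  # server reported up again: pending state resets
    return None
```

With evaluations every 15 seconds and the server first seen down at t=15, `first_firing_time([(0, 1), (15, 0), (30, 0), (45, 0), (60, 0), (75, 0)])` returns 75. A short blip that recovers within the minute never fires.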
Impact #
The impact of an unreachable FreeRADIUS server is high, as it can cause:
- Disruption to user authentication and authorization
- Inability to access critical services and applications
- Potential security risks if dependent systems fall back to weaker authentication or fail open
- Increased support tickets and user frustration
Diagnosis #
To diagnose the issue, follow these steps:
- Check the FreeRADIUS server logs for errors or anomalies
- Verify that the server is running and that there are no network connectivity issues
- Check the system resources (CPU, memory, disk space) to ensure they are within normal limits
- Verify that the FreeRADIUS exporter is properly configured and running, since a misconfigured exporter may report the server as down even when it is healthy
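The exporter check above can be partly scripted. The following is a minimal sketch that reads the exporter's metrics output and separates three cases: the exporter cannot be reached, the exporter is up but does not expose freeradius_up, or the exporter reports the server as down. The URL in `EXPORTER_URL` and the names `fetch`, `metric_values`, and `diagnose` are assumptions for illustration; adjust the address to your deployment.

```python
import urllib.request
from typing import List, Optional

# Assumed exporter address; replace with the real host and port.
EXPORTER_URL = "http://localhost:9812/metrics"


def fetch(url: str, timeout: float = 5.0) -> Optional[str]:
    """Return the metrics page as text, or None if the exporter is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:  # covers connection errors, HTTP errors, and timeouts
        return None


def metric_values(exposition: str, name: str) -> List[float]:
    """Return every sample value for `name` in Prometheus text exposition format."""
    values = []
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or not line.startswith(name):
            continue
        rest = line[len(name):]
        if rest.startswith("{"):
            rest = rest[rest.rfind("}") + 1:]  # drop the label set
        elif not rest[:1].isspace():
            continue  # a different metric that merely shares the prefix
        fields = rest.split()
        if fields:
            values.append(float(fields[0]))  # first field is the sample value
    return values


def diagnose(exposition: Optional[str]) -> str:
    """Classify the exporter output into one of four coarse states."""
    if exposition is None:
        return "exporter-unreachable"
    values = metric_values(exposition, "freeradius_up")
    if not values:
        return "metric-missing"
    if any(v == 0 for v in values):
        return "radius-down"
    return "radius-up"
```

Calling `diagnose(fetch(EXPORTER_URL))` gives a first hint about where to look: "exporter-unreachable" or "metric-missing" point at the exporter itself, while "radius-down" points at the FreeRADIUS server or the path between the two.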
Mitigation #
To mitigate the issue, follow these steps:
- Restart the FreeRADIUS server and verify that it is running properly
- Check and resolve any network connectivity issues
- Check and resolve any system resource issues (e.g., clear disk space, restart services)
- Fix the FreeRADIUS exporter if the diagnosis showed that it is misconfigured or not running
- If several FreeRADIUS servers are deployed, perform a rolling restart (one server at a time) so that each picks up the correct configuration without a full outage
- Monitor the server for any further issues and take corrective action as needed
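The final monitoring step can be sketched as a small polling helper that only reports recovery after several consecutive healthy checks, so a server that flaps after a restart is not declared recovered too early. The function `wait_until_stable` and its parameters are hypothetical; `check` stands for any callable you supply that returns True when the server is healthy, for example one that reads freeradius_up from the exporter.

```python
import time
from typing import Callable


def wait_until_stable(
    check: Callable[[], bool],
    required: int = 3,
    max_attempts: int = 30,
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once `check` succeeds `required` times in a row.

    Gives up and returns False after `max_attempts` checks. The `sleep`
    parameter is injectable so the loop can be exercised without waiting.
    """
    streak = 0
    for attempt in range(max_attempts):
        if check():
            streak += 1
            if streak >= required:
                return True
        else:
            streak = 0  # any failure resets the run of healthy checks
        if attempt < max_attempts - 1:
            sleep(interval)  # no need to wait after the final attempt
    return False
```

With the defaults, the helper waits up to roughly five minutes and needs three healthy checks in a row. If it returns False, treat the restart as unsuccessful and go back to the diagnosis steps.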
Note: It is essential to resolve this issue as quickly as possible to minimize the impact on users and services.