OPNsenseWireGuardPeerStaleHandshake #
WireGuard peer {{ $labels.peer_name }} has not completed a handshake in more than 5 minutes
Alert Rule
alert: OPNsenseWireGuardPeerStaleHandshake
annotations:
  description: |-
    WireGuard peer {{ $labels.peer_name }} has not handshaked in > 5 minutes
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/opnsense/opnsensewireguardpeerstalehandshake/
  summary: OPNsense WireGuard peer stale handshake (instance {{ $labels.opnsense_instance }})
expr: (time() - opnsense_wireguard_peer_last_handshake_seconds) > 300
for: 5m
labels:
  severity: warning
Meaning #
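The OPNsenseWireGuardPeerStaleHandshake alert fires when a WireGuard peer's most recent handshake is more than 300 seconds old and that condition persists for the rule's for: 5m clause. In practice a peer has therefore gone roughly 10 minutes without a handshake by the time the alert is delivered. WireGuard renegotiates a session about every two minutes while traffic is flowing, so a stale handshake usually means the tunnel is not passing traffic. Common causes are network connectivity problems, configuration errors (mismatched keys, a wrong endpoint or port), or peer downtime. A peer that is simply idle and has no persistent keepalive configured also stops handshaking, so the alert can fire on a healthy but unused tunnel. The severity is warning: the condition is not critical on its own, but it should be investigated before it disrupts services that depend on the tunnel.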
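To make the timing concrete, the following is a minimal Python sketch of how the expr threshold and the for clause combine. The function names are illustrative and not part of Prometheus. It assumes continuous evaluation; real rules are evaluated at discrete intervals, so the actual firing time is slightly later.

```python
THRESHOLD_SECONDS = 300  # from the rule's expr: ... > 300
FOR_SECONDS = 5 * 60     # from the rule's for: 5m


def handshake_age(now: float, last_handshake: float) -> float:
    """Mirror of time() - opnsense_wireguard_peer_last_handshake_seconds."""
    return now - last_handshake


def is_pending(now: float, last_handshake: float) -> bool:
    """True once the expression matches and the alert enters the pending state."""
    return handshake_age(now, last_handshake) > THRESHOLD_SECONDS


def earliest_firing_time(last_handshake: float) -> float:
    """Earliest time the alert can fire if no new handshake occurs.

    The expression must stay true for FOR_SECONDS before the alert fires.
    """
    return last_handshake + THRESHOLD_SECONDS + FOR_SECONDS
```

For a handshake recorded at time t, the alert becomes pending just after t + 300 seconds and can fire at about t + 600 seconds.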
Impact #
While the handshake is stale, the peer most likely has no working session with the OPNsense firewall. Possible consequences include:
- Loss of connectivity to the peer
- Inability to access resources or services behind the peer
- Traffic taking an unintended, unencrypted path if hosts fall back to a route outside the tunnel (WireGuard itself does not send tunnel traffic unencrypted)
- Downtime or disruptions to critical services or applications that rely on the WireGuard connection
Diagnosis #
To diagnose the issue, the following steps can be taken:
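- Check the peer's status and configuration on OPNsense: confirm that the instance and the peer are enabled, and that the public key, endpoint address and port, and allowed IPs match what the remote side expects.
- Verify network connectivity to the peer's endpoint and check that no firewall or NAT device along the path is blocking the WireGuard UDP port.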
- Review the WireGuard logs to check for any error messages or hints about the cause of the issue.
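- Check the system clocks. The alert compares the Prometheus server's time() with a timestamp recorded on the OPNsense device, so clock skew between the two can make a healthy peer look stale. WireGuard peers do not need clocks synchronized with each other, but a peer whose clock has jumped backwards (for example after a reboot without a working real-time clock) can have its handshake initiations rejected as replays.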
- Trigger a handshake by sending traffic through the tunnel, for example by pinging the peer's tunnel address, or restart the WireGuard service, and check whether the handshake timestamp updates.
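The handshake check in the steps above can be scripted. The following is a minimal sketch, assuming the wg utility is available in a root shell on the firewall and that its `wg show all latest-handshakes` output has one line per peer with interface name, public key, and Unix timestamp. The function names and the 300-second default are illustrative; the default mirrors the alert's threshold.

```python
import subprocess


def read_latest_handshakes() -> str:
    """Run wg (requires root). Not called automatically by this sketch."""
    return subprocess.run(
        ["wg", "show", "all", "latest-handshakes"],
        check=True, capture_output=True, text=True,
    ).stdout


def parse_latest_handshakes(text: str) -> dict:
    """Map (interface, public_key) to the last handshake Unix timestamp.

    A timestamp of 0 means the peer has never completed a handshake.
    """
    peers = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        interface, public_key, timestamp = fields
        peers[(interface, public_key)] = int(timestamp)
    return peers


def stale_peers(peers: dict, now: int, threshold: int = 300) -> dict:
    """Return stale peers mapped to their handshake age in seconds.

    Peers that have never completed a handshake map to None.
    """
    stale = {}
    for key, timestamp in peers.items():
        if timestamp == 0:
            stale[key] = None
        elif now - timestamp > threshold:
            stale[key] = now - timestamp
    return stale
```

On the firewall this could be used as `stale_peers(parse_latest_handshakes(read_latest_handshakes()), int(time.time()))` after importing time. Comparing against the local clock, rather than the Prometheus server's, also helps separate a genuinely stale peer from clock skew between the two hosts.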
Mitigation #
To mitigate the issue, the following steps can be taken:
- Investigate and resolve any underlying network connectivity issues that may be preventing the handshake.
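- Correct any configuration mismatch found during diagnosis (keys, endpoint address or port, allowed IPs). If the peer sits behind NAT or is expected to be idle for long periods, consider enabling a persistent keepalive on that peer (25 seconds is a commonly used interval) so that handshakes continue without user traffic.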
- Restart the WireGuard service or reboot the peer device to reset the connection.
- Monitor the WireGuard logs and system metrics to detect any potential issues or errors that may be contributing to the problem.
- Consider implementing additional monitoring or alerting to detect similar issues in the future and prevent downtime or disruptions.
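After applying a fix, the alert expression can be re-run against Prometheus to confirm the peer has recovered. The following is a minimal sketch using the Prometheus HTTP API instant-query endpoint (/api/v1/query). The base URL is a placeholder and the helper names are hypothetical; the label names peer_name and opnsense_instance are taken from the alert rule.

```python
import json
import urllib.parse
import urllib.request

# Same expression as the alert rule.
QUERY = "(time() - opnsense_wireguard_peer_last_handshake_seconds) > 300"


def build_query_url(base_url: str, query: str = QUERY) -> str:
    """Build the instant-query URL for a Prometheus server."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def parse_stale_peers(payload: dict) -> list:
    """Extract (instance, peer_name, age_seconds) from a query response."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % payload.get("error"))
    rows = []
    for sample in payload["data"]["result"]:
        labels = sample["metric"]
        _timestamp, value = sample["value"]  # value is returned as a string
        rows.append((
            labels.get("opnsense_instance", "?"),
            labels.get("peer_name", "?"),
            float(value),
        ))
    return rows


def fetch_stale_peers(base_url: str) -> list:
    """Query Prometheus; an empty list means no peer is currently stale."""
    with urllib.request.urlopen(build_query_url(base_url), timeout=10) as resp:
        return parse_stale_peers(json.load(resp))
```

Calling `fetch_stale_peers("http://prometheus.example:9090")` with the real server address should return an empty list once every peer has handshaked within the last 5 minutes.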