HostUnusualNetworkThroughputOut #
Host network interfaces are probably sending too much data (> 100 MB/s)
Alert Rule
alert: HostUnusualNetworkThroughputOut
annotations:
description: |-
Host network interfaces are probably sending too much data (> 100 MB/s)
VALUE = {{ $value }}
LABELS = {{ $labels }}
runbook: https://srerun.github.io/prometheus-alerts/runbooks/node-exporter/hostunusualnetworkthroughputout/
summary: Host unusual network throughput out (instance {{ $labels.instance }})
expr: (sum by (instance) (rate(node_network_transmit_bytes_total[2m])) / 1024 / 1024
> 100) * on(instance) group_left (nodename) node_uname_info{nodename=~".+"}
for: 5m
labels:
severity: warning
Here is a runbook for the Prometheus alert rule “HostUnusualNetworkThroughputOut”:
Meaning #
The “HostUnusualNetworkThroughputOut” alert is triggered when a host’s network interface is transmitting data at an unusually high rate, exceeding 100 MB/s. This could indicate a potential issue with the host or its network configuration.
Impact #
A high network throughput can have several consequences, including:
- Network congestion: High network traffic can lead to congestion, causing delays and packet loss.
- Increased CPU usage: Handling large amounts of network traffic can increase CPU usage, potentially impacting system performance.
- Security risks: Unusual network activity can be a sign of malicious activity, such as data exfiltration or denial-of-service attacks.
Diagnosis #
To diagnose the issue, follow these steps:
- Identify the affected host: Check the
instance
label in the alert to determine which host is experiencing the unusual network throughput. - Check network interface usage: Use tools like
top
orhtop
to monitor network interface usage on the affected host. - Investigate processes using high network bandwidth: Use tools like
netstat
orss
to identify processes that are using high amounts of network bandwidth. - Review system logs: Check system logs for any errors or signs of malicious activity.
Mitigation #
To mitigate the issue, follow these steps:
- Investigate and stop any malicious processes: If you identify any malicious processes, stop them immediately.
- Optimize system configuration: Review system configuration to ensure it is optimized for network performance.
- Implement network traffic filtering: Consider implementing network traffic filtering rules to block or rate-limit suspicious traffic.
- Monitor network activity: Continuously monitor network activity to ensure the issue is resolved and to detect any future anomalies.
Remember to update the runbook with specific steps and tools relevant to your environment.