ThanosSidecarNoConnectionToStartedPrometheus #
Thanos Sidecar {{$labels.instance}} is unhealthy.
Alert Rule
alert: ThanosSidecarNoConnectionToStartedPrometheus
annotations:
  description: |-
    Thanos Sidecar {{$labels.instance}} is unhealthy.
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/thanos-sidecar/thanossidecarnoconnectiontostartedprometheus/
  summary: Thanos Sidecar No Connection To Started Prometheus (instance {{ $labels.instance }})
expr: thanos_sidecar_prometheus_up{job=~".*thanos-sidecar.*"} == 0 and on (namespace, pod) prometheus_tsdb_data_replay_duration_seconds != 0
for: 5m
labels:
  severity: critical
Meaning #
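The ThanosSidecarNoConnectionToStartedPrometheus alert fires when a Thanos Sidecar has reported thanos_sidecar_prometheus_up == 0 for 5 minutes while the Prometheus in the same namespace and pod reports a non-zero prometheus_tsdb_data_replay_duration_seconds. In other words, Prometheus appears to have started, but the Sidecar cannot connect to it. The alert is critical because it indicates a break in the monitoring pipeline that can leave Thanos with incomplete or inaccurate metrics.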
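To make the join in the alert expression concrete, here is a minimal Python sketch of the same condition evaluated over in-memory samples. It assumes the inputs are already restricted to the thanos-sidecar job and does not model the 5m "for" duration; the function name firing_sidecars and the Sample type are illustrative, not part of Thanos or Prometheus.

```python
from typing import Dict, List, Tuple

# One series of an instant vector: (labels, value). Illustrative type only.
Sample = Tuple[Dict[str, str], float]


def firing_sidecars(
    sidecar_up: List[Sample], replay_duration: List[Sample]
) -> List[Dict[str, str]]:
    """Return label sets of sidecars that match the alert expression.

    Mirrors: thanos_sidecar_prometheus_up == 0
             and on (namespace, pod)
             prometheus_tsdb_data_replay_duration_seconds != 0
    """
    # (namespace, pod) pairs whose Prometheus reports a non-zero replay
    # duration. A missing label is treated as empty, as PromQL does.
    started = {
        (labels.get("namespace", ""), labels.get("pod", ""))
        for labels, value in replay_duration
        if value != 0
    }
    return [
        labels
        for labels, value in sidecar_up
        if value == 0
        and (labels.get("namespace", ""), labels.get("pod", "")) in started
    ]
```

A sidecar reporting 0 is returned only when a replay-duration series with the same namespace and pod is non-zero, which is why the alert stays silent while Prometheus is still starting up.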
Impact #
While the alert is firing, metrics from the affected Prometheus instance may not be available to Thanos through this Sidecar, resulting in incomplete or inaccurate monitoring data. This can lead to:
- Incomplete visibility into system performance and health
- Inaccurate alerting and reporting
- Delayed or missed detection of critical issues
Diagnosis #
To diagnose the issue, follow these steps:
- Check the Thanos Sidecar logs for errors or warnings related to connecting to the Prometheus instance.
- Verify that the Prometheus instance is running and healthy.
- Check the network connectivity between the Thanos Sidecar and Prometheus instances.
- Verify that the Prometheus instance accepts connections from the Thanos Sidecar, and that the Sidecar's --prometheus.url flag points at the address Prometheus actually listens on.
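The health and connectivity checks above can be sketched in Python by probing the Prometheus management endpoints /-/healthy and /-/ready. This is a minimal sketch: the helper names probe and prometheus_status are illustrative, and the base URL is an assumption (http://localhost:9090 is only the usual default) that should match whatever the Sidecar is configured to use.

```python
import http.client
import urllib.error
import urllib.request


def probe(base_url: str, path: str, timeout: float = 3.0) -> bool:
    """Return True if GET base_url + path answers with HTTP 200."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Covers refused connections, timeouts, DNS failures, and
        # error responses such as 503 while Prometheus is not ready.
        return False


def prometheus_status(base_url: str) -> dict:
    """Probe the Prometheus management endpoints for health and readiness."""
    return {
        "healthy": probe(base_url, "/-/healthy"),
        "ready": probe(base_url, "/-/ready"),
    }


if __name__ == "__main__":
    # Assumed address; replace with the URL the Sidecar uses.
    print(prometheus_status("http://localhost:9090"))
```

The result is only meaningful when the script runs with the same network view as the Sidecar, for example from inside the same pod. If both probes succeed from there, the problem is more likely in the Sidecar's configuration than in Prometheus itself.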
Mitigation #
To mitigate the issue, follow these steps:
- Restart the Thanos Sidecar instance to attempt to re-establish the connection to the Prometheus instance.
- Check and correct any configuration issues with the Prometheus instance or Thanos Sidecar.
- Verify that the network connectivity between the instances is stable and working as expected.
- If the issue persists, consider raising the Thanos Sidecar log level (for example with --log.level=debug) to gather more detail about the connection failure.
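With more verbose logging, the Sidecar output grows quickly, so it can help to filter it down to warnings and errors. The following Python sketch assumes the Sidecar logs in logfmt (key=value pairs, the usual Thanos default). The function names parse_logfmt and problem_lines are illustrative, and the field names level, msg and err are assumptions about what the log lines contain.

```python
import shlex
from typing import Dict, Iterable, Iterator, Tuple


def parse_logfmt(line: str) -> Dict[str, str]:
    """Parse one logfmt line (key=value pairs, values optionally quoted)."""
    try:
        tokens = shlex.split(line)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = line.split()
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def problem_lines(
    lines: Iterable[str], levels: Tuple[str, ...] = ("warn", "error")
) -> Iterator[Dict[str, str]]:
    """Yield parsed entries whose level field is in `levels`."""
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("level") in levels:
            yield fields


if __name__ == "__main__":
    import sys

    # Reads log lines from standard input and prints the problem entries.
    for entry in problem_lines(sys.stdin):
        print(entry.get("level"), entry.get("msg"), entry.get("err", ""))
```

Piping the Sidecar's log output into this script prints one line per warning or error, which makes repeated connection failures easier to spot.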
Additional resources:
- Refer to the Thanos Sidecar documentation for more information on configuring and troubleshooting connections to Prometheus instances.
- If you are unsure about the root cause of the issue or need further assistance, consult with your monitoring team or a Thanos expert.