ThanosCompactorMultipleRunning #
No more than one Thanos Compact instance should be running at once. There are {{$value}} instances running.
Alert Rule
alert: ThanosCompactorMultipleRunning
annotations:
  description: |-
    No more than one Thanos Compact instance should be running at once. There are {{$value}} instances running.
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  runbook: https://srerun.github.io/prometheus-alerts/runbooks/thanos-compactor/thanoscompactormultiplerunning/
  summary: Thanos Compactor Multiple Running (instance {{ $labels.instance }})
expr: sum by (job) (up{job=~".*thanos-compact.*"}) > 1
for: 5m
labels:
  severity: warning
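To make the rule's expression concrete, here is a minimal Python sketch of the same aggregation over a handful of made-up `up` samples. The sample data, `SAMPLE_UP`, and `firing_jobs` are hypothetical names for illustration only. The `for: 5m` hold time is not modeled.

```python
import re
from collections import defaultdict

# Hypothetical `up` series as (labels, value); 1 means the last scrape succeeded.
SAMPLE_UP = [
    ({"job": "thanos-compact", "instance": "10.0.0.1:10902"}, 1),
    ({"job": "thanos-compact", "instance": "10.0.0.2:10902"}, 1),
    ({"job": "thanos-compact", "instance": "10.0.0.3:10902"}, 0),  # target down
    ({"job": "thanos-query", "instance": "10.0.0.4:10902"}, 1),    # job not matched
]

JOB_PATTERN = re.compile(r".*thanos-compact.*")


def firing_jobs(series, threshold=1):
    """Mimic `sum by (job) (up{job=~".*thanos-compact.*"}) > threshold`."""
    totals = defaultdict(int)
    for labels, value in series:
        job = labels.get("job", "")
        # PromQL regex matchers are fully anchored, hence fullmatch.
        if JOB_PATTERN.fullmatch(job):
            totals[job] += value
    # Keep only the groups whose sum exceeds the threshold.
    return {job: total for job, total in totals.items() if total > threshold}


print(firing_jobs(SAMPLE_UP))
```

With this sample, only the two healthy `thanos-compact` targets are summed, so the job's total is 2 and the condition holds. Note that the aggregation keeps only the `job` label, so `{{ $labels.instance }}` in the summary annotation has no value to render for this rule.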
Meaning #
The ThanosCompactorMultipleRunning alert fires when Prometheus has counted more than one healthy Thanos Compactor target under the same job for at least 5 minutes. This is not expected: the Compactor is designed to run as a single instance against a given set of blocks. When several instances compact the same data concurrently, they can produce overlapping or inconsistent blocks.
Impact #
The impact of running multiple Thanos Compactor instances can be significant, leading to:
- Data inconsistencies and errors
- Increased resource usage and load on the system
- Potential data loss or corruption
- Difficulty in troubleshooting and debugging issues
Diagnosis #
To diagnose the issue, follow these steps:
- Query Prometheus for up{job=~".*thanos-compact.*"} to see which instances the alert is counting and when the extra ones appeared.
- Check the Thanos Compactor logs to see if there are any errors or warnings related to multiple instances running.
- Check the system configuration and deployment scripts to ensure that only one instance of Thanos Compactor is intended to be running.
- Check for any automation or scheduling issues that may be causing multiple instances to start unintentionally.
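The first diagnosis step can be scripted against the Prometheus HTTP API. The sketch below uses only the standard library and the instant-query endpoint `/api/v1/query`. The function names are hypothetical, and it assumes the targets carry the usual `job` and `instance` labels.

```python
import json
import urllib.parse
import urllib.request

# Same selector as the alert expression, limited to healthy targets.
QUERY = 'up{job=~".*thanos-compact.*"} == 1'


def build_query_url(base_url, query=QUERY):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def running_instances(response):
    """Group instance labels by job from a decoded /api/v1/query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    by_job = {}
    for sample in response["data"]["result"]:
        metric = sample["metric"]
        by_job.setdefault(metric.get("job", ""), []).append(metric.get("instance", ""))
    return {job: sorted(instances) for job, instances in by_job.items()}


def fetch_running_instances(base_url, timeout=10):
    """Query a live Prometheus server (performs a network request)."""
    with urllib.request.urlopen(build_query_url(base_url), timeout=timeout) as resp:
        return running_instances(json.load(resp))
```

Calling `fetch_running_instances("http://localhost:9090")` (the address is an assumption; substitute your own Prometheus URL) returns a mapping from job name to the instances currently reporting `up == 1`. Any job with more than one entry is what the alert is counting.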
Mitigation #
To mitigate the issue, follow these steps:
- Immediately stop all but one of the running Thanos Compactor instances to prevent further data inconsistencies and errors.
- Investigate and resolve the root cause of the multiple instances running, such as a configuration or deployment error.
- Verify that the system configuration and deployment scripts are corrected to ensure only one instance of Thanos Compactor is running.
- Monitor the system closely to ensure that the issue does not recur.
- Consider implementing additional monitoring and alerting to detect and prevent similar issues in the future.
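The configuration check in the steps above can be automated. As one possible sketch, assuming the Compactor is deployed on Kubernetes (the runbook itself does not say so), the function below inspects a workload manifest loaded as JSON, for example the output of `kubectl get ... -o json`. The name `singleton_problems` and the sample manifest are hypothetical.

```python
import json


def singleton_problems(manifest):
    """Return a list of reasons a workload might run more than one pod at once."""
    problems = []
    kind = manifest.get("kind", "")
    if kind not in ("Deployment", "StatefulSet"):
        return problems  # other kinds are outside this sketch
    name = manifest.get("metadata", {}).get("name", "<unnamed>")
    spec = manifest.get("spec", {})
    replicas = spec.get("replicas", 1)  # Kubernetes defaults to 1 when omitted
    if replicas > 1:
        problems.append(f"{kind}/{name} requests {replicas} replicas; expected at most 1")
    # A Deployment defaults to RollingUpdate, which can start the new pod
    # before the old one has stopped.
    strategy = spec.get("strategy", {}).get("type", "RollingUpdate")
    if kind == "Deployment" and strategy != "Recreate":
        problems.append(f"{kind}/{name} uses {strategy}; old and new pods can overlap during a rollout")
    return problems


manifest = json.loads(
    '{"kind": "Deployment", "metadata": {"name": "thanos-compact"}, "spec": {"replicas": 2}}'
)
for problem in singleton_problems(manifest):
    print(problem)
```

A check like this could run in CI against rendered manifests so that a replica count above one is caught before deployment. It does not cover other ways a second instance can appear, such as a manually started process or a duplicate release.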