Error: at least 2 live replicas required across different availability zones, could only find 0 - unhealthy instances #5552
When I try to push metrics from Prometheus via remote write, I get the error below.
Helm Config:
Despite keeping replication_factor: 3 as suggested, I am still getting this error.
@pstibrany @pracucci @DylanGuedes I'd appreciate your help on this.
Are your ingesters running properly? Are they updating their heartbeat in the ring (see the /distributor/ring endpoint on the distributor pod)? The error indicates that while there are ring entries for ingesters, they are "unhealthy", i.e. they have an old last-updated timestamp. Try to focus your investigation on why the ingesters don't update their timestamp in the ring. Update: I just noticed very tight timeouts for heartbeat:
You may want to start with default values instead (15s heartbeat period, 1m timeout).
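To make the link between heartbeat timestamps and the error concrete, here is a minimal sketch of the idea. It assumes that an instance counts as healthy when its last heartbeat is no older than the heartbeat timeout, and that writes need a majority of replicas (floor(RF/2) + 1, which gives 2 for replication_factor: 3). The function names, the ingester names, and the 10s "tight" timeout are hypothetical illustrations, not Mimir's actual code or the values from the config above.

```python
from datetime import datetime, timedelta


def is_healthy(last_heartbeat, now, heartbeat_timeout):
    """An instance is healthy if its last heartbeat is within the timeout."""
    return now - last_heartbeat <= heartbeat_timeout


def min_live_replicas(replication_factor):
    """Majority quorum: more than half of the replicas must be live."""
    return replication_factor // 2 + 1


def check_ring(last_heartbeats, now, heartbeat_timeout, replication_factor=3):
    """Return 'ok' or a message shaped like the error in this thread."""
    live = [
        name
        for name, ts in last_heartbeats.items()
        if is_healthy(ts, now, heartbeat_timeout)
    ]
    required = min_live_replicas(replication_factor)
    if len(live) < required:
        return (
            f"at least {required} live replicas required, "
            f"could only find {len(live)}"
        )
    return "ok"


# Hypothetical ring state: every ingester last updated 20-30s ago.
now = datetime(2023, 1, 1, 12, 0, 0)
ring = {
    "ingester-0": now - timedelta(seconds=20),
    "ingester-1": now - timedelta(seconds=25),
    "ingester-2": now - timedelta(seconds=30),
}

# Assumed tight timeout (10s): all entries look stale.
print(check_ring(ring, now, timedelta(seconds=10)))
# Default-style timeout (1m): the same entries count as healthy.
print(check_ring(ring, now, timedelta(minutes=1)))
```

With the assumed 10s timeout, all three entries are treated as unhealthy and the check fails with 0 live replicas. With a 1m timeout the same timestamps pass. If the ingesters never update their timestamps at all, no timeout value will help, and the ring propagation itself needs investigating.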
@pstibrany Thanks for the quick response. I made the changes you suggested, but I am still getting the same error. Prometheus logs:
Mimir Distributor logs:
Furthermore, I am seeing the following logs:
Do you have any suggestions on how to resolve this issue? Also, I'm trying to port-forward to see the member list, but I'm not getting any response.
Is there any way to see the memberlist?
This issue has been resolved by making changes to the memberlist.