Without more details it is hard to determine the exact cause.
First, note that your error message is `Some backend services are in UNHEALTHY state`, not `All backend services are in UNHEALTHY state`. This indicates that only some of your backends are affected.
There are many possible causes: whether you are using `GCP Ingress` or `Nginx Ingress`, your `externalTrafficPolicy` configuration, whether you are using preemptible nodes, your `livenessProbe` and `readinessProbe`, health checks, etc.
Since only some backends are affected in your scenario, the most I can suggest with the current information is a few debugging steps:
- Use `$ kubectl get po -n <namespace>` to check that all your pods are working correctly: all containers within each pod should be `Ready` and the pod status should be `Running`. If a pod looks suspicious, check its logs with `$ kubectl logs <podname> -c <containerName>`. In general, you should check all pods the load balancer is pointing to.
- Confirm that `livenessProbe` and `readinessProbe` are configured properly and that the response is `200`.
- Describe your ingress with `$ kubectl describe ingress <yourIngressName>` and check the `backends`.
- Check that you've configured your `health checks` properly according to the GKE Ingress for HTTP(S) Load Balancing - Health Checks guide.
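
The pod check from the first step can be partly automated. The following is only a minimal sketch: `unhealthy_pods` is a hypothetical helper name (not part of kubectl), and it assumes the default `kubectl get po` column layout (`NAME READY STATUS RESTARTS AGE`). It prints every pod whose containers are not all Ready or whose status is not `Running`.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get po` output on stdin and prints the
# name of each pod that is not fully Ready or not in the Running status.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
  awk 'NR > 1 {                      # skip the header line
    split($2, ready, "/")            # READY looks like "1/2"
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Typical use against a live cluster (the namespace is a placeholder):
#   kubectl get po -n <namespace> | unhealthy_pods
```

Note that pods from finished Jobs (status `Completed`) would also be listed by this filter, so treat its output as a starting point for `kubectl logs` and `kubectl describe`, not as a verdict.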
If you still can't solve the issue with the debugging steps above, please provide more details about your environment, including logs (without private information).
Useful links: