
I have configured a web application pod exposed via Apache on port 80. I'm unable to configure a Service + Ingress to access it from the internet: the backend services always report as UNHEALTHY.

Pod Config:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    name: webapp
  name: webapp
  namespace: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      name: webapp
  template:
    metadata:
      labels:
        name: webapp
    spec:
      containers:
      - image: asia.gcr.io/my-app/my-app:latest
        name: webapp
        ports:
        - containerPort: 80
          name: http-server

Service Config:

apiVersion: v1
kind: Service
metadata:
  name: webapp-service
spec:
  type: NodePort
  selector:
    name: webapp
  ports:
    - protocol: TCP
      port: 50000
      targetPort: 80

Ingress Config:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: webapp-ingress
spec:
  backend:
    serviceName: webapp-service
    servicePort: 50000

This results in backend services reporting as UNHEALTHY.

The health check settings:

Path: /
Protocol: HTTP
Port: 32463
Proxy protocol: NONE

Additional information: I've tried a different approach of exposing the deployment as a load balancer with external IP and that works perfectly. When trying to use a NodePort + Ingress, this issue persists.

  • what kind of health check is that? Are the firewalls allowing traffic to this port? – suren Nov 05 '19 at 08:41
  • @suren - The health checks are configured by the load balancer while creating the ingress. The firewall rules are also created along with this, which permits access. – Praveen Selvam Nov 05 '19 at 08:54
  • that's weird. How long have you waited? Usually it takes about 3-4 minutes to show healthy a backend. – suren Nov 05 '19 at 10:37
  • have you checked your firewall rules? Is port 32463 open? – guillaume blaquiere Nov 05 '19 at 14:04
  • 32463 should be the node port of the service, the firewall rule should be created automatically for the entire node port range from the health check IPs, this should not be a firewall issue. – Patrick W Nov 05 '19 at 21:03

2 Answers


With GKE, the health check on the load balancer is created automatically when you create the Ingress, and the matching firewall rules are created along with it.

Since you have no readinessProbe configured, the load balancer gets a default health check (the one you listed). To debug this properly, you need to isolate the point of failure.

First, make sure your pod is serving traffic properly:

kubectl -n my-app exec [pod_name] -- wget localhost:80

If the application has curl built in, you can use that instead of wget. If the application has neither wget nor curl, skip to the next step.

  1. Get the following output and keep track of it:

    kubectl -n my-app get po -l name=webapp -o wide
    kubectl -n my-app get svc webapp-service

You need to keep the pod and service cluster IPs.

  2. SSH to a node in your cluster and run sudo toolbox bash

  3. Install curl inside the toolbox:

    apt-get install curl

  4. Test the pod to make sure it is serving traffic within the cluster:

    curl -I [pod_clusterIP]:80

This needs to return a 200 response.

  5. Test the service:

    curl -I [service_clusterIP]:80

If the pod does not return a 200 response, the container is either not working correctly or the port is not open on the pod.

If the pod is working but the service is not, there is an issue with the iptables rules, which are managed by kube-proxy; that would be an issue with the cluster.

Finally, if both the pod and the service are working, the problem lies with the load balancer health checks, and Google would need to investigate.
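As a compact variant of the two curl checks above (placeholders as in the steps; run inside the toolbox shell), printing only the status code makes the 200 test explicit:

```
# Same checks as above, printing only the status code; anything other
# than 200 fails the health check. Substitute the IPs recorded in step 1.
POD_IP=[pod_clusterIP]
SVC_IP=[service_clusterIP]
curl -s -o /dev/null -w "pod: %{http_code}\n" "http://$POD_IP:80/"
curl -s -o /dev/null -w "svc: %{http_code}\n" "http://$SVC_IP:80/"
```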

Patrick W
  • would yo be available to help troubleshoot an odd situation such as this one? Possibly do a screen share? – DropHit May 13 '20 at 06:22
  • I read that a readiness probe does not work properly when the ingress already exists; for me it also didn't work when redeploying the ingress. Some others reported that the health checks need to be removed because they do the wrong thing even after being modified. I eventually just had / and /healthz return 200 and it works. I think this would be remotely troubleshootable. – Vincent Gerris May 14 '20 at 08:12

As Patrick mentioned, the checks will be created automatically by GCP. By default, GKE will use readinessProbe.httpGet.path for the health check.

But if there is no readinessProbe configured, then it will just use the root path /, which must return an HTTP 200 (OK) response (and that's not always the case, for example, if the app redirects to another path, then the GCP health check will fail).
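A minimal sketch of such a probe (the /healthz path and the timing values are assumptions; use any path your app answers with a direct HTTP 200, not a redirect):

```
# Sketch only: goes under spec.template.spec.containers[0] in the Deployment.
# /healthz is an assumption; it must return HTTP 200 without redirecting.
readinessProbe:
  httpGet:
    path: /healthz
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10
```

With this in place, GKE derives the load balancer health check from the probe's path instead of defaulting to /.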

Ahmed AbouZaid