
I'm trying to enable hairpin connections on my Kubernetes service, on GKE.

I've tried to follow the instructions here: https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/ to reconfigure my kubelet and enable hairpin mode, but my config never seems to be saved, even though the edit command returns without error.

Here is what I try to set when I edit the node:

spec:
  podCIDR: 10.4.1.0/24
  providerID: gce://staging/us-east4-b/gke-cluster-staging-highmem-f36fb529-cfnv
  configSource:
    configMap:
      name: my-node-config-4kbd7d944d
      namespace: kube-system
      kubeletConfigKey: kubelet

Here is my node config (the ConfigMap) when I describe it:

Name:         my-node-config-4kbd7d944d
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>

Data
====
kubelet_config:
----
{
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1",
  "hairpinMode": "hairpin-veth"
}

I've tried both using "edit node" and "patch". Same result in that nothing is saved. Patch returns "no changes made."

Here is the patch command from the tutorial:

kubectl patch node ${NODE_NAME} -p "{\"spec\":{\"configSource\":{\"configMap\":{\"name\":\"${CONFIG_MAP_NAME}\",\"namespace\":\"kube-system\",\"kubeletConfigKey\":\"kubelet\"}}}}"

I also can't find any resource on where the "hairpinMode" attribute is supposed to be set.

Any help is appreciated!

------------------- edit ----------------

Here is why I think hairpinning isn't working:

root@668cb9686f-dzcx8:/app# nslookup tasks-staging.[my-domain].com
Server:     10.0.32.10
Address:    10.0.32.10#53

Non-authoritative answer:
Name:   tasks-staging.[my-domain].com
Address: 34.102.170.43

root@668cb9686f-dzcx8:/app# curl https://[my-domain].com/python/healthz
hello
root@668cb9686f-dzcx8:/app# nslookup my-service.default
Server:     10.0.32.10
Address:    10.0.32.10#53

Name:   my-service.default.svc.cluster.local
Address: 10.0.38.76

root@668cb9686f-dzcx8:/app# curl https://my-service.default.svc.cluster.local/python/healthz
curl: (7) Failed to connect to my-service.default.svc.cluster.local port 443: Connection timed out

Also, if I issue a request to localhost from my service (not with curl), it gets a "connection refused." Issuing requests to the external domain, which should get routed to the same pod, works fine though.

I only have one service, one node, one pod, and two listening ports at the moment.

--------------------- including deployment yaml -----------------

Deployment:

spec:
  replicas: 1
  template:
    spec:
      containers:
      - name: my-app
        ports:
        - containerPort: 8080
        - containerPort: 50001
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTPS

Ingress:

apiVersion: extensions/v1beta1
kind: Ingress
spec:
  backend:
    serviceName: my-service
    servicePort: 60000
  rules:
  - http:
      paths:
      - path: /*
        backend:
          serviceName: my-service
          servicePort: 60000
      - path: /python/*
        backend:
          serviceName: my-service
          servicePort: 60001

Service:

---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
  - name: port
    port: 60000
    targetPort: 8080
  - name: python-port
    port: 60001
    targetPort: 50001
  type: NodePort

I'm trying to set up a multi-port application where the main program triggers a script by issuing a request to a different port on the local machine. (I need to run something in Python, but the main app is in Go.)

It's a simple script and I'd like to avoid exposing the python endpoints with the external domain, so I don't have to worry about authentication, etc.

-------------- requests sent from my-service in golang -------------

https://[my-domain]/health: success
https://[my-domain]/python/healthz: success
http://my-service.default:60000/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host 
http://my-service.default/python/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host 
http://my-service.default:60001/python/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host 
http://localhost:50001/healthz: dial tcp 127.0.0.1:50001: connect: connection refused 
http://localhost:50001/python/healthz: dial tcp 127.0.0.1:50001: connect: connection refused 
Jason Chen
  • Hello. `Hairpinning` should work without any additional configuration. How did you test to come to the conclusion that it doesn't work? – Dawid Kruk Mar 31 '20 at 15:15
  • So from the pod, I can nslookup "[service name].default" which resolves to the same IP address that my external address resolves to. I can query my external address, but when I query [service name] it times out. I added the nslookup+curl outputs. – Jason Chen Apr 01 '20 at 05:01
  • I see in your testing methodology that you are trying to `curl https://my-service` locally. I think the issue is that you are trying to connect to the pod through a service (`clusterIP`) while specifying the wrong port. Could you show the `YAML` of the service that is exposing this pod? – Dawid Kruk Apr 01 '20 at 14:09
  • oh no! that's entirely possible since i'm totally new to kubernetes. I'm including my yaml file as well – Jason Chen Apr 01 '20 at 16:19
  • thank you so much for your help @DawidKruk – Jason Chen Apr 01 '20 at 16:29
  • I see now. I will write an answer with some explanation. Is it possible to do the `issuing a request on the local machine on a different port` part as a localhost connection? – Dawid Kruk Apr 01 '20 at 17:28
  • hmm. I'm not sure what you mean? but i'll include the requests I sent and their results too – Jason Chen Apr 01 '20 at 17:47

1 Answer

Kubelet reconfiguration in GKE


You should not reconfigure the kubelet in cloud-managed Kubernetes clusters like GKE. It's not supported, and it can lead to errors and failures.

Hairpinning in GKE


Hairpinning is enabled by default in GKE-provided clusters. You can check whether it's enabled by running the command below on one of the GKE nodes:

ifconfig cbr0 |grep PROMISC

The output should look like this:

UP BROADCAST RUNNING PROMISC MULTICAST MTU:1460 Metric:1

The PROMISC flag indicates that hairpinning is enabled.
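
If `ifconfig` is not available on the node, the kernel's view of the interface flags can also be read from sysfs. The sketch below is only an illustration under assumptions: it assumes a Linux node that exposes `/sys/class/net/cbr0/flags`, and `isPromisc` is a hypothetical helper name. It tests the `IFF_PROMISC` bit (0x100) in the hexadecimal flags value; this value may not match the `ifconfig` output in every case, so treat it as a secondary check.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// iffPromisc is the IFF_PROMISC bit from the Linux interface flags.
const iffPromisc = 0x100

// isPromisc (hypothetical helper) parses the contents of a sysfs "flags"
// file, such as "0x1103\n", and reports whether the promiscuous bit is set.
func isPromisc(raw string) (bool, error) {
	// Base 0 lets ParseUint accept the "0x" prefix used by sysfs.
	flags, err := strconv.ParseUint(strings.TrimSpace(raw), 0, 32)
	if err != nil {
		return false, err
	}
	return flags&iffPromisc != 0, nil
}

func main() {
	// Assumed path: cbr0 is the bridge name used in the command above.
	raw, err := os.ReadFile("/sys/class/net/cbr0/flags")
	if err != nil {
		fmt.Println("cannot read interface flags:", err)
		return
	}
	on, err := isPromisc(string(raw))
	if err != nil {
		fmt.Println("cannot parse interface flags:", err)
		return
	}
	fmt.Println("cbr0 promiscuous:", on)
}
```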

Please refer to the official documentation on debugging services: Kubernetes.io: Debug Services: A Pod fails to reach itself via the Service IP

Workload


Based only on the Service definition you provided, you should be able to reach your Python application (container port 50001) from the pod hosting it at:

  • localhost:50001
  • ClusterIP:60001
  • my-service:60001
  • NodeIP:nodeport-port (check $ kubectl get svc my-service for this port)
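
The list above follows from how the Service maps `port` (what clients of the Service dial) to `targetPort` (what the container listens on). As a small sketch of that mapping, the Go snippet below derives the in-cluster addresses for one Service port. The `ServicePort` type and `inClusterAddrs` function are hypothetical names for illustration, and the `svc.cluster.local` suffix is taken from the nslookup output in the question.

```go
package main

import "fmt"

// ServicePort (hypothetical type) mirrors one entry of spec.ports
// in the Service definition.
type ServicePort struct {
	Name       string
	Port       int // port exposed by the Service
	TargetPort int // port the container listens on
}

// inClusterAddrs (hypothetical helper) lists addresses that should reach
// the same container port: directly via localhost from inside the pod,
// and through the Service by short name and by fully qualified name.
func inClusterAddrs(service, namespace string, p ServicePort) []string {
	return []string{
		fmt.Sprintf("localhost:%d", p.TargetPort),
		fmt.Sprintf("%s:%d", service, p.Port),
		fmt.Sprintf("%s.%s.svc.cluster.local:%d", service, namespace, p.Port),
	}
}

func main() {
	python := ServicePort{Name: "python-port", Port: 60001, TargetPort: 50001}
	for _, addr := range inClusterAddrs("my-service", "default", python) {
		fmt.Println(addr)
	}
}
```

Note that the short name `my-service` only resolves from pods in the same namespace, and the NodePort address is left out because its port number is assigned by the cluster.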

I tried to create your Ingress resource and it failed. Please check what an Ingress definition should look like.

Please take a look at the official documentation, where the whole deployment process is explained with examples:

Additionally, please check other Stack Overflow answers like:

Please let me know if you have any questions about that.

Dawid Kruk
  • hmm, i think ingress should be working because accessing through my external domain works (with url matching). localhost:50001 fails for "connection refused" and "my-service:60001" fails with no such host. I added these result to my post – Jason Chen Apr 02 '20 at 17:49
  • Please run the command `kubectl get pods,services,ep` and update your question with its output. To clarify: you have one pod with 2 applications running on ports `8080` and `50001`? If yes, please **exec into it** and try to run `curl localhost:8080` and `curl localhost:50001`. Let me know the results. After that, please do the same from another pod, but with `curl my-service:60000` and `curl my-service:60001`. Could you tell me the reason you are curling `/python/healthz`? – Dawid Kruk Apr 03 '20 at 12:02
  • Yea. When I curl directly from inside the pod, `localhost` and `my-service` both work as expected. I'm just using the `/python/healthz` path because that's the handler I implemented. – Jason Chen Apr 03 '20 at 18:02
  • As I said, please run the command `kubectl get pods,services,ep` and update your question with its output. Your service is lacking a `selector`. How exactly did you expose your application to the Internet? – Dawid Kruk Apr 07 '20 at 13:12
  • i found the problem and you're right, it has nothing to do with hairpin mode. thank you so much for your help! – Jason Chen Apr 10 '20 at 19:55
  • Happy to help, and welcome to Stack Overflow. If this answer or any other one solved your issue, please mark it as accepted. – Dawid Kruk Apr 10 '20 at 20:04