
I created a K8s cluster of 5 VMs (1 master and 4 slaves, all running Ubuntu 16.04.3 LTS) using kubeadm. I used flannel to set up networking in the cluster. I was able to deploy an application successfully and then exposed it via a NodePort service. From here things got complicated for me.

Before I started, I disabled the default firewalld service on the master and on all the nodes.

As I understand from the K8s Services doc, the NodePort type exposes the service on all nodes in the cluster. However, when I created it, the service was reachable on only 2 of the 4 nodes in the cluster. I am guessing that's not the expected behavior, right?
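
For reference, the deployment and service were created with commands roughly like these (reconstructed from the outputs below; the exact image name and some flags may differ):

kubectl run springboot-helloworld --image=<my-image> --port=9000 -n playground
kubectl expose deployment springboot-helloworld --name=sb-hw-svc --type=NodePort --port=9000 -n playground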

For troubleshooting, here are some resource specs:

root@vm-vivekse-003:~# kubectl get nodes
NAME              STATUS    AGE       VERSION
vm-deepejai-00b   Ready     5m        v1.7.3
vm-plashkar-006   Ready     4d        v1.7.3
vm-rosnthom-00f   Ready     4d        v1.7.3
vm-vivekse-003    Ready     4d        v1.7.3   //the master
vm-vivekse-004    Ready     16h       v1.7.3

root@vm-vivekse-003:~# kubectl get pods -o wide -n playground
NAME                                     READY     STATUS    RESTARTS   AGE       IP           NODE
kubernetes-bootcamp-2457653786-9qk80     1/1       Running   0          2d        10.244.3.6   vm-rosnthom-00f
springboot-helloworld-2842952983-rw0gc   1/1       Running   0          1d        10.244.3.7   vm-rosnthom-00f

root@vm-vivekse-003:~# kubectl get svc -o wide -n playground
NAME        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE       SELECTOR
sb-hw-svc   10.101.180.19   <nodes>       9000:30847/TCP   5h        run=springboot-helloworld

root@vm-vivekse-003:~# kubectl describe svc sb-hw-svc -n playground
Name:               sb-hw-svc
Namespace:          playground
Labels:             <none>
Annotations:        <none>
Selector:           run=springboot-helloworld
Type:               NodePort
IP:                 10.101.180.19
Port:               <unset>   9000/TCP
NodePort:           <unset>   30847/TCP
Endpoints:          10.244.3.7:9000
Session Affinity:   None
Events:             <none>

root@vm-vivekse-003:~# kubectl get endpoints sb-hw-svc -n playground -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  creationTimestamp: 2017-08-09T06:28:06Z
  name: sb-hw-svc
  namespace: playground
  resourceVersion: "588958"
  selfLink: /api/v1/namespaces/playground/endpoints/sb-hw-svc
  uid: e76d9cc1-7ccb-11e7-bc6a-fa163efaba6b
subsets:
- addresses:
  - ip: 10.244.3.7
    nodeName: vm-rosnthom-00f
    targetRef:
      kind: Pod
      name: springboot-helloworld-2842952983-rw0gc
      namespace: playground
      resourceVersion: "473859"
      uid: 16d9db68-7c1a-11e7-bc6a-fa163efaba6b
  ports:
  - port: 9000
    protocol: TCP

After some tinkering I realized that on those 2 "faulty" nodes, the service was not reachable even from within the hosts themselves.

Node01 (working):

root@vm-vivekse-004:~# curl 127.0.0.1:30847      //<localhost>:<nodeport>
Hello Docker World!!
root@vm-vivekse-004:~# curl 10.101.180.19:9000   //<cluster-ip>:<port>
Hello Docker World!!
root@vm-vivekse-004:~# curl 10.244.3.7:9000      //<pod-ip>:<port>
Hello Docker World!!

Node02 (working):

root@vm-rosnthom-00f:~# curl 127.0.0.1:30847
Hello Docker World!!
root@vm-rosnthom-00f:~# curl 10.101.180.19:9000
Hello Docker World!!
root@vm-rosnthom-00f:~# curl 10.244.3.7:9000
Hello Docker World!!

Node03 (not working):

root@vm-plashkar-006:~# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@vm-plashkar-006:~# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@vm-plashkar-006:~# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out

Node04 (not working):

root@vm-deepejai-00b:/# curl 127.0.0.1:30847
curl: (7) Failed to connect to 127.0.0.1 port 30847: Connection timed out
root@vm-deepejai-00b:/# curl 10.101.180.19:9000
curl: (7) Failed to connect to 10.101.180.19 port 9000: Connection timed out
root@vm-deepejai-00b:/# curl 10.244.3.7:9000
curl: (7) Failed to connect to 10.244.3.7 port 9000: Connection timed out

I tried netstat and telnet on all 4 slaves. Here's the output:

Node01 (the working host):

root@vm-vivekse-004:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      27808/kube-proxy
root@vm-vivekse-004:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Node02 (the working host):

root@vm-rosnthom-00f:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      11842/kube-proxy
root@vm-rosnthom-00f:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.

Node03 (the not-working host):

root@vm-plashkar-006:~# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      7791/kube-proxy
root@vm-plashkar-006:~# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out

Node04 (the not-working host):

root@vm-deepejai-00b:/# netstat -tulpn | grep 30847
tcp6       0      0 :::30847                :::*                    LISTEN      689/kube-proxy
root@vm-deepejai-00b:/# telnet 127.0.0.1 30847
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection timed out

Additional info:

From the kubectl get pods output, I can see that the pod is actually deployed on slave vm-rosnthom-00f. I am able to ping this host from all 5 VMs, and curl vm-rosnthom-00f:30847 also works from all of them.

I can clearly see that the internal cluster networking is messed up, but I am unsure how to resolve it. The output of iptables -L is identical across the slaves, and even the local loopback (ifconfig lo) is up and running on all of them. I'm completely clueless as to how to fix this!

Vivek Sethi
  • Just for confirmation, do the IP addresses of all the non-docker interfaces use a separate IP space from docker, pods, and services? The command I'd want to see is `root@vm-deepejai-00b:/# curl THE_IP_OF_vm-vivekse-004:30847` to ensure `vm-deepejai-00b` can conceivably route traffic to `vm-vivekse-004`, because that's what is happening under the covers anyway – mdaniel Aug 09 '17 at 16:44
  • Also, just for extreme clarity, did you check `iptables -t nat -L` as well as just `iptables -L` (I couldn't tell if that's what you meant) – mdaniel Aug 09 '17 at 16:45
  • @MatthewLDaniel As for your first comment, the curl works: `root@vm-deepejai-00b:~# curl 173.36.23.4:30847 Hello Docker World!!` where 173.36.23.4 is the IP of vm-vivekse-004 – Vivek Sethi Aug 09 '17 at 17:24
  • @MatthewLDaniel As for your second comment, I think I went a little overboard when I said `iptables -L` for all slaves are identical. Actually, `iptables -L` is identical for Node02 (working node) and Node04 (non-working node) [https://www.diffchecker.com/JZzyspEL ] and identical for Node01 (working node) and Node03 (non-working node) [https://www.diffchecker.com/3X6WkdMR ] – Vivek Sethi Aug 09 '17 at 17:36
  • Also, `iptables -t nat -L` is *almost* identical for Node02 & Node04 [https://www.diffchecker.com/me6PhHCd ] and identical for Node01 and Node03 [https://www.diffchecker.com/CusUUMnN ] – Vivek Sethi Aug 09 '17 at 17:50
  • Can you check `sysctl net.ipv4.ip_forward` on all the nodes? It should be set to 1. – sfgroups Aug 14 '17 at 01:17
  • Would you include the output of `kubectl get pod -n kube-system -o wide` please? – Janos Lenart Aug 28 '17 at 11:39
  • If you can, try updating to a more recent version of Kubernetes 1.7.x and flannel. I found a few issues in their GitHub repo pertaining to how kube-proxy and flannel's iptable rules could conflict. – gtirloni Oct 12 '17 at 09:26
  • What about your iptables FORWARD chain? Could you post its rules? – Konstantin Vustin Jun 26 '18 at 11:15
  • https://serverfault.com/questions/1010978/cannot-access-to-kubernetes-nodeport-from-other-worker-nodes-except-the-pods-on – kinjelom Sep 21 '22 at 12:46

3 Answers


Use a service of type NodePort and access the NodePort on the IP address of your master node.

The Service obviously knows on which node a Pod is running and redirects the traffic to one of the pods if you have several instances.

Label your pods and use the corresponding selectors in the service.

If you still run into issues, please post your service and deployment.

To check connectivity I would suggest using netcat:

nc -zv <ip-or-hostname> <port>

If the network is OK, it responds with: open

Inside the cluster, you can reach the containers like so:

nc -zv servicename.namespace.svc.cluster.local port

Always keep in mind that you are dealing with 3 kinds of ports (they are put together in the sketch after this list):

The port on which your software is listening inside the container (the containerPort / targetPort).

The port on which the Service exposes the application on its ClusterIP address (the port field of the Service).

The NodePort, which makes the service reachable on every node's IP address from outside the cluster's network.
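
Put together, a NodePort Service along the lines of the one in the question looks roughly like this (a sketch reconstructed from the kubectl describe output in the question, not the asker's actual manifest):

apiVersion: v1
kind: Service
metadata:
  name: sb-hw-svc
  namespace: playground
spec:
  type: NodePort
  selector:
    run: springboot-helloworld   # must match the labels on your pods
  ports:
  - port: 9000         # service port, reachable on the ClusterIP
    targetPort: 9000   # port your software listens on inside the container
    nodePort: 30847    # port opened on every node's IP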

Ralle Mc Black

Either your firewall blocks some connections between the nodes, or your kube-proxy is not working properly. I guess your service works only on the nodes where the pods are running.
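
To narrow down which of the two it is, a couple of quick checks (a sketch; the kube-proxy pod name below is just a placeholder and will differ on your cluster):

# on the master: is a kube-proxy pod running on every node?
kubectl get pods -n kube-system -o wide | grep kube-proxy

# logs of the kube-proxy instance running on a broken node (placeholder name)
kubectl logs -n kube-system kube-proxy-xxxxx

# on a broken node: are the NodePort iptables rules installed?
iptables-save | grep 30847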

Kamil

If you want to reach the service from any node in the cluster you need to define the service type as ClusterIP. Since you defined the service type as NodePort, you can only connect from the node where the service is running.


My above answer was not correct. Based on the documentation we should be able to connect from any NodeIP:NodePort, but it's not working in my cluster either.

https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services---service-types

NodePort: Exposes the service on each Node's IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created. You'll be able to contact the NodePort service, from outside the cluster, by requesting <NodeIP>:<NodePort>.

On one of my nodes, IP forwarding was not set. After enabling it, I was able to connect to my service using NodeIP:NodePort:

sysctl -w net.ipv4.ip_forward=1
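
To verify the current value on every node and make the change survive a reboot, something like this (a sketch; assumes shell access on each node):

sysctl net.ipv4.ip_forward                          # should print: net.ipv4.ip_forward = 1
echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf  # persist the setting
sysctl -p                                           # reload sysctl settings
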
sfgroups