
I need some help debugging the error: 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. Can someone please help?

I am trying to run a pod on a Mac (for the first time) using the Docker Desktop flavor of Kubernetes, version 2.1.0.1 (37199). I'd like to try using hostNetwork mode because of its efficiency and the number of ports that need to be opened (thousands of them). With only hostNetwork: true set, there is no error, but I also don't see the ports being opened on the host, nor the host network interface inside the container. Since I also need to open port 443, I added the NET_BIND_SERVICE capability, and that is when it started throwing the error.

I've run lsof -i inside the container (ubuntu:18.04) and then sudo lsof -i on my Mac, and saw no conflict. I've also looked at /var/lib/log/containers/kube-apiserver-docker-desktop_kube-system_kube-apiserver-*.log and found no clue. Thanks!

Additional Info: I've run the following inside the container:

# ss -nltp
State  Recv-Q  Send-Q     Local Address:Port      Peer Address:Port
LISTEN 0       5                0.0.0.0:10024          0.0.0.0:*      users:(("pnnsvr",pid=1,fd=28))
LISTEN 0       5                0.0.0.0:2443           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=24))
LISTEN 0       5                0.0.0.0:10000          0.0.0.0:*      users:(("pnnsvr",pid=1,fd=27))
LISTEN 0       50               0.0.0.0:6800           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=14))
LISTEN 0       1                0.0.0.0:6802           0.0.0.0:*      users:(("pnnsvr",pid=1,fd=13))
LISTEN 0       50               0.0.0.0:443            0.0.0.0:*      users:(("pnnsvr",pid=1,fd=15))

Then, I ran netstat on my Mac (the host), searched for those ports, and couldn't find a collision. I'm happy to supply the output of netstat (767 lines) if needed.
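
For reference, this is roughly the check I ran on the Mac side (the port list is just the set shown by ss above):

# search the host's netstat output for the pod's listening ports
netstat -an | grep -E '\.(443|2443|6800|6802|10000|10024) '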

Here is the yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pnnsvr
  labels:
    app: pnnsvr
    env: dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pnnsvr
      env: dev
  template:
    metadata:
      labels:
        app: pnnsvr
        env: dev
    spec:
      hostNetwork: true
      containers:
      - name: pnnsvr
        image: dev-pnnsvr:0.92
        args: ["--root_ip=192.168.15.194"]
        # for using local images
        imagePullPolicy: Never
        ports:
        - name: https
          containerPort: 443
          hostPort: 443
        - name: cport6800tcp
          containerPort: 6800
          hostPort: 6800
          protocol: TCP
        - name: cport10000tcp
          containerPort: 10000
          hostPort: 10000
          protocol: TCP
        - name: cport10000udp
          containerPort: 10000
          hostPort: 10000
          protocol: UDP
        - name: cport10001udp
          containerPort: 10001
          hostPort: 10001
          protocol: UDP
        #test
        - name: cport23456udp
          containerPort: 23456
          hostPort: 23456
          protocol: UDP
        securityContext:
          capabilities:
            add:
              - SYS_NICE
              - NET_BIND_SERVICE
              - CAP_SYS_ADMIN
skwokie
  • please use `netstat -ltnp` or `ss -nltp` instead, or `lsof -i :8080` to verify a specific port – Mark Aug 23 '19 at 13:29
  • please share your yaml/config/settings for `NET_BIND_SERVICE` and `hostNetwork`, and your deployment spec. – Mark Aug 23 '19 at 13:56
  • I've added the new info. Thanks, @Hanx. – skwokie Aug 23 '19 at 21:01
  • Just want to give an update: I tested it one more time by setting `hostNetwork: true` and `NET_BIND_SERVICE` (as shown in the yaml above), and this time it generated no error, but the ports were still not opened on the host network interface. – skwokie Aug 27 '19 at 18:25

3 Answers


I've accidentally resolved this, and I did it by bouncing the pod (deleting it and letting the Deployment recreate it) instead of using kubectl apply -f .... Soon after bouncing the pod, the new pod becomes a go. My theory is that Kubernetes brings the new pod up and gets it ready first before killing the old pod. Since the old pod still has the ports open, the new pod sees those ports as taken, and the error 0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports is triggered.
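
If this theory is right, one way to avoid the overlap without bouncing the pod by hand is to have the Deployment destroy the old pod before creating the new one. A minimal sketch (the Recreate strategy is standard Kubernetes; applying it to this Deployment is my assumption):

spec:
  strategy:
    # kill the old pod (releasing its host ports) before the new one is created
    type: Recreate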

skwokie
  • this problem can be solved by setting `maxSurge: 1` so that the deployment will destroy the old pod and create the new one instantly. see config example: `strategy: type: RollingUpdate rollingUpdate: maxSurge: 1 maxUnavailable: 1` – Yilmaz Guleryuz Dec 26 '19 at 18:43

I don't have the possibility to set this up on Docker for Mac, but it seems you should verify your ports in your Docker VM:

screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty

Please consider whether you can change the default node port range (--service-node-port-range, default: 30000-32767). Here you can find a [great post](https://stackoverflow.com/questions/57582980/how-to-change-the-default-nodeport-range-on-mac-docker-desktop/57588995#57588995) on how to do it in Docker for Mac.
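
For reference, the change described in that post boils down to editing one kube-apiserver flag. A minimal sketch, assuming a kubeadm-style static pod manifest inside the Docker Desktop VM (the exact path may differ):

# /etc/kubernetes/manifests/kube-apiserver.yaml (inside the VM)
spec:
  containers:
  - command:
    - kube-apiserver
    # widen the NodePort range so that low ports such as 443 are allowed
    - --service-node-port-range=443-32767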

Please remember that using hostNetwork: true is not a good solution according to best practices.

As per documentation:

Don’t specify a hostPort for a Pod unless it is absolutely necessary. When you bind a Pod to a hostPort, it limits the number of places the Pod can be scheduled, because each <hostIP, hostPort, protocol> combination must be unique. If you don’t specify the hostIP and protocol explicitly, Kubernetes will use 0.0.0.0 as the default hostIP and TCP as the default protocol. Avoid using hostNetwork, for the same reasons as hostPort.

If you explicitly need to expose a Pod’s port on the node, consider using a NodePort Service before resorting to hostPort.
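
For example, a NodePort Service for the Deployment in the question could look roughly like this (a sketch only; the selector labels and ports are taken from the question's spec, and the nodePort value is an assumption):

apiVersion: v1
kind: Service
metadata:
  name: pnnsvr
spec:
  type: NodePort
  selector:
    app: pnnsvr
    env: dev
  ports:
  - name: https
    port: 443
    targetPort: 443
    nodePort: 30443   # must lie within --service-node-port-range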

Configure a Security Context for a Pod or Container: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/

To specify security settings for a Container, include the securityContext field in the Container manifest. The securityContext field is a SecurityContext object. Security settings that you specify for a Container apply only to the individual Container, and they override settings made at the Pod level when there is overlap. Container settings do not affect the Pod’s Volumes.

Please note in addition for securityContext for POD:

The runAsGroup field specifies the primary group ID of 3000 for all processes within any containers of the Pod. If this field is omitted, the primary group ID of the containers will be root(0)
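
A short illustration, adapted from that documentation page:

apiVersion: v1
kind: Pod
metadata:
  name: security-context-demo
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000   # primary group ID for all container processes; root(0) if omitted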

Please let me know if it helped.

Mark
  • hi @hanx, thanks for your answer. Yes, I can't agree more that [docker for mac](https://stackoverflow.com/questions/57582980/how-to-change-the-default-nodeport-range-on-mac-docker-desktop/57588995#57588995) is a great post, and I've kept in mind to try to pay it forward. I've followed it to change the port range to 443 - 32767. In addition, thanks for your advice. Does it mean that it is a "double hostnetwork" design - the container ports will be exposed to the vm first and then mapped to the host? – skwokie Aug 30 '19 at 17:15
  • Basically, I am exploring my options here. I understand that hostnetwork is not a preferred way, and I've tried to use nodeport and loadbalancer, but I've run into the [262144 characters limitation](https://github.com/kubernetes/kubernetes/issues/81852). The server that I'm trying to provision opens 10000 ports for device connections via TCP and UDP, and it also needs to open a port for https; and that makes the declaration huge. – skwokie Aug 30 '19 at 17:16
  • please follow [Use host networking](https://docs.docker.com/network/host/) `The host networking driver only works on Linux hosts, and is not supported on Docker Desktop for Mac, Docker Desktop for Windows, or Docker EE for Windows Server.` – Mark Sep 09 '19 at 08:47

I had that problem message too. In my case it was the use of these port settings in the container definition:

 hostPort: 80
 hostIP: 127.0.0.1

I removed these definitions and it's working. In my scenario I was testing the HPA, and the replicas weren't Running because they didn't have free ports, like in your problem. Only one pod was running (the first one that started); the others (the replicas) were stuck in the Pending state.

My solution was to use a NodePort Service to expose the port on the host and to remove the hostPort and hostIP definitions.
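
Roughly, the replacement looks like this (a sketch; the myapp name and labels are placeholders for your own):

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  type: NodePort
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 80   # replaces the removed hostPort/hostIP pair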

Alex Ferreira