
I'm trying to set up a very simple two-node k8s 1.13.3 cluster in a vSphere private cloud. The VMs are running Ubuntu 18.04. Firewalls are turned off for testing purposes, yet the initialization is failing due to a refused connection. Is there something else that could be causing this other than ports being blocked? I'm new to k8s and am trying to wrap my head around all of this.

I've placed a vsphere.conf in /etc/kubernetes/ as shown in this gist: https://gist.github.com/spstratis/0395073ac3ba6dc24349582b43894a77

I've also created a config file to point to when I run kubeadm init. Here's an example of its contents: https://gist.github.com/spstratis/086f08a1a4033138a0c42f80aef5ab40

When I run `sudo kubeadm init --config /etc/kubernetes/kubeadminitmaster.yaml`, it times out with the following error.

[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.

Checking `sudo systemctl status kubelet` shows me that the kubelet is running. I have the firewall on my master VM turned off for now for testing purposes, so that I can verify the cluster will bootstrap itself.

   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sat 2019-02-16 18:09:58 UTC; 24s ago
     Docs: https://kubernetes.io/docs/home/
 Main PID: 16471 (kubelet)
    Tasks: 18 (limit: 4704)
   CGroup: /system.slice/kubelet.service
           └─16471 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cloud-config=/etc/kubernetes/vsphere.conf --cloud-provider=vsphere --cgroup-driver=systemd --network-plugin=cni --pod-i

Here are some additional logs below showing that the connection to https://192.168.0.12:6443/ is refused. All of this seems to be causing the kubelet to fail, preventing the init process from finishing.

    Feb 16 18:10:22 k8s-master-1 kubelet[16471]: E0216 18:10:22.633721   16471 kubelet.go:2266] node "k8s-master-1" not found
    Feb 16 18:10:22 k8s-master-1 kubelet[16471]: E0216 18:10:22.668213   16471 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:453: Failed to list *v1.Node: Get https://192.168.0.12:6443/api/v1/nodes?fieldSelector=metadata.name%3Dk8s-master-1&limit=500&resourceVersion=0: dial tcp 192.168.0.1
    Feb 16 18:10:22 k8s-master-1 kubelet[16471]: E0216 18:10:22.669283   16471 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://192.168.0.12:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.0.12:6443: connect: connection refused
    Feb 16 18:10:22 k8s-master-1 kubelet[16471]: E0216 18:10:22.670479   16471 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.0.12:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-master-1&limit=500&resourceVersion=0: dial tcp 192.1
    Feb 16 18:10:22 k8s-master-1 kubelet[16471]: E0216 18:10:22.734005   16471 kubelet.go:2266] node "k8s-master-1" not found
Stavros_S

3 Answers


In order to address the error (`dial tcp 127.0.0.1:10248: connect: connection refused`), switch Docker to the `systemd` cgroup driver so that it matches the kubelet's `--cgroup-driver=systemd`, then reset and re-initialize:

sudo mkdir /etc/docker
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
sudo systemctl enable docker
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo kubeadm reset
sudo kubeadm init

Use the same commands if the same error occurs while configuring a worker node.
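A malformed daemon.json will stop the Docker daemon from starting at all, so it's worth validating the JSON before the restart. A minimal sketch, validating a copy under /tmp so nothing on the host is touched (python3 is assumed to be available, as it is on Ubuntu 18.04):

```shell
# Write the proposed config to a scratch location first.
cat <<'EOF' > /tmp/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": { "max-size": "100m" },
  "storage-driver": "overlay2"
}
EOF
# json.tool exits non-zero on invalid JSON, so this only prints on success.
python3 -m json.tool /tmp/daemon.json > /dev/null && echo "daemon.json OK"
```

Once it validates, copy it into place as /etc/docker/daemon.json and restart Docker as shown above.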

Jeremy Caney
Kirik
    Can you offer additional explanation of _what_ these commands do? That will help future readers understand how to solve _similar_ (but not _identical_) problems. It will also help readers feel more comfortable executing these on their system, if they first understand what they do. – Jeremy Caney Nov 02 '21 at 00:12
  • I think it is documented around here https://kubernetes.io/docs/setup/production-environment/container-runtimes/ – MTom Jan 30 '22 at 14:11
  • @Stavros_S Did you check if swap is disabled or not? I had the same issue and later realized swap was on and had to disable it to get it running. :) – Santosh Bitra Mar 24 '22 at 10:17

You cannot use bootstrap-kubeconfig to initialize the master's kubelet since -- as you are experiencing -- it has no api server to contact in order to generate its private key and certificate. Catch-22. I am about 80% certain that removing the --bootstrap-kubeconfig from the kubelet args will help that situation. I would expect that the kubelet already has its key and cert in /var/lib/kubelet/pki, so that might be worth checking, too.
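A quick way to do that check (a sketch; `PKI_DIR` is parameterized here, with the kubeadm default as the fallback):

```shell
# Confirm the kubelet already generated its key and certificate.
# /var/lib/kubelet/pki is the default location on a kubeadm-provisioned node.
PKI_DIR=${PKI_DIR:-/var/lib/kubelet/pki}
ls -l "$PKI_DIR"
```

On a healthy node you would expect to see files such as kubelet.key and kubelet.crt listed.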

Also, assuming you are using the /etc/kubernetes/manifests directory to run the apiserver and controllermanager, ensure that staticPodPath: in /var/lib/kubelet/config.yaml points to the right directory. It's unlikely to be the problem, but it's super cheap to check.
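That check is a one-liner (a sketch; `CONFIG` is parameterized here, with the kubeadm default path as the fallback):

```shell
# Verify the kubelet is pointed at the static pod manifest directory.
# /var/lib/kubelet/config.yaml is where kubeadm writes the kubelet config.
CONFIG=${CONFIG:-/var/lib/kubelet/config.yaml}
grep '^staticPodPath:' "$CONFIG"
```

On a kubeadm-provisioned control plane node this should print `staticPodPath: /etc/kubernetes/manifests`; no output means the kubelet was never told where to find the apiserver and controller-manager manifests.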

mdaniel
  • Thanks for your answer, I verified that kubelet's cert and key are in fact in the PKI folder. I'm honestly not sure what you mean when you say I can't use bootstrap-kubeconfig though. At this point all I have done is placed the config files I mentioned in their folders and ran the kubeadm init command on the master. Was there more I was supposed to do before that? I'm really new to k8s and was following a guide. – Stavros_S Feb 17 '19 at 02:13
  • to follow up, I also checked `/var/lib/kubelet/config.yaml` and noticed it didn't contain a `staticPodPath` property. – Stavros_S Feb 17 '19 at 02:38
  • _why I can't use bootstrap-kubeconfig though_ because that mechanism is designed for (plus or minus) bare metal deployments, such as using an autoscaling group in AWS (or its GCP equivalent) -- with just `kubelet` and `kubeadm` binaries, you can take an empty machine and have it join the cluster by contacting the API to get the cluster config, and then using the bootstrap token to auth long enough to request a certificate pair from the API. That's the "bootstrap-kubeconfig" part: auth using a token and then _generate_ the real `kubelet.conf` from it. But, no API server, no bootstrap process – mdaniel Feb 17 '19 at 17:43
  • So should I remove `--config /etc/kubernetes/kubeadminitmaster.yaml` from my init command? – Stavros_S Feb 17 '19 at 22:38
  • It seems that kubelet isn't able to start any of the containers that it needs. – Stavros_S Feb 18 '19 at 00:44
  • So I ran `sudo kubeadm reset && sudo kubeadm init –pod-network-cidr=10.244.0.0/16` and the initialization worked fine. Would I be able to now set up my cloud provider configuration? – Stavros_S Feb 18 '19 at 01:31
  • Sure, but your question is turning into something **waaay** different from "connection refused to :10248" so I would suggest opening a new question for your new problem, because there is no way a solution will fit in these comments – mdaniel Feb 18 '19 at 01:49
  • I guess looking back at your original answer, how would I actually remove --bootstrap-kubeconfig from the kubelet args? – Stavros_S Feb 18 '19 at 01:50
  • `vi /lib/systemd/system/kubelet.service` and/or `vi /etc/systemd/system/kubelet.service.d/10-kubeadm.conf` (I don't know which of those files contains the offending argument). Given your extreme lack of interest in debugging your own problem, are you sure you wouldn't be happier with a more plug-and-play solution like [kubespray](https://github.com/kubernetes-sigs/kubespray/tree/v2.8.3)? – mdaniel Feb 18 '19 at 02:04
  • I've been trying to debug the issues but what you were telling me was far over my head since I'm really new to linux, k8s, containers and vSphere. Really short time frame to get something up and running and have been trying to follow docs and guides but have obviously been struggling...Appreciate all the help you have tried to provide though :) – Stavros_S Feb 18 '19 at 02:14

Try adding the below to `/etc/docker/daemon.json` (e.g. with `vi`):

{ "exec-opts": ["native.cgroupdriver=systemd"] }

and then run `systemctl restart docker`.

Also, yes, daemon.json in /etc/docker is something new that you'd create; it doesn't exist there by default. :)

  • just a reminder again: Check if swap is disabled or not. I had the same issue and later realized swap was on and had to disable it to get it running. :) – Santosh Bitra Mar 24 '22 at 10:17
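As the comment above notes, the kubelet (with its default configuration) refuses to run while swap is enabled. A minimal sketch of disabling it, assuming the standard Ubuntu layout where swap is listed in /etc/fstab:

```shell
# Turn off swap immediately.
sudo swapoff -a
# Comment out any swap entries in /etc/fstab so swap stays off after reboot.
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab
```

After this, restart the kubelet (or re-run `kubeadm init`) and it should get past the swap precheck.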