
I had this happen to me a couple of times:

I change something in /etc/kubernetes/manifests/kube-apiserver.yaml and check for the API server process. I see that the Docker container exited with error code 1. I check the container's logs and all they contain is a single line:

Shutting down, got signal: Terminated

I don’t know where to begin troubleshooting, since that line gives no hint of the cause. In a lab environment I just recreate the cluster, but I’m afraid this might happen in a production environment.

How can I troubleshoot a kube-apiserver that fails to start like this (with no exit reason besides the code), given that it is deployed with kubeadm and therefore runs in a container?

user3063045
  • Restarting the kubelet service seems to have resolved the issue. The container was stuck in a bad state and there were no restarts until after I restarted the kubelet service. – user3063045 Jan 25 '22 at 14:19
  • Yes, restarting `kubelet` usually solves a lot of issues with stuck pods or anything related to scheduling pods. See [what kubelet is](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/). Does it work normally now, or has anything changed again? – moonkotte Jan 26 '22 at 10:00
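
For the situation described in the question, where the container log only says it was terminated, the kubelet's own journal is often the more informative place to look, because the kubelet is what starts and stops static pods. The following is a minimal sketch, assuming a kubeadm node managed by systemd and run as root; `diagnose_apiserver` is a hypothetical helper name, and the 10-minute window and 50-line limit are arbitrary choices.

```shell
#!/bin/sh
# Sketch only: gather clues about a kube-apiserver static pod that will not start.
# Assumes systemd and either crictl or docker on the node. Read-only commands.
diagnose_apiserver() {
    # The kubelet manages static pods, so problems with the manifest
    # tend to be reported in its journal rather than in the container log.
    journalctl -u kubelet --since "10 minutes ago" --no-pager | tail -n 50

    # List running and exited API server containers with whichever CLI exists.
    if command -v crictl >/dev/null 2>&1; then
        crictl ps -a --name kube-apiserver
    elif command -v docker >/dev/null 2>&1; then
        docker ps -a --filter name=kube-apiserver
    else
        echo "neither crictl nor docker found" >&2
        return 1
    fi
}

# Usage (not executed here):
#   diagnose_apiserver
#   crictl logs <container-id>    # or: docker logs <container-id>
#   systemctl restart kubelet     # the step that resolved it in the comment above
```

Keeping a backup copy of kube-apiserver.yaml outside /etc/kubernetes/manifests before editing also makes it easy to revert if the journal does not reveal the cause.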

1 Answer


I had a similar issue: pods in my kube-system namespace were randomly being terminated, which prevented the cluster from starting.

I found the solution here: https://discuss.kubernetes.io/t/why-does-etcd-fail-with-debian-bullseye-kernel/19696

TL;DR: It was a cgroup driver/version problem. The kubelet had a hard time figuring out what was running versus what should be running, and was killing my legitimate pods. The following config fixed it:

# Content of file /etc/containerd/config.toml
version = 2
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    [plugins."io.containerd.grpc.v1.cri".containerd]
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
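
To check whether a node is affected by this kind of mismatch, one can compare the cgroup driver that containerd is configured with against the one in the kubelet's configuration. This is a rough sketch, not an official tool: `check_cgroup_driver` is a hypothetical helper, the default paths are the usual kubeadm locations and are assumptions, and the parsing is a simple text match rather than a real TOML/YAML parser.

```shell
#!/bin/sh
# Sketch only: report the cgroup driver configured for containerd and the kubelet.
# Arguments (both optional, defaults are assumed kubeadm paths):
#   $1  containerd config file
#   $2  kubelet config file
check_cgroup_driver() {
    containerd_cfg="${1:-/etc/containerd/config.toml}"
    kubelet_cfg="${2:-/var/lib/kubelet/config.yaml}"

    # containerd's runc runtime uses the systemd driver only when SystemdCgroup = true.
    if grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$containerd_cfg" 2>/dev/null; then
        containerd_driver="systemd"
    else
        containerd_driver="cgroupfs"
    fi

    # Read the top-level cgroupDriver key; report "unset" if it is missing.
    kubelet_driver=$(sed -n 's/^cgroupDriver:[[:space:]]*//p' "$kubelet_cfg" 2>/dev/null | tr -d '"')
    kubelet_driver="${kubelet_driver:-unset}"

    echo "containerd=$containerd_driver kubelet=$kubelet_driver"

    # Succeed only when both sides are known and agree.
    [ "$containerd_driver" = "$kubelet_driver" ]
}

# Usage (not executed here):
#   check_cgroup_driver || echo "drivers differ or could not be determined"
```

After editing /etc/containerd/config.toml, containerd has to be restarted (for example with `systemctl restart containerd`) before the new setting takes effect.
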
JulienCC