
I have a Kubernetes 1.15.3 setup.

My kube-controller-manager and kube-scheduler are restarting very frequently. This started happening after the cluster was upgraded to 1.15.3.

```
kubectl get po -n kube-system
NAME                                   READY   STATUS    RESTARTS   AGE
coredns-5c98db65d4-nmt5d               1/1     Running   37         24d
coredns-5c98db65d4-tg4kx               1/1     Running   37         24d
etcd-ana01                             1/1     Running   1          24d
kube-apiserver-ana01                   1/1     Running   10         24d
kube-controller-manager-ana01          1/1     Running   477        9d
kube-flannel-ds-amd64-2srzb            1/1     Running   0          12d
kube-proxy-2hvcl                       1/1     Running   0          23d
kube-scheduler-ana01                   1/1     Running   518        9d
tiller-deploy-8557598fbc-kxntc         1/1     Running   0          11d
```

Note the restart counts on `kube-controller-manager-ana01` (477) and `kube-scheduler-ana01` (518).

Here are the events for the kube-scheduler pod:

```
Type     Reason     Age                   From            Message
----     ------     ----                  ----            -------
Warning  Unhealthy  39m (x500 over 23d)   kubelet, ana01  Liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
Warning  BackOff    39m (x1873 over 23d)  kubelet, ana01  Back-off restarting failed container
Normal   Pulled     28m (x519 over 24d)   kubelet, ana01  Container image "k8s.gcr.io/kube-scheduler:v1.15.3" already present on machine
Normal   Created    28m (x519 over 24d)   kubelet, ana01  Created container kube-scheduler
Normal   Started    27m (x519 over 24d)   kubelet, ana01  Started container kube-scheduler
```

The kube-scheduler container logs are:

```
I0928 09:10:23.554335       1 serving.go:319] Generated self-signed cert in-memory
W0928 09:10:25.002268       1 authentication.go:387] failed to read in-cluster kubeconfig for delegated authentication: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W0928 09:10:25.002523       1 authentication.go:249] No authentication-kubeconfig provided in order to lookup client-ca-file in configmap/extension-apiserver-authentication in kube-system, so client certificate authentication won't work.
W0928 09:10:25.002607       1 authentication.go:252] No authentication-kubeconfig provided in order to lookup requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
W0928 09:10:25.002947       1 authorization.go:177] failed to read in-cluster kubeconfig for delegated authorization: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W0928 09:10:25.003116       1 authorization.go:146] No authorization-kubeconfig provided, so SubjectAccessReview of authorization tokens won't work.
I0928 09:10:25.021201       1 server.go:142] Version: v1.15.3
```
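The delegated-auth warnings in the log are generally benign on a kubeadm cluster: the scheduler authenticates with its own kubeconfig file rather than an in-cluster service account token, so the missing token path does not by itself explain the restarts. As a point of comparison, here is a sketch of the relevant part of the kubeadm-generated static pod manifest (flags and paths are kubeadm v1.15 defaults, not taken from this cluster; compare against your own `/etc/kubernetes/manifests/kube-scheduler.yaml`):

```yaml
# Sketch of /etc/kubernetes/manifests/kube-scheduler.yaml (kubeadm v1.15 defaults)
apiVersion: v1
kind: Pod
metadata:
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - name: kube-scheduler
    image: k8s.gcr.io/kube-scheduler:v1.15.3
    command:
    - kube-scheduler
    - --bind-address=127.0.0.1                     # /healthz served on 127.0.0.1:10251 (HTTP) in 1.15
    - --kubeconfig=/etc/kubernetes/scheduler.conf  # scheduler's own API credentials
    - --leader-elect=true
```

If your manifest diverges from these defaults (for example a changed `--bind-address` or a missing `--kubeconfig`), that would be worth investigating before anything else.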
– wineinlib
  • Could you please provide configuration files, `kube-controller-manager.yaml` and `kube-scheduler.yaml`, you can find them in `/etc/kubernetes/manifests`. Additionally provide output from `kubectl edit cm kubeadm-config -n kube-system`. – aga Oct 01 '19 at 10:37
  • 1
    @abielak Please find all the three files https://drive.google.com/drive/folders/1WRC1TU4DXBgsQVbkDEHoNb536pEjLd0f?usp=sharing – Vishnu Gopal Singhal Oct 02 '19 at 06:12
  • I checked your conf file and it looks good. Anyway, have you followed the steps from this documentation: https://v1-15.docs.kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade-1-15/ ? – aga Oct 07 '19 at 07:37
  • Yes, I have followed them. – Vishnu Gopal Singhal Oct 08 '19 at 09:22
  • Please provide logs from your coredns pods — what is the reason for their restarts? And have you tried swapping your CNI for Calico? – aga Oct 17 '19 at 06:10
  • I haven't tried changing the CNI to Calico. `coredns-5c98db65d4-jcxd5 1/1 Running 2 12d` `coredns-5c98db65d4-kql2k 1/1 Running 2 12d` `coredns-5c98db65d4-nmt5d 0/1 Evicted 0 43d` `coredns-5c98db65d4-tg4kx 0/1 Evicted 0 43d` `etcd-ana01 1/1 Running 1 43d` `kube-apiserver-ana01 1/1 Running 10 43d` – Vishnu Gopal Singhal Oct 17 '19 at 12:08
  • `kubectl logs coredns-5c98db65d4-kql2k -n kube-system` `.:53` `2019-10-08T09:25:34.610Z [INFO] CoreDNS-1.3.1` `2019-10-08T09:25:34.610Z [INFO] linux/amd64, go1.11.4, 6b56a9c` `CoreDNS-1.3.1 linux/amd64, go1.11.4, 6b56a9c` `2019-10-08T09:25:34.610Z [INFO] plugin/reload: Running configuration MD5 = 5d5369fbc12f985709b924e721217843` – Vishnu Gopal Singhal Oct 17 '19 at 12:09
  • I am seeing this same behavior on a 1.20 k8s cluster with calico networking. I believe it has to do with the liveness probe set to 10251 and 10252 (http) rather than the secure ports (10257 and 10259) that are used in newer k8s versions by default. Was there any fix or workaround found for this issue? – mmiara Feb 17 '21 at 16:29
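As the last comment points out, the liveness probe target must match the port the scheduler and controller-manager actually serve on: v1.15 kubeadm manifests probe the insecure HTTP ports (10251/10252), while newer releases serve `/healthz` only on the secure ports (10259/10257). Below is a hedged sketch of both probe variants for the scheduler; the values are kubeadm defaults for the respective versions, not taken from this cluster, so verify them against your own manifest:

```yaml
# kubeadm v1.15: probe the insecure HTTP port
livenessProbe:
  failureThreshold: 8
  httpGet:
    host: 127.0.0.1
    path: /healthz
    port: 10251
    scheme: HTTP
  initialDelaySeconds: 15
  timeoutSeconds: 15

# Newer clusters (the secure port mentioned in the last comment):
# livenessProbe:
#   httpGet:
#     host: 127.0.0.1
#     path: /healthz
#     port: 10259
#     scheme: HTTPS
```

Note that on this 1.15 cluster the events show `connection refused` on 10251, which more likely means the scheduler process exited before binding the port — i.e., the probe failure is a symptom of the crash rather than a misconfigured probe port.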

0 Answers