I have ICP V2.1 installed into a RHEL VMWare image. After rebooting the image, ICP fails to start in what appears to be the first known issue in the documentation (Kubernetes controller manager fails to start after a master or cluster restart). However, the prescribed resolution does not get my system going.
Here is the running pod list:
NAME READY STATUS RESTARTS AGE calico-node-amd64-dtl47 2/2 Running 14 20h filebeat-ds-amd64-mvcsj 1/1 Running 8 20h k8s-etcd-192.168.232.131 1/1 Running 7 20h k8s-mariadb-192.168.232.131 1/1 Running 7 20h k8s-master-192.168.232.131 2/3 CrashLoopBackOff 15 17m k8s-proxy-192.168.232.131 1/1 Running 7 20h metering-reader-amd64-gkwt4 1/1 Running 7 20h monitoring-prometheus-nodeexporter-amd64-sghrv 1/1 Running 7 20h
Removing the k8s-master-192.168.232.131 pod and allowing it to restart only puts it back into the CrashLoopBackOff state. Here is how the last line in controller manager log looks:
F1029 23:55:07.345341 1 controllermanager.go:176] error building controller context: failed to get supported resources from server: unable to retrieve the complete list of server APIs: servicecatalog.k8s.io/v1alpha1: an error on the server ("Error: 'dial tcp 10.0.0.145:443: getsockopt: connection refused'\nTrying to reach: 'https://10.0.0.145:443/apis/servicecatalog.k8s.io/v1alpha1'") has prevented the request from succeeding
Removing the pod or removing the failed controller master docker container directly has no effect. It seems like another service hasn't started yet, or failed to start. I've waited several hours to see if the issue resolves itself, but to no avail.
Thanks...