I am still learning Kubernetes clustering and have run into this problem:

I have a Kubernetes cluster with 3 nodes on openSUSE 15.4 and am attempting to install Rancher, but helm is failing to fetch and install cert-manager.

helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --version v1.7.1 --set startupapicheck.timeout=10m --set webhook.timeoutSeconds=30 --debug

install.go:194: [debug] Original chart version: "v1.7.1"
install.go:211: [debug] CHART PATH: /home/<cut>/.cache/helm/repository/cert-manager-v1.7.1.tgz

client.go:133: [debug] creating 1 resource(s)
client.go:133: [debug] creating 39 resource(s)
client.go:477: [debug] Starting delete for "cert-manager-startupapicheck" ServiceAccount
client.go:481: [debug] Ignoring delete failure for "cert-manager-startupapicheck" /v1, Kind=ServiceAccount: serviceaccounts "cert-manager-startupapicheck" not found
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "cert-manager-startupapicheck:create-cert" Role
client.go:481: [debug] Ignoring delete failure for "cert-manager-startupapicheck:create-cert" rbac.authorization.k8s.io/v1, Kind=Role: roles.rbac.authorization.k8s.io "cert-manager-startupapicheck:create-cert" not found
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "cert-manager-startupapicheck:create-cert" RoleBinding
client.go:481: [debug] Ignoring delete failure for "cert-manager-startupapicheck:create-cert" rbac.authorization.k8s.io/v1, Kind=RoleBinding: rolebindings.rbac.authorization.k8s.io "cert-manager-startupapicheck:create-cert" not found
client.go:133: [debug] creating 1 resource(s)
client.go:477: [debug] Starting delete for "cert-manager-startupapicheck" Job
client.go:481: [debug] Ignoring delete failure for "cert-manager-startupapicheck" batch/v1, Kind=Job: jobs.batch "cert-manager-startupapicheck" not found
client.go:133: [debug] creating 1 resource(s)
client.go:703: [debug] Watching for changes to Job cert-manager-startupapicheck with timeout of 5m0s
client.go:731: [debug] Add/Modify event for cert-manager-startupapicheck: ADDED
client.go:770: [debug] cert-manager-startupapicheck: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:731: [debug] Add/Modify event for cert-manager-startupapicheck: MODIFIED
client.go:770: [debug] cert-manager-startupapicheck: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: INSTALLATION FAILED: failed post-install: timed out waiting for the condition
helm.go:84: [debug] failed post-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
        /home/abuild/rpmbuild/BUILD/helm-3.11.1/cmd/helm/install.go:141
github.com/spf13/cobra.(*Command).execute
        /home/abuild/rpmbuild/BUILD/helm-3.11.1/vendor/github.com/spf13/cobra/command.go:916
github.com/spf13/cobra.(*Command).ExecuteC
        /home/abuild/rpmbuild/BUILD/helm-3.11.1/vendor/github.com/spf13/cobra/command.go:1044
github.com/spf13/cobra.(*Command).Execute
        /home/abuild/rpmbuild/BUILD/helm-3.11.1/vendor/github.com/spf13/cobra/command.go:968
main.main
        /home/abuild/rpmbuild/BUILD/helm-3.11.1/cmd/helm/helm.go:83
runtime.main
        /usr/lib64/go/1.18/src/runtime/proc.go:250
runtime.goexit
        /usr/lib64/go/1.18/src/runtime/asm_amd64.s:1571

Any help would be greatly appreciated.

I tried reinstalling the whole cluster, but no joy.

MB001
  • Please add to the body of the question the output of: 1. `kubectl -n cert-manager get pods`; 2. `kubectl -n cert-manager get events`. It would also help to find the Pod created by the Job *cert-manager-startupapicheck* and inspect it with `kubectl describe` and its logs. – glv Mar 29 '23 at 12:16
  • Thank you @glv , here you have it: `kubectl -n cert-manager get pods NAME READY STATUS RESTARTS AGE cert-manager-6d6bb4f487-46tcj 0/1 ContainerCreating 0 153m cert-manager-cainjector-7d55bf8f78-smkd8 0/1 ContainerCreating 0 153m cert-manager-startupapicheck-wbz7b 0/1 ContainerCreating 0 153m cert-manager-webhook-5c888754d5-l65bj 0/1 ContainerCreating 0 153m` – MB001 Mar 29 '23 at 14:00
  • `kubectl -n cert-manager get events Warning FailedCreatePodSandBox pod/cert-manager-6d6bb4f487-46tcj (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_cert-manager-6d6bb4f487-46tcj_cert-manager_7ead309e-99f8-43e8-be20-b9baa4c8337e_0(0b6e71726ca63781d3b04b452fc0a2fef8dde9060f313916da40c35b41d5e581): error adding pod cert-manager_cert-manager-6d6bb4f487-46tcj to CNI network "cbr0": loadFlannelSubnetEnv failed: open /run/flannel/subnet.env: no such file or directory` – MB001 Mar 29 '23 at 14:03
  • Have you installed the network plugin? – glv Mar 29 '23 at 14:25
  • `kubectl logs cert-manager-startupapicheck-wbz7b -n cert-manager Error from server (BadRequest): container "cert-manager" in pod "cert-manager-startupapicheck-wbz7b" is waiting to start: ContainerCreating` – MB001 Mar 29 '23 at 14:27
  • Yep, kube-flannel.. – MB001 Mar 29 '23 at 14:29

2 Answers

Based on the evidence found in the comments, it would appear that something went wrong while configuring the network plugin.

Is it possible that you installed Flannel before finishing the cluster setup, perhaps even before running kubeadm init?

I suggest doing a clean installation of the CNI and your applications.

Here is a problem very similar to yours: kubernetes installation and kube-dns: open /run/flannel/subnet.env: no such file or directory
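
The `loadFlannelSubnetEnv failed` event from the comments means flannel has not written `/run/flannel/subnet.env` on that node, which commonly happens when the flannel pod is not running or the node was never assigned a pod CIDR. A rough sketch of how one might check both; the helper names `check_flannel_env` and `inspect_cni` are made up for illustration, and the `app=flannel` label is an assumption based on the upstream flannel manifest:

```shell
#!/usr/bin/env bash

# Hypothetical helper: report whether flannel has written its subnet file
# on this node. Run it on each node; the path comes from the error message.
check_flannel_env() {
  local env_file="${1:-/run/flannel/subnet.env}"
  if [ -f "$env_file" ]; then
    echo "present: $env_file"
    return 0
  fi
  echo "missing: $env_file"
  return 1
}

# Hypothetical helper: cluster-side checks. Wrapped in a function so that
# nothing touches the cluster until it is called explicitly.
inspect_cni() {
  # Are the flannel pods running on every node? (label is an assumption)
  kubectl get pods --all-namespaces -l app=flannel -o wide
  # Does every node have a pod CIDR? An empty second column suggests that
  # kubeadm init was run without --pod-network-cidr.
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
}
```

Calling `check_flannel_env` on a node and `inspect_cni` from a machine with cluster access should narrow down whether the problem is the flannel DaemonSet itself or the cluster's pod network configuration.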

glv
  • Hey, can I use Calico/Cilium instead, or does it have to be Flannel with cert-manager? Which option is best for Rancher? – MB001 Mar 29 '23 at 16:30
  • There are a lot of docs on this topic. For example: https://docs.tigera.io/calico/latest/getting-started/kubernetes/rancher – glv Mar 29 '23 at 16:32
  • Hmm.. same effect with Calico: helm hangs for 5 min and then exits. – MB001 Mar 29 '23 at 17:32
  • But have you tried reinstalling the cluster from scratch? Maybe you missed some steps. – glv Mar 29 '23 at 20:14

Thanks again! You were right: I had to reset the whole cluster and the CNI, run kubeadm init with the proper --pod-network-cidr=10.244.0.0/16, then deploy flannel and cert-manager. No errors this time!
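
For anyone landing here later, the sequence was roughly the following. This is a sketch rather than my exact shell history: the `run` and `rebuild_cluster` names are made up, the flannel manifest URL is an assumption (check the flannel project for the current one), and by default it only prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.

POD_CIDR="10.244.0.0/16"   # the network flannel expects by default
# Assumption: upstream flannel manifest location; pin a release in practice.
FLANNEL_MANIFEST="https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rebuild_cluster() {
  # Tear down the old control plane. kubeadm reset does not clean up
  # /etc/cni/net.d, so stale CNI config may need removing by hand.
  run sudo kubeadm reset --force
  # Re-initialise with the pod CIDR that flannel expects.
  run sudo kubeadm init --pod-network-cidr="$POD_CIDR"
  # (Set up kubeconfig and re-join the worker nodes here.)
  run kubectl apply -f "$FLANNEL_MANIFEST"
  run kubectl wait --for=condition=Ready nodes --all --timeout=300s
  # Only install cert-manager once the nodes are Ready.
  run helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --version v1.7.1
}
```

Running `rebuild_cluster` with the default settings just lists the steps, which is a safe way to review them before setting `DRY_RUN=0` on a cluster you are prepared to wipe.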

MB001