
In a GKE cluster with version 1.16.6-gke.12 from the rapid channel, the kubedns container of the kube-dns-... pods of the kube-dns service fails permanently due to

kubedns     15 Mar 2020, 21:43:54   F0315 20:43:54.029575 1 server.go:61] Failed to create a kubernetes client: open /var/run/secrets/kubernetes.io/serviceaccount/token: permission denied
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029539 1 dns.go:48] version: 1.15.8
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029524 1 flags.go:52] FLAG: --vmodule=""
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029517 1 flags.go:52] FLAG: --version="false"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029512 1 flags.go:52] FLAG: --v="2"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029506 1 flags.go:52] FLAG: --stderrthreshold="2"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029500 1 flags.go:52] FLAG: --profiling="false"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029495 1 flags.go:52] FLAG: --nameservers=""
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029490 1 flags.go:52] FLAG: --logtostderr="true"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029485 1 flags.go:52] FLAG: --log-flush-frequency="5s"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029479 1 flags.go:52] FLAG: --log-dir=""
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029466 1 flags.go:52] FLAG: --log-backtrace-at=":0"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029460 1 flags.go:52] FLAG: --kubecfg-file=""
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029439 1 flags.go:52] FLAG: --kube-master-url=""
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029432 1 flags.go:52] FLAG: --initial-sync-timeout="1m0s"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029426 1 flags.go:52] FLAG: --healthz-port="8081"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029419 1 flags.go:52] FLAG: --federations=""
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029412 1 flags.go:52] FLAG: --domain="cluster.local."
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029404 1 flags.go:52] FLAG: --dns-port="10053"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029398 1 flags.go:52] FLAG: --dns-bind-address="0.0.0.0"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029390 1 flags.go:52] FLAG: --config-period="10s"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029382 1 flags.go:52] FLAG: --config-map-namespace="kube-system"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029373 1 flags.go:52] FLAG: --config-map=""
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029363 1 flags.go:52] FLAG: --config-dir="/kube-dns-config"
kubedns     15 Mar 2020, 21:43:54   I0315 20:43:54.029288 1 flags.go:52] FLAG: --alsologtostderr="false"

Is there a workaround for this? Where should I report this?
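
One way to confirm that this is a file-permission problem on the mounted token (rather than, say, a missing service account) is to inspect the mount from a container in the same pod that is still running, and to check whether the kubedns container is forced to run as a non-root user. A diagnostic sketch, assuming the standard GKE label k8s-app=kube-dns and container name sidecar (adjust if your pod's containers differ, and note the image must ship ls):

$ POD=$(kubectl -n kube-system get pods -l k8s-app=kube-dns -o jsonpath='{.items[0].metadata.name}')
$ kubectl -n kube-system exec $POD -c sidecar -- ls -l /var/run/secrets/kubernetes.io/serviceaccount/
$ kubectl -n kube-system get deployment kube-dns -o yaml | grep -B2 -A4 securityContext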

Version information:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:14:22Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.6-gke.12", GitCommit:"74e2d6182ba7947983ec6d59776c38c53b086a37", GitTreeState:"clean", BuildDate:"2020-02-27T18:38:03Z", GoVersion:"go1.13.4b4", Compiler:"gc", Platform:"linux/amd64"}
Kalle Richter
  • Did you manage to find any workaround? – Malgorzata Mar 30 '20 at 11:06
  • @MaggieO No. The last thing I did was to try the command for cert-manager in your answer, with the result in https://stackoverflow.com/questions/60697549/kube-dns-container-crashes-due-to-failed-to-create-a-kubernetes-client-open-v?noredirect=1#comment107478139_60705890. No success. – Kalle Richter Mar 30 '20 at 11:09
  • I was facing this on kubernetes version 1.21. This worked out for me - https://github.com/aws/amazon-eks-pod-identity-webhook/issues/8#issuecomment-531343852 – Shrey Dec 01 '21 at 13:15
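
For reference, the workaround linked in the comment above amounts to making the projected token readable by the non-root container user, for example by setting a pod-level fsGroup (65534, i.e. nobody, is a common choice) so the token file becomes group-readable. A hedged sketch; on GKE the managed kube-dns deployment may be reconciled back by the addon manager, so the change might not stick:

$ kubectl -n kube-system patch deployment kube-dns --type=merge \
    -p '{"spec":{"template":{"spec":{"securityContext":{"fsGroup":65534}}}}}'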

1 Answer


New GKE clusters currently use Kubernetes 1.14 by default, and GKE offers Kubernetes 1.17 in preview, which requires requesting access from Google Cloud. Likewise, once there is a GKE release based on Kubernetes 1.18, which fixes the service account problem (kubernetes.io/docs/setup/release/notes: "Fixes service account token admission error in clusters that do not run the service account token controller"), that GKE version will also solve your problem.

See: kubernetes-1.18, new-kubernetes-release.
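
To see when a 1.18-based version reaches your release channel, and to move the control plane to it once it does, something like the following gcloud commands can be used (a sketch; cluster name, zone, and the exact 1.18 patch version are placeholders):

$ gcloud container get-server-config --zone europe-west1-b --format="yaml(channels)"
$ gcloud container clusters upgrade my-cluster --zone europe-west1-b --master --cluster-version 1.18.X-gke.Y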

Malgorzata
  • Thanks for your input. I'm experiencing this issue on GKE, where a manipulation of `manifests` isn't possible afaik. I tried both `true` and `false` for `automountServiceAccountToken` (on the `spec.template.spec` of the `kube-dns` deployment), but the newly created pods fail due to the same issue (the `kubedns` container crashes with the same error). – Kalle Richter Mar 17 '20 at 18:25
  • I have updated my answer, please take a look at point 3. Does it help? – Malgorzata Mar 18 '20 at 09:26
  • Thanks for your ideas. I created the clusterrolebinding successfully for the service account I'm using and another one for the kube-system:kube-dns service account as well. The issue still occurs. I also deleted the kube-dns deployment, which is then recreated by GKE: same result. Regarding the command order, the hint is referring to the cert-manager installation command: it doesn't seem to matter, because as soon as the cluster accepts `kubectl` commands the `kube-dns` deployment has already started. (These attempts are sketched after the comments.) – Kalle Richter Mar 18 '20 at 20:17
  • You will have to wait for the GKE release for v1.18. Take a look: https://kubernetes.io/docs/setup/release/notes/ - "Fixes service account token admission error in clusters that do not run the service account token controller (#87029, @liggitt)". Do you need any more information? – Malgorzata May 05 '20 at 14:38
  • Thanks. If you reference the issue and the k8s version in the question with an explanation of why it solves the problem, I can mark it solved. – Kalle Richter May 05 '20 at 14:48
  • Since this seems to be a bug which has a fix, it'd be nice if it were backported to 1.17. Do you have a reference for the issue? Should I report this against Kubernetes or Google Kubernetes Engine? – Kalle Richter May 21 '20 at 14:57
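
For completeness, the two attempts described in the comments above correspond roughly to the following commands (a sketch; the binding name is illustrative, and GKE's addon manager may revert or recreate the managed kube-dns deployment):

# Toggle token automounting on the kube-dns pod template
$ kubectl -n kube-system patch deployment kube-dns --type=merge \
    -p '{"spec":{"template":{"spec":{"automountServiceAccountToken":true}}}}'

# Bind the kube-dns service account to a built-in cluster role
$ kubectl create clusterrolebinding kube-dns-view --clusterrole=view --serviceaccount=kube-system:kube-dns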