10

I'm trying to set up a K3s cluster. When I had a single master and agent setup, cert-manager had no issues. Now I'm trying a 2 master setup with embedded etcd. I opened TCP ports 6443 and 2379-2380 for both VMs and did the following:

VM1: curl -sfL https://get.k3s.io | sh -s server --token TOKEN --cluster-init
VM2: curl -sfL https://get.k3s.io | sh -s server --token TOKEN --server https://MASTER_IP:6443
# k3s kubectl get nodes
NAME  STATUS   ROLES                       AGE    VERSION
VM1   Ready    control-plane,etcd,master   130m   v1.22.7+k3s1
VM2   Ready    control-plane,etcd,master   128m   v1.22.7+k3s1

Installing cert-manager works fine:

# k3s kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.yaml
# k3s kubectl get pods --namespace cert-manager
NAME                                       READY   STATUS
cert-manager-b4d6fd99b-c6fpc               1/1     Running
cert-manager-cainjector-74bfccdfdf-gtmrd   1/1     Running
cert-manager-webhook-65b766b5f8-brb76      1/1     Running

My manifest has the following definition:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: info@example.org
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
    - selector: {}
      http01:
        ingress: {}

Which results in the following error:

# k3s kubectl apply -f manifest.yaml
Error from server (InternalError): error when creating "manifest.yaml": Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": context deadline exceeded

I tried disabling both firewalls, waiting a day, and resetting and setting everything up again, but the error persists. Google hasn't been much help either. The little info I can find goes over my head for the most part, and no tutorial seems to do any extra steps.

Steffen
  • Did you ever find an answer to this? – Koman Jun 09 '22 at 12:52
  • @Koman Unfortunately no. I switched to managed certificates for now and am no longer debugging it actively. I still like to know what I did wrong since I'm going to need it eventually. – Steffen Jun 10 '22 at 16:08
  • From what you've told us, you have a 2 nodes cluster, with no SDN. Check https://projectcalico.docs.tigera.io/getting-started/kubernetes/k3s/multi-node-install . Also: while I'm not familiar with k3s, having 2 control plane nodes sounds weird, are they both hosting etcd? If so: you should go with 3 nodes. A 2 nodes etcd cluster would not tolerate any failure. – SYN Jul 31 '22 at 11:12

5 Answers

1

A good starting point for troubleshooting issues with the webhook can be found in the docs; for example, there is a section on problems with GKE private clusters.

In my case, however, this didn't really solve the problem. For me, the issue was that while playing around with cert-manager I had installed and uninstalled it multiple times. It turned out that just removing the namespace (e.g. kubectl delete namespace cert-manager) didn't remove the webhooks and other non-obvious resources.

Following the official guide for uninstalling cert-manager and applying the manifests again solved the issue.
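To see whether an earlier install left anything behind, something like the following can help. This is only a sketch: the function name list_cert_manager_leftovers is made up for illustration, and it assumes the leftovers have "cert-manager" somewhere in their names, which is the case for the static manifest as far as I know. It only looks at cluster-scoped resources, since those are the ones that survive a namespace deletion.

```shell
# Hypothetical helper: list cluster-scoped resources whose names mention
# cert-manager. These are not removed by deleting the cert-manager namespace.
list_cert_manager_leftovers() {
  kubectl get \
    mutatingwebhookconfigurations,validatingwebhookconfigurations,customresourcedefinitions,clusterroles,clusterrolebindings \
    -o name | grep cert-manager
}
```

Run list_cert_manager_leftovers after uninstalling. If it prints nothing, the cluster-scoped pieces are gone; anything it does print can be inspected and, if it really is stale, removed with kubectl delete followed by the printed name.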

bentocin
0

Try to specify the proper ingress class name in your Cluster Issuer, like this:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: info@example.org
    privateKeySecretRef:
      name: letsencrypt-account-key
    solvers:
    - http01:
        ingress:
          class: nginx

Also, make sure that you have the cert manager annotation and the tls secret name specified in your Ingress like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt
...
spec:
  tls:
    - hosts:
      - domain.com
      secretName: letsencrypt-account-key
Yurii Kuzemko
0

There are 2 ways to install it, and each has to be deleted/uninstalled the same way it was installed.

If you installed using this link: kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/vX.Y.Z/cert-manager.yaml

Delete using this: kubectl delete -f https://github.com/cert-manager/cert-manager/releases/download/vX.Y.Z/cert-manager.yaml

If you installed using this link: kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/vX.Y.Z/cert-manager.yaml

Delete using this: kubectl delete -f https://github.com/jetstack/cert-manager/releases/download/vX.Y.Z/cert-manager.yaml

If the namespace gets stuck in a terminating state, run: kubectl delete apiservice v1beta1.webhook.cert-manager.io

source: https://cert-manager.io/v1.2-docs/installation/uninstall/kubernetes/

Personally, I go as far as checking the node tied to the certificate and recycling or deleting it, since I have set the pool to autoscale.

Then follow the instructions on this page to help you complete a successful installation: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nginx-ingress-with-cert-manager-on-digitalocean-kubernetes

0

I had the same problem while installing Eclipse Che on minikube. I just reran the command and this time the certs worked!

Worth trying once.

ArunKolhapur
-3

I did this, and it worked for me.

helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.8.0 \
  --set webhook.securePort=10260

source: https://hackmd.io/@maelvls/debug-cert-manager-webhook
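To check that the port override actually took effect, you can read the arguments of the webhook container. A rough sketch, assuming the release is named cert-manager as in the command above (so the deployment is called cert-manager-webhook) and that the chart passes the port as a --secure-port argument; the function name webhook_secure_port is just for illustration.

```shell
# Hypothetical helper: print the secure-port argument of the webhook container.
# Assumes the deployment is named cert-manager-webhook in the cert-manager
# namespace and that the webhook is the first container in the pod.
webhook_secure_port() {
  kubectl get deployment cert-manager-webhook --namespace cert-manager \
    -o jsonpath='{.spec.template.spec.containers[0].args}' |
    grep -o 'secure-port=[0-9]*'
}
```

Calling webhook_secure_port should print the value you passed to --set webhook.securePort. If it prints nothing, the assumptions above probably don't match your install.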

Bob