
I am thinking about partitioning my Kubernetes cluster into zones of dedicated nodes for exclusive use by dedicated sets of users, as discussed here. I am wondering how tainting nodes would affect DaemonSets, including those that are vital to cluster operation (e.g. kube-proxy, kube-flannel-ds-amd64).

The documentation says daemon pods respect taints and tolerations. But if so, how can the system schedule e.g. kube-proxy pods on nodes tainted with kubectl taint nodes node-x zone=zone-y:NoSchedule when the pod (which is not under my control but owned by Kubernetes' own DaemonSet kube-proxy) does not carry a corresponding toleration?
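(For comparison, if I wanted my own DaemonSet's pods to tolerate that taint, I assume I would have to add something like the following to the pod template, with zone and zone-y taken from the taint above:)

tolerations:
- key: zone
  operator: Equal
  value: zone-y
  effect: NoSchedule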

What I have found empirically so far is that Kubernetes 1.14 reschedules a kube-proxy pod regardless (after I have deleted it on the tainted node-x), which seems to contradict the documentation. On the other hand, this does not seem to be the case for my own DaemonSet: when I kill its pod on node-x, it only gets rescheduled after I remove the node's taint (or, presumably, after I add a toleration to the pod's spec inside the DaemonSet).
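(Concretely, the steps I ran were roughly the following, where kube-proxy-abc12 stands for whichever kube-proxy pod happens to be running on node-x:)

kubectl taint nodes node-x zone=zone-y:NoSchedule
kubectl -n kube-system delete pod kube-proxy-abc12
# the pod is recreated and lands on node-x again despite the taint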

So how do DaemonSets and tolerations interoperate in detail? Could it be that certain DaemonSets (such as kube-proxy or kube-flannel-ds-amd64) are treated specially?

– rookie099

2 Answers


Your kube-proxy and flannel daemonsets have a number of tolerations defined in their manifests, which is why their pods get scheduled even on tainted nodes.

Here are a couple from my canal daemonset:

tolerations:
  # no key with operator Exists: tolerates every NoSchedule taint, whatever its key
  - effect: NoSchedule
    operator: Exists
  # tolerates any taint with key CriticalAddonsOnly, whatever its value or effect
  - key: CriticalAddonsOnly
    operator: Exists
  # no key with operator Exists: tolerates every NoExecute taint, so pods are not evicted
  - effect: NoExecute
    operator: Exists

Here are the taints from one of my master nodes:

taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/controlplane
    value: "true"
  - effect: NoExecute
    key: node-role.kubernetes.io/etcd
    value: "true"

Even though most workloads won't be scheduled on the master because of its NoSchedule and NoExecute taints, a canal pod will run there because the daemonset explicitly tolerates taints with those effects.

The doc you already linked to goes into detail.
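(If you want to see which tolerations your own kube-proxy pods actually carry, you can dump them from the daemonset; the namespace and name below are the usual defaults, so adjust them if your setup differs:)

kubectl -n kube-system get daemonset kube-proxy -o jsonpath='{.spec.template.spec.tolerations}'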

– switchboard.op

I had the same issue. I needed my daemonset to run its pods on every node (a critical pod definition). I had this tolerations definition in the daemonset:

spec:
  template:
    spec:
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists

And its pods were only running on the one node that had no taints defined...

I checked my kube-proxy daemonset, which differed by just one line:

spec:
  template:
    spec:
      tolerations:
      - key: CriticalAddonsOnly
        operator: Exists
      # no key and no effect: this tolerates every taint
      - operator: Exists

So I added this "- operator: Exists" line to my daemonset definition (at the time I did not really understand what it does or how; see the comment below) and now it works fine. My daemonset starts pods on every node of my cluster...
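(You can verify the result by listing the daemonset's pods together with the nodes they run on; my-daemonset is a placeholder for your daemonset's name:)

kubectl get pods -o wide --all-namespaces | grep my-daemonset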

– trollnba
    "An empty key with operator Exists matches all keys, values and effects which means this will tolerate everything." from https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/ – fredrik Dec 02 '20 at 08:43