
I have multiple pods on Kubernetes (v1.23.5) that are not being evicted and rescheduled in case of node failure.

According to the Kubernetes documentation, this process should begin after 300s:

Kubernetes automatically adds a Toleration for node.kubernetes.io/not-ready and node.kubernetes.io/unreachable with tolerationSeconds=300 unless you, or a controller, set those tolerations explicitly. These automatically-added tolerations mean that Pods remain bound to Nodes for 5 minutes after detecting one of these problems.
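For reference, these same tolerations can also be set explicitly in the Pod spec to shorten the eviction delay. A minimal sketch, assuming a 30-second timeout is acceptable for the workload (the value is only an example, not a recommendation):

tolerations:
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 30
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 30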

Unfortunately, the pods get stuck in Terminating status and are never evicted. However, in one test with a pod that had no PVC attached, the pod was evicted and started running on another node.

  • I'm trying to understand how I can make the other pods evict after the default 300s.
  • I don't know why it doesn't happen automatically, and why I must manually delete the pod stuck in the Terminating state to make it work properly.
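For context, the manual workaround is to force-delete the stuck pod so its controller can reschedule it; the pod name and namespace below are placeholders:

kubectl delete pod <pod-name> -n <namespace> --grace-period=0 --force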

Update

I have seen the kvaps/kube-fencing project, which provides a fencing procedure that runs in case of a node failure. I couldn't make it solve my problem. I don't know whether that is due to my limited understanding of the project, or whether it is only meant to handle the failed node itself and not to evict the pods stuck in the Terminating state.

1 Answer


There are two ways to handle this problem.

The first is to use kvaps/kube-fencing. You configure a PodTemplate that determines whether a node is deleted from the cluster or flushed when it becomes NotReady. Note that if the pod has volumes attached, the replacement pods may remain stuck in the ContainerCreating state.

These are the corresponding annotations in the PodTemplate. Set one mode or the other (YAML map keys must be unique, so only one of the two can appear):

annotations:
  fencing/mode: 'delete'

or:

annotations:
  fencing/mode: 'flush'

The second way is to use Kubernetes Non-Graceful Node Shutdown. This feature is not available in Kubernetes v1.23.5 (it was introduced as an alpha feature in v1.24, behind the NodeOutOfServiceVolumeDetach feature gate), so you would have to upgrade.
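With that feature enabled, once you have confirmed the node is really down (and will not come back with the workload still running), you apply the out-of-service taint manually; the node name below is a placeholder:

kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute

The taint causes the pods on the dead node to be force-deleted and their volume attachments detached, so stateful pods can be rescheduled elsewhere.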