
I want to scale my deployment depending on the number of incoming requests. Each pod can only handle one request at a time. Scaling up is no problem, but when I scale down I want to make sure I am not killing a pod that is currently working (e.g. encoding a large file).

I have the following pods:

  • Pod 1 (created 10 min ago, has a task)
  • Pod 2 (created 5 min ago, is free)
  • Pod 3 (created 1 min ago, has a task)

If I reduce the replica count, Kubernetes will kill pod 3. It does not care whether the pod is busy or not. I could manually kill pod 2, so Kubernetes would start a new one:

  • Pod 1 (created 10 min ago, has a task)
  • Pod 3 (created 1 min ago, has a task)
  • Pod 4 (created just now, is free)

After I know pod 2 got killed, I could reduce the replica count, so pod 4 would be killed before it gets a task. But this solution feels very ugly, because something outside the pod has to tell pod 2 to shut down.
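
For illustration, the manual workaround would look roughly like this (assuming a hypothetical Deployment named `encoder` with 3 replicas; the pod names are made up):

```
# Pod 2 is idle, so delete it explicitly. Because the Deployment still
# wants 3 replicas, Kubernetes immediately starts a replacement (pod 4).
kubectl delete pod encoder-pod-2

# Before pod 4 picks up a task, lower the replica count so that the
# newest pod (pod 4) is the one that gets removed.
kubectl scale deployment encoder --replicas=2
```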

So Kubernetes kills the most recently created pods, but is it possible to tell it that a pod is busy and that it has to wait until the pod is done before killing it?

cre8
  • When I shut down a pod, e.g. pod 2 out of 4, Kubernetes will start another one because of the replica count. And when I lower the value by one, Kubernetes will kill pod 4, not pod 2. – cre8 Aug 11 '17 at 15:52
  • @GManNickG I updated my question with a more detailed example – cre8 Aug 11 '17 at 15:59
  • Looks like https://github.com/kubernetes/kubernetes/issues/4301 is relevant to your concern. `Less` is the function that orders Pods for removal: https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/controller_utils.go#L689. Sounds like there was discussion of having a more proactive approach to this, where Pods can specify how costly it is to terminate them. The workaround until then is to force the pod you want removed to be ordered for removal first. To also do this while ensuring it receives no traffic probably requires some distributed semaphore or something. – GManNickG Aug 11 '17 at 16:08
  • Thank you for the detailed research @GManNickG. I will work on a workaround to deal with the situation until Kubernetes offers a better solution. – cre8 Aug 11 '17 at 16:16
  • Yup. Keep in mind pod termination is still subject to the usual rules, so the cheapest workaround would be to set the termination grace period to a long enough time for your requests to complete (see the sketch after these comments). This has other effects on the system, though. (E.g., a pod isn't handling a request but takes forever to shut down anyway.) – GManNickG Aug 11 '17 at 16:19
  • So each pod is counting down, like a timeout. After the value is reached it sends a request to a "master" with "pls kill me, i am waiting so long". The master will kill it and reduce the number of replicas. – cre8 Aug 11 '17 at 16:39
  • Not quite: see https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods. Once a pod is marked deleted, *that's* when its graceful timeout starts. When that timeout ends, it is forcefully killed. The pod actually has no say in the matter. – GManNickG Aug 11 '17 at 17:17
  • Check [this answer](https://stackoverflow.com/a/63203913/2147383) for a possible solution. – Juniper Aug 01 '20 at 11:12
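
A minimal sketch of the grace-period workaround discussed in the comments above, again assuming the hypothetical `encoder` Deployment: a deleted pod first receives SIGTERM and then has `terminationGracePeriodSeconds` to finish its current task before it is force-killed.

```
# Allow up to one hour between SIGTERM and SIGKILL, so a busy pod can
# finish e.g. encoding a large file before it is forcefully terminated.
kubectl patch deployment encoder \
  --patch '{"spec": {"template": {"spec": {"terminationGracePeriodSeconds": 3600}}}}'
```

As noted in the comments, this only delays the kill: the pod cannot refuse deletion, and an idle pod whose process does not exit on SIGTERM will also hang around for the full grace period.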

0 Answers