By default, Kubernetes upgrades nodes with a rolling upgrade, one node at a time.
This works by cordoning (marking the node as unschedulable, so no new pods are placed on it) and draining each node being upgraded, so that no pods are left running on it.
It does that by creating a replacement for each of the existing pods on another node (if one is available), and once the new pod is running (and answering its readiness/health probes), it stops and removes the old pod (sending SIGTERM
to each of its containers) on the node being upgraded.
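For that replacement to have somewhere to go, the pods usually need to be managed by a controller that keeps multiple replicas running, for example a Deployment. A minimal sketch (the names and image below are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # hypothetical name
spec:
  replicas: 3                 # more than one replica, so evicted pods can be recreated on other nodes
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0   # hypothetical image
          ports:
            - containerPort: 8080
```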
The amount of time Kubernetes waits for a pod to shut down gracefully is controlled by terminationGracePeriodSeconds
in the pod spec (30 seconds by default); if the pod takes longer than that, its containers are killed with SIGKILL.
The point is: for a graceful Kubernetes upgrade you need enough node capacity available, and your pods must have correct liveness and readiness probes (https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/).
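For example, the pod template of the Deployment sketched above could set a longer grace period and HTTP probes like this (the /healthz path, port and timings are assumptions, adjust them for your app):

```yaml
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 60   # wait up to 60s after SIGTERM before falling back to SIGKILL
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0   # hypothetical image
          ports:
            - containerPort: 8080
          readinessProbe:            # the new pod only counts as ready (and receives traffic) once this passes
            httpGet:
              path: /healthz         # assumed health endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:             # the container is restarted if this keeps failing
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
```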
Some interesting material that is worth a read:
https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-upgrading-your-clusters-with-zero-downtime (specific to GKE but has some insights)
https://blog.gruntwork.io/zero-downtime-server-updates-for-your-kubernetes-cluster-902009df5b33