By default, Kubernetes upgrades nodes with a rolling upgrade, one node at a time.
This works by cordoning (marking the node as unschedulable, so no new pods are placed on it) and draining each node being upgraded, so that no pods are left running on it.
It does that by creating a replacement for each of the existing pods on another node (if one is available), and once the new pod is running (and answering its readiness/health probes), it stops and removes the old pod (sending SIGTERM
to each of its containers) on the node being upgraded.
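For that replacement to have somewhere to go, the pods usually need to be managed by a controller that keeps multiple replicas running, for example a Deployment. A minimal sketch (the names and image below are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # hypothetical name
spec:
  replicas: 3                 # more than one replica, so evicted pods can be recreated on other nodes
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0   # hypothetical image
          ports:
            - containerPort: 8080
```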
The amount of time Kubernetes waits for a pod to shut down gracefully is controlled by terminationGracePeriodSeconds
in the pod spec (30 seconds by default); if the pod takes longer than that, its containers are killed with SIGKILL.
The point is: for a graceful Kubernetes upgrade you need enough node capacity available, and your pods must have correct liveness and readiness probes (https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/).
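For example, the pod template of the Deployment sketched above could set a longer grace period and HTTP probes like this (the /healthz path, port and timings are assumptions, adjust them for your app):

```yaml
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 60   # wait up to 60s after SIGTERM before falling back to SIGKILL
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0   # hypothetical image
          ports:
            - containerPort: 8080
          readinessProbe:            # the new pod only counts as ready (and receives traffic) once this passes
            httpGet:
              path: /healthz         # assumed health endpoint
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:             # the container is restarted if this keeps failing
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
```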
Some interesting material that is worth a read:
https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-upgrading-your-clusters-with-zero-downtime (specific to GKE but has some insights)
https://blog.gruntwork.io/zero-downtime-server-updates-for-your-kubernetes-cluster-902009df5b33