We have a Spring Boot application deployed on Kubernetes with Helm that serves approximately 6K TPS.
The app uses the RollingUpdate strategy with 6 replicas, and maxUnavailable and maxSurge are both set to 25%.
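
For reference, here is a minimal sketch of what those percentages resolve to with 6 replicas. It assumes the rounding described in the Kubernetes Deployment documentation (a percentage maxSurge is rounded up, a percentage maxUnavailable is rounded down); the function name `rolling_update_bounds` is hypothetical and only illustrates the arithmetic.

```python
import math


def rolling_update_bounds(replicas, max_surge_pct, max_unavailable_pct):
    """Resolve percentage-based rolling update settings to pod counts.

    Assumption: maxSurge percentages round up and maxUnavailable
    percentages round down, as the Deployment docs describe.
    """
    surge = math.ceil(replicas * max_surge_pct / 100)
    unavailable = math.floor(replicas * max_unavailable_pct / 100)
    return {
        "max_total_pods": replicas + surge,            # upper bound during rollout
        "min_available_pods": replicas - unavailable,  # lower bound during rollout
    }


bounds = rolling_update_bounds(replicas=6, max_surge_pct=25, max_unavailable_pct=25)
print(bounds)  # {'max_total_pods': 8, 'min_available_pods': 5}
```

So, under that assumption, the rollout may run up to 8 pods at once and should keep at least 5 available.
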
When we upgrade the application version with helm upgrade while it is serving traffic, about 0.5% of requests fail during the upgrade window.
My question is: should we expect 0% failures? If so, are there best practices to follow when deploying the app so that requests are handled correctly during an upgrade or rollback?