I would use:
- Traefik for load balancing
- Docker Swarm / Kubernetes to handle deployments and rolling updates of your project
- well-written Node.js code.
I don't know wss in particular, but the concept is to increment a work counter on the server when a client asks for work and decrement it when the server responds.
You can then safely exit the server as soon as the count reaches 0.
When you want to close the server, stop accepting new connections (and new work), then exit once the count reaches zero.
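
The counting idea above can be sketched roughly like this. It is only a minimal illustration, not tied to any particular websocket library: `WorkTracker` and `installGracefulShutdown` are hypothetical names, and it assumes the orchestrator signals shutdown with SIGTERM (which Docker and Kubernetes do by default).

```javascript
// Hypothetical helper: counts in-flight work and resolves once it has drained.
class WorkTracker {
  constructor() {
    this.count = 0;        // units of work currently in progress
    this.closing = false;  // true once shutdown has started
    this.waiters = [];     // resolvers waiting for the count to reach zero
  }

  // Call when a client asks for work. Returns false if the server is
  // shutting down and the work should be refused.
  begin() {
    if (this.closing) return false;
    this.count += 1;
    return true;
  }

  // Call when the server has responded.
  end() {
    this.count -= 1;
    if (this.closing && this.count === 0) this.flush();
  }

  // Stop accepting work and resolve once nothing is in flight.
  drain() {
    this.closing = true;
    return new Promise((resolve) => {
      this.waiters.push(resolve);
      if (this.count === 0) this.flush();
    });
  }

  flush() {
    for (const resolve of this.waiters.splice(0)) resolve();
  }
}

// Hypothetical wiring: `server` is anything with a close() method,
// such as an http.Server.
function installGracefulShutdown(server, tracker) {
  process.once('SIGTERM', async () => {
    server.close();        // stop accepting new connections
    await tracker.drain(); // wait for in-flight work to finish
    process.exit(0);
  });
}
```

In a request or message handler you would call `tracker.begin()` before doing the work, refuse the work if it returns false, and call `tracker.end()` in a `finally` block so the count is decremented even when the handler throws.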
The load balancer needs to stop routing requests to the node you want to stop before you start stopping it. You can use health checks for that.
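
One way to do this is a health endpoint that starts failing as soon as shutdown begins, so the load balancer takes the node out of rotation before the process actually stops. The sketch below is an assumption-heavy illustration: the `/healthz` path, the `state` object and the `graceMs` delay are all made up here, and the load balancer has to be configured separately to probe that path at an interval shorter than the grace period.

```javascript
const http = require('node:http');

// Hypothetical shared flag, flipped when shutdown begins.
const state = { shuttingDown: false };

// Status code the health endpoint reports: 200 while serving, 503 while draining.
function healthStatus(s) {
  return s.shuttingDown ? 503 : 200;
}

// Builds a server with a health route (the path is an assumption)
// plus a placeholder for the real application routes.
function createServer(s) {
  return http.createServer((req, res) => {
    if (req.url === '/healthz') {
      res.writeHead(healthStatus(s));
      res.end();
      return;
    }
    res.writeHead(200);
    res.end('ok');
  });
}

// Fail the health check first, give the load balancer time to notice,
// and only then stop accepting connections and exit.
function beginShutdown(s, server, graceMs) {
  s.shuttingDown = true;
  setTimeout(() => {
    server.close(() => process.exit(0));
  }, graceMs);
}
```

A typical use would be `process.once('SIGTERM', () => beginShutdown(state, server, 10000))`, with the delay chosen to cover at least one or two health check intervals.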
If zero downtime is configured perfectly AND the network is perfect, you don't need the following, but I would still add a "retry on failure" on the client side.
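
A client-side retry could look something like the following. This is just a sketch: `withRetry` is a hypothetical helper, and the retry count and exponential backoff delays are arbitrary defaults rather than recommendations.

```javascript
// Hypothetical helper: runs an async operation and retries it on failure,
// doubling the wait between attempts.
async function withRetry(operation, { retries = 3, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Wait baseDelayMs, then 2x, then 4x, ... before the next attempt.
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

The client would wrap its request in it, for example `withRetry(() => sendRequest(payload))`, where `sendRequest` stands for whatever call the client already makes. Retrying is only safe when repeating the request has no unwanted side effects on the server.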