This idea uses the ELB's ability to detect an unhealthy node and remove it from the pool, but it relies on the ELB behaving as described in the assumptions below. I've been meaning to test this myself but haven't had the time yet; I'll update the answer when I do.
Process Overview
The following logic could be wrapped in a script and run when the node needs to be shut down.
- Block new HTTP connections to nodeX but continue to allow existing connections
- Wait for existing connections to drain, either by monitoring existing connections to your application or by allowing a "safe" amount of time.
- Initiate a shutdown of the nodeX EC2 instance, using the EC2 API directly or scripts that wrap it.
Here "safe" is defined by your application; for some applications it may not be possible to determine.
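The "wait for existing connections to drain" step can be sketched roughly as below. This is only an illustration, not a tested procedure: the function names, the 5-second poll interval and the default 300-second timeout are assumptions, and it counts established TCP connections with `ss` rather than asking the application.

```shell
#!/bin/sh
# Sketch only: function names, poll interval and default timeout are assumptions.

# Print the number of established TCP connections to local port $1.
count_established() {
    # ss prints one header line, then one line per matching connection.
    ss -tn state established "( sport = :$1 )" | awk 'NR > 1 { c++ } END { print c + 0 }'
}

# Poll until no established connections remain on port $1, or until
# roughly $2 seconds have passed. Returns 0 if drained, 1 on timeout.
wait_for_drain() {
    port=$1
    timeout=${2:-300}
    waited=0
    while [ "$(count_established "$port")" -gt 0 ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 5
        waited=$((waited + 5))
    done
    return 0
}
```

A wrapper might block new connections first, call `wait_for_drain 80 300`, and only then stop the instance, for example with `aws ec2 stop-instances --instance-ids <instance id>`. The timeout acts as the "safe" upper bound for cases where the connection count never reaches zero.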
Assumptions that need to be tested
We know that ELB removes unhealthy instances from its pool. I would expect this to be graceful, so that:
- A new connection to a recently closed port will be gracefully redirected to the next node in the pool
- When a node is marked Bad, the already established connections to that node are unaffected.
Possible test cases:
- Fire HTTP connections at the ELB (e.g. from a curl script), logging the results while a script opens and closes one of the node's HTTP ports. You would need to experiment to find an acceptable amount of time that allows the ELB to always detect a state change.
- Maintain a long HTTP session (e.g. a file download) while blocking new HTTP connections; the long session should hopefully continue.
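The first test case could be driven by something like the following probe loop. It is a minimal sketch: the function name `probe`, the one-second interval and the 5-second per-request limit are assumptions, and the URL would be whatever your ELB answers on.

```shell
#!/bin/sh
# Sketch only: the function name, interval and time limit are assumptions.

# probe URL [COUNT]: request URL COUNT times, one second apart, and print
# "<unix timestamp> <http status>" for each attempt.
probe() {
    url=$1
    count=${2:-60}
    i=0
    while [ "$i" -lt "$count" ]; do
        # -w '%{http_code}' prints only the status; curl reports 000
        # when no HTTP response was received.
        code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")
        printf '%s %s\n' "$(date +%s)" "$code"
        i=$((i + 1))
        sleep 1
    done
}
```

Running `probe http://<elb hostname>/ 300 > probe.log` while the port on one node is opened and closed would show whether any requests fail during the state change, and for roughly how long.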
1. How to block HTTP Connections
Use a local firewall on nodeX to block new sessions but continue to allow established sessions.
For example, with iptables:
iptables -A INPUT -j DROP -p tcp --syn --destination-port <web service port>
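If this is scripted, it may help to pair the rule with a way to remove it again. The sketch below is one possible shape; the function names are hypothetical. Note that it uses `-I` (insert at the top of the chain) instead of `-A` (append), on the assumption that an existing ACCEPT rule earlier in INPUT could otherwise match first.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical; $1 is the web service port.

# Drop new inbound TCP connections (SYN packets) to port $1.
# Packets on already established connections are not SYNs, so they still pass.
block_new_connections() {
    iptables -I INPUT -p tcp --syn --destination-port "$1" -j DROP
}

# Remove the rule again by deleting the same rule specification.
unblock_new_connections() {
    iptables -D INPUT -p tcp --syn --destination-port "$1" -j DROP
}
```

Both functions need root privileges. Calling `unblock_new_connections` with the same port puts the node back into service if the shutdown is cancelled.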