
I'm using embedded Tomcat 9.0.68 in a Spring Boot app. In front of it I have a load balancer that is not HTTP-aware.

Unfortunately, when scaling down the app I see some errors even though the app shuts down cleanly (I'm using server.shutdown=graceful in Spring Boot, so it shuts down Tomcat first before shutting down beans etc.). My testing shows unequivocally that if an HTTP client has established a keep-alive connection to Tomcat, then when Tomcat is shut down (through a signal from Spring) it immediately closes all keep-alive connections. This creates an inherent race condition, because the client at the other end might already have sent a request down the line. I would have hoped graceful shutdown worked as follows, from the operating environment's perspective:

  1. User informs the operating environment of the scale-down.
  2. The operating environment stops sending new connections to the scaled-down instances. It can't do anything about already established connections.
  3. The operating environment sends SIGTERM to the scaled down instances and awaits termination.

Tomcat (app) upon receiving SIGTERM:

  1. Immediately stops accepting new connections
  2. Whenever a response is sent back (on connections established before SIGTERM), the response carries "Connection: close", allowing Tomcat to close that connection immediately.
  3. If and when Tomcat has got rid of all connections this way, it can shut down. This still leaves keep-alive connections open where there is no activity, because Tomcat has no response on which to send "Connection: close". However, once the configured keepAliveTimeout period (which the client is informed of in every HTTP response via the Keep-Alive header) has passed, Tomcat can legitimately close any keep-alive connections that are still open, since the client is no longer permitted to use them anyway. After that, it can inform Spring Boot that it has shut down; Spring Boot then continues its own shutdown, and ultimately the operating environment (Kubernetes) sees the instance terminate.

In this way the shutdown would be completely clean in all circumstances, and the shutdown time would be bounded by keepAliveTimeout. Is there any way I can enforce this behaviour, or simulate it somehow?
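
To make the behaviour I'm after concrete, here is a minimal sketch of the drain logic in plain Java. DrainState and its methods are hypothetical names of my own, not part of Tomcat or Spring Boot, and the sketch only models the decisions (which Connection header to send, and when idle connections may be closed), not the actual socket handling.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper modelling the desired drain behaviour.
// Not a Tomcat or Spring Boot API.
final class DrainState {
    private final Duration keepAliveTimeout;
    // Null until draining begins; set exactly once.
    private final AtomicReference<Instant> drainStartedAt = new AtomicReference<>();

    DrainState(Duration keepAliveTimeout) {
        this.keepAliveTimeout = Objects.requireNonNull(keepAliveTimeout);
    }

    // Called once SIGTERM has been received; later calls are ignored.
    void startDraining(Instant now) {
        drainStartedAt.compareAndSet(null, now);
    }

    boolean isDraining() {
        return drainStartedAt.get() != null;
    }

    // Value for the Connection response header on the next response.
    String connectionHeader() {
        return isDraining() ? "close" : "keep-alive";
    }

    // An idle connection received its last response before draining began,
    // so its keep-alive window has ended at the latest keepAliveTimeout
    // after the drain started. From then on it is safe to close it.
    boolean idleConnectionsExpired(Instant now) {
        Instant started = drainStartedAt.get();
        return started != null && !now.isBefore(started.plus(keepAliveTimeout));
    }
}
```

With a keepAliveTimeout of 20 seconds, for example, responses sent after startDraining would carry "Connection: close", and idleConnectionsExpired would become true 20 seconds after the drain began. What I don't know is whether Tomcat can be made to defer closing idle connections until that point.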

mv123
