We are running a JDK 17 Spring Boot application on our production server with the following configuration (a rough sketch of how these settings are typically applied follows the list):
- JDK vendor: Amazon Corretto (17.0.6)
- K8s version: 1.17
- Max pod memory: 5GB
- Min pod memory: 5GB
- Xmx: 2GB
- Xms: 2GB
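For context, this is a minimal sketch of how the settings above are typically wired up; the deployment name and the use of `JAVA_TOOL_OPTIONS` for the heap flags are placeholders for illustration, not our exact manifest:

```
# Pod memory request/limit (deployment name is a placeholder):
kubectl set resources deployment/<app> --requests=memory=5Gi --limits=memory=5Gi

# Heap settings passed to the JVM, e.g. via JAVA_TOOL_OPTIONS in the pod spec:
export JAVA_TOOL_OPTIONS="-Xms2g -Xmx2g"
```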
The problem we are running into is that roughly every 24 hours the application gets OOM-killed by K8s (exit code 137). A few observations so far:
- There is no leak in heap memory; this is confirmed from multiple heap dumps as well as GC logs.
- No leak is observed in native memory either, based on native memory dumps (the commands used are sketched after this list). The maximum reserved memory seen in the native memory dump is 3.5GB. Dumps were taken at periodic intervals, and no growth is observed in any non-heap area.
- We have verified that the RSS of the application grows gradually to 5GB, at which point the OOM kill happens.
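A sketch of how the native memory dumps and RSS samples were collected, assuming native memory tracking (NMT) was enabled at startup; `<pid>` and `<pod-name>` are placeholders:

```
# Assuming the JVM was started with -XX:NativeMemoryTracking=summary,
# a periodic native memory dump can be taken with jcmd:
jcmd <pid> VM.native_memory summary

# RSS of the JVM process inside the container (in kilobytes):
ps -o rss= -p <pid>

# Pod-level memory usage as seen by Kubernetes:
kubectl top pod <pod-name>
```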
We have tried tweaking Xmx/Xms and a few other GC parameters (e.g. disabling adaptive IHOP), but nothing has helped so far. Possibly there is a leak, and it is visible in the growing RSS, but it does not show up in the native memory dump.
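For illustration, the kind of flags we experimented with looked roughly like this; the specific IHOP value and the jar name are examples, not our exact settings:

```
# Fixed heap plus G1 tuning; -XX:-G1UseAdaptiveIHOP disables adaptive IHOP and
# -XX:InitiatingHeapOccupancyPercent then pins the IHOP threshold (example value):
java -Xms2g -Xmx2g \
     -XX:+UseG1GC \
     -XX:-G1UseAdaptiveIHOP \
     -XX:InitiatingHeapOccupancyPercent=35 \
     -jar application.jar
```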