I am working with a team developing a Java GUI application running on a 1GB Linux target system.
We have a problem where the memory used by our Java process grows indefinitely until Linux finally kills the process.
Our heap memory is healthy and stable (we have profiled the heap extensively). We also used MemoryMXBean to monitor the application's non-heap memory usage, since we suspected the problem might lie there. However, the reported heap size plus the reported non-heap size stays stable.
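For reference, a minimal sketch of how these two figures can be read through MemoryMXBean. The class name `MemoryReport` and the method `jvmCommittedBytes` are hypothetical names chosen for illustration; this is not our exact monitoring code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, for illustration only.
public class MemoryReport {
    private static final long MB = 1024L * 1024L;

    /** Committed heap plus committed non-heap memory, in bytes, as reported by the JVM. */
    static long jvmCommittedBytes() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
        return heap.getCommitted() + nonHeap.getCommitted();
    }

    public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        // getCommitted() is the memory the JVM has reserved from the OS for that area.
        System.out.printf("heap committed:     %d MB%n",
                bean.getHeapMemoryUsage().getCommitted() / MB);
        System.out.printf("non-heap committed: %d MB%n",
                bean.getNonHeapMemoryUsage().getCommitted() / MB);
        System.out.printf("sum:                %d MB%n", jvmCommittedBytes() / MB);
    }
}
```

Note that MemoryMXBean only covers the memory pools the JVM manages itself, so this sum says nothing about memory allocated outside those pools.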
Here is an example of how the numbers might look when running the application on our target system with 1 GB of RAM. The heap and non-heap figures are reported by MemoryMXBean; the total memory used by the Java process is the resident memory shown by Linux's top command:
At startup:
- 200 MB heap committed
- 40 MB non-heap committed
- 320 MB used by the Java process
After 1 day:
- 200 MB heap committed
- 40 MB non-heap committed
- 360 MB used by the Java process
After 2 days:
- 200 MB heap committed
- 40 MB non-heap committed
- 400 MB used by the Java process
The numbers above are a "cleaned-up" representation of how our system performs, but they are close to reality. As you can see, the trend is clear: the memory used by the process grows by roughly 40 MB per day while the reported heap and non-heap sizes stay constant. After a couple of weeks of running the application, the Linux system starts running out of memory and things slow down. A few hours later, the Java process is killed.
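The gap in question can be made explicit by subtracting the JVM-reported committed memory from the resident set size of the process. The sketch below is one way to log that gap from inside the application, assuming a Linux system where `/proc/self/status` contains a `VmRSS:` line; the class name `RssGap` and its methods are hypothetical. Committed memory is not necessarily fully resident, so the result is an approximation.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper class, for illustration only (Linux-specific).
public class RssGap {
    private static final long MB = 1024L * 1024L;

    /** Extracts the VmRSS value in kB from the lines of /proc/<pid>/status, or -1 if absent. */
    static long parseVmRssKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmRSS:")) {
                // Typical line: "VmRSS:    327680 kB"
                String[] parts = line.trim().split("\\s+");
                return Long.parseLong(parts[1]);
            }
        }
        return -1;
    }

    /** Resident memory not explained by the heap and non-heap figures (all values in MB). */
    static long unaccountedMb(long rssMb, long heapMb, long nonHeapMb) {
        return rssMb - (heapMb + nonHeapMb);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: running on Linux, where /proc/self/status describes this process.
        long rssKb = parseVmRssKb(Files.readAllLines(Paths.get("/proc/self/status")));
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        long heapMb = bean.getHeapMemoryUsage().getCommitted() / MB;
        long nonHeapMb = bean.getNonHeapMemoryUsage().getCommitted() / MB;
        System.out.printf("rss=%d MB, heap=%d MB, non-heap=%d MB, unaccounted=%d MB%n",
                rssKb / 1024, heapMb, nonHeapMb,
                unaccountedMb(rssKb / 1024, heapMb, nonHeapMb));
    }
}
```

Applied to the figures above, the unaccounted memory is 80 MB at startup, 120 MB after one day, and 160 MB after two days. That growing remainder is the part we cannot explain.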
After months of profiling and trying to make sense of this, we are still at a loss. I find it hard to locate information about this problem, as most discussions end up explaining the heap or the other non-heap memory pools (such as Metaspace).
My questions are as follows:
- If you break it down, what does the memory used by a Java process include, in addition to the heap and non-heap memory pools?
- Which other potential sources are there for memory leaks (native code? JVM overhead?), and which ones are, in general, the most likely culprits?
- How can one monitor or profile this memory? Everything outside the heap and non-heap pools is currently something of a black box for us.
Any help would be greatly appreciated.