
What we have:

  • Java app running in a Docker container
  • No native code called, no processes launched, no DLLs / .so files referenced
  • JVM args: -Xmx256m -XX:NativeMemoryTracking=summary (a sketch of the equivalent docker run invocation follows this list)
  • Docker hard memory limit set to 768m
  • JVM seems normal (normal GC cycles, no memory leaks, no OOMs)
  • Docker memory keeps growing until it reaches the hard limit (768m), at which point the container is killed and restarted
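
For completeness, the deployment boils down to something like the following docker run (a hypothetical sketch: the image name and jar path are placeholders; only the memory limit and JVM flags match what we actually use):

# --memory=768m                    : hard limit enforced by the container's cgroup
# -Xmx256m                         : caps only the Java heap, not the JVM's total footprint
# -XX:NativeMemoryTracking=summary : lets jcmd report the JVM's native allocations
docker run -d --memory=768m my-java-app:latest \
  java -Xmx256m -XX:NativeMemoryTracking=summary -jar /app/app.jar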

The Issue:

  • Why does the docker stats memory keep growing (hitting the hard memory limit every day) even though the JVM appears to be behaving within its limits?


  • With other microservices we don't see this behavior


JVM

On the JVM side we don't notice anything special.

Docker stats output:

492.8MiB / 768MiB     64.17%              

[ec2-user@ip-10-180-28-222 ~]$ docker exec 34d7 jcmd 1 VM.native_memory summary
1:

Native Memory Tracking:

Total: reserved=1731355KB, committed=472227KB
-                 Java Heap (reserved=262144KB, committed=262144KB)
                            (mmap: reserved=262144KB, committed=262144KB)

-                     Class (reserved=1131805KB, committed=92829KB)
                            (classes #16224)
                            (malloc=7453KB #20996)
                            (mmap: reserved=1124352KB, committed=85376KB)

-                    Thread (reserved=29932KB, committed=29932KB)
                            (thread #30)
                            (stack: reserved=29772KB, committed=29772KB)
                            (malloc=94KB #151)
                            (arena=66KB #55)

-                      Code (reserved=255659KB, committed=35507KB)
                            (malloc=6059KB #9814)
                            (mmap: reserved=249600KB, committed=29448KB)

-                        GC (reserved=15369KB, committed=15369KB)
                            (malloc=5785KB #547)
                            (mmap: reserved=9584KB, committed=9584KB)

-                  Compiler (reserved=190KB, committed=190KB)
                            (malloc=59KB #858)
                            (arena=131KB #6)

-                  Internal (reserved=7849KB, committed=7849KB)
                            (malloc=7817KB #18468)
                            (mmap: reserved=32KB, committed=32KB)

-                    Symbol (reserved=20018KB, committed=20018KB)
                            (malloc=17325KB #175818)
                            (arena=2693KB #1)

-    Native Memory Tracking (reserved=3558KB, committed=3558KB)
                            (malloc=10KB #120)
                            (tracking overhead=3548KB)

-               Arena Chunk (reserved=4830KB, committed=4830KB)
                            (malloc=4830KB)

After running for about 20 hours:

649.6MiB / 768MiB     84.59%               

[ec2-user@ip-10-180-28-222 ~]$ docker exec 34d7 jcmd 1 VM.native_memory summary
1:

Native Memory Tracking:

Total: reserved=1741020KB, committed=510928KB
-                 Java Heap (reserved=262144KB, committed=262144KB)
                            (mmap: reserved=262144KB, committed=262144KB)

-                     Class (reserved=1138319KB, committed=100495KB)
                            (classes #16390)
                            (malloc=7823KB #30851)
                            (mmap: reserved=1130496KB, committed=92672KB)

-                    Thread (reserved=30996KB, committed=30996KB)
                            (thread #31)
                            (stack: reserved=30800KB, committed=30800KB)
                            (malloc=97KB #156)
                            (arena=99KB #57)

-                      Code (reserved=261330KB, committed=69062KB)
                            (malloc=11730KB #16047)
                            (mmap: reserved=249600KB, committed=57332KB)

-                        GC (reserved=15363KB, committed=15363KB)
                            (malloc=5779KB #334)
                            (mmap: reserved=9584KB, committed=9584KB)

-                  Compiler (reserved=223KB, committed=223KB)
                            (malloc=92KB #1246)
                            (arena=131KB #6)

-                  Internal (reserved=8358KB, committed=8358KB)
                            (malloc=8326KB #18561)
                            (mmap: reserved=32KB, committed=32KB)

-                    Symbol (reserved=20253KB, committed=20253KB)
                            (malloc=17527KB #177997)
                            (arena=2725KB #1)

-    Native Memory Tracking (reserved=3846KB, committed=3846KB)
                            (malloc=10KB #127)
                            (tracking overhead=3836KB)

-               Arena Chunk (reserved=188KB, committed=188KB)
                            (malloc=188KB)
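
We have not tried it here yet, but NMT can also diff against a recorded baseline, which would show the per-category growth directly instead of comparing two raw summary dumps by hand:

# Record the current native memory usage as a baseline ...
docker exec 34d7 jcmd 1 VM.native_memory baseline
# ... and some hours later report the growth per category since that baseline.
docker exec 34d7 jcmd 1 VM.native_memory summary.diff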

Observations

What we know so far after observing for 20 hours:

  • Docker stats jumped from 492.8MiB to 649.6MiB
  • JVM committed native memory jumped from 472227KB (~461MiB) to 510928KB (~499MiB)
  • The gap between docker stats and the JVM's committed native memory keeps growing (649.6MiB minus ~499MiB is roughly 150MiB: where is that memory, and why does it keep growing?)
  • JVM stats remain normal throughout this period

So I don't know what else I can debug on the JVM side. I understand Java needs more than just the heap (hence the native memory tracking), but there is still a gap of around 150MiB between what the JVM native memory tracker reports and what docker stats sees. How can I gain more insight into where that memory is going?
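
One avenue I am considering (a rough sketch, assuming pmap/procps are available in the image and the host uses cgroup v1) is comparing NMT against what the kernel and the cgroup report for the process:

# Total resident set size of the JVM process as Linux sees it (includes memory NMT does not track).
docker exec 34d7 pmap -x 1 | tail -n 1
docker exec 34d7 grep VmRSS /proc/1/status

# The cgroup's own accounting (rss, cache, mapped_file, ...); note that page cache is charged to the container too.
docker exec 34d7 cat /sys/fs/cgroup/memory/memory.stat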

ddewaele
  • What does your Dockerfile look like? – KeyMaker00 Oct 02 '18 at 12:07
  • https://github.com/IxorTalk/ixortalk-authserver/blob/master/src/main/docker/Dockerfile The `JAVA_OPTS` we pass are set by CloudFormation templates, but the only thing we pass is `-Xmx256m` (and recently `-XX:NativeMemoryTracking=summary` for debugging). – ddewaele Oct 02 '18 at 12:12
  • You might want to cross-post this question to DevOpsSE. – Open Food Broker Dec 14 '18 at 14:04
  • I think this answer https://stackoverflow.com/a/38670876/482811 should help you understand the discrepancies and maybe solve your problem. Note that the first metrics you are checking cover the internal Java heap, which is only a (usually big) fraction of the total memory a Java application uses. So even if the Java heap size always stays below your limit, that tells you little about the other Java memory usage: GC, stacks, memory mapped outside the heap, etc. – Huygens Jan 25 '19 at 15:48
  • Next to native memory, you might also want to run jcmd with the following flags: VM.stringtable / VM.symboltable. – Pieterjan Deconinck Nov 25 '19 at 20:11

2 Answers


JVM reported memory is not everything.

JVM’s memory footprint

So, what contributes to the JVM memory footprint? Most of us who have run a Java application know how to set the maximum heap space. But there is actually a lot more contributing to the memory footprint:

  • Native JRE
  • Perm / metaspace
  • JIT bytecode
  • JNI
  • NIO
  • Threads

This is a lot to keep in mind when we want to set memory limits for Docker containers. And setting the container memory limit to just the maximum heap space might not be sufficient…
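
To make that concrete, here is an illustrative (not prescriptive) set of flags that puts an explicit cap on each of the non-heap areas listed above; the values are made up for the example:

# Illustrative caps on the non-heap areas (example values, not recommendations):
java -Xmx256m \
     -XX:MaxMetaspaceSize=128m \
     -XX:ReservedCodeCacheSize=64m \
     -XX:MaxDirectMemorySize=64m \
     -Xss512k \
     -jar app.jar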

JVM and CPUs

Let's take a short look at how the JVM adjusts to the number of processors/cores available on the node it is running on. There are actually a number of parameters which, by default, are initialised based on the core count.

  • # of JIT compiler threads
  • # of garbage collection threads
  • # of threads in the common fork-join pool …

So if the JVM is running on a 32-core node (and the defaults were not overridden), it will spawn 32 garbage collection threads, 32 JIT compiler threads, and so on (source).

To prevent this problem, you should use -XX:+UseContainerSupport (introduced in Java 10, backported to 8u191, and enabled by default) and probably -XX:MaxRAMPercentage=90.0 or less, depending on your observed total memory usage. For more see this and that.
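
A minimal sketch of how those flags could be combined for a container like the one in the question (the percentage and processor count are illustrative and should be tuned against observed usage):

# Let the JVM size itself from the cgroup limit instead of a fixed -Xmx,
# and pin the CPU-derived thread pools explicitly.
java -XX:+UseContainerSupport \
     -XX:MaxRAMPercentage=75.0 \
     -XX:ActiveProcessorCount=2 \
     -jar app.jar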

I strongly recommend the JavaZone talk "Nobody puts Java in the container" by Ken Sipe.

janisz

A. Please read janisz's answer carefully and follow the links; it's a lot of very important information for people who are running Java in containers or otherwise under cgroups.

B. The main issue is that the JVM doesn't see the memory limit on the container: it believes it has the entire free memory of the host OS to play with. When it tries to consume more memory than the cgroup limit allows, the kernel/docker kills the container for violating the cgroup memory limit promise. This is what -XX:+UseContainerSupport and the older -XX:+UseCGroupMemoryLimitForHeap flags are supposed to fix: let the JVM know what the true limits are.
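
A quick way to verify what the JVM actually believes its limit is (a sketch; openjdk:8 is just an example image and the exact output depends on the JDK build inside it):

# What maximum heap does the JVM pick by default under a 768m container limit?
docker run --rm --memory=768m openjdk:8 \
  java -XX:+PrintFlagsFinal -version | grep -i maxheapsize

# On pre-8u191 builds, the older experimental flag is needed for the heap to follow the cgroup limit:
docker run --rm --memory=768m openjdk:8 \
  java -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap \
       -XX:+PrintFlagsFinal -version | grep -i maxheapsize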

Additional Info

The -Xmx flag does not limit all the memory the JVM, as a Linux process, requires from the operating system. All the internal machinery of the JVM itself, including the Java stacks, metaspace, loaded code, etc. (as discussed in janisz's answer), takes memory as well.

Unfortunately, the JVM likes to take as much memory from the operating system as it needs, and prefers taking more memory (if it thinks it is available) over reusing existing (and possibly releasable) memory. Improvements to this behavior (i.e. not assuming the JVM is the only actor on the system) are planned for the G1 garbage collector in Java 12, which should promptly return unused committed memory to the operating system, but unless that is working for you, the JVM will keep growing its memory usage and will prefer to consume all available free memory, as if the sole purpose of the operating system were to provide services for this one JVM.

Guss