
We have an application deployed on a Tomcat 8 application server, and our monitoring server (Zabbix) is configured to raise an alert when heap memory is 90% utilized.

Several such alerts prompted us to analyze a heap dump. Nothing really came out of it: there was no memory leak. There were a lot of unreachable objects that had not been cleaned up because no GC had run.

JVM configurations:

-Xms8192m -Xmx8192m -XX:PermSize=128M -XX:MaxPermSize=256m 
-XX:+UseParallelGC -XX:NewRatio=3 -XX:+PrintGCDetails 
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/app/apache-tomcat-8.0.33 
-XX:ParallelGCThreads=2 
-Xloggc:/app/apache-tomcat-8.0.33/logs/gc.log 
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
-XX:+PrintGCTimeStamps -XX:GCLogFileSize=50m -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=30

We tried running garbage collection manually using the jcmd command, and it cleared up the memory. GC logs after running jcmd:

2016-11-04T03:06:31.751-0400: 1974627.198: [Full GC (System.gc()) [PSYoungGen: 18528K->0K(2049024K)] [ParOldGen: 5750601K->25745K(6291456K)] 5769129K->25745K(8340480K), [Metaspace: 21786K->21592K(1069056K)], 0.1337369 secs] [Times: user=0.19 sys=0.00, real=0.14 secs]
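
For reference, the heap totals in that line can be extracted mechanically. This is a minimal sketch that assumes the HotSpot Java 8 log format shown above; the class name `GcLogLine` and the method `parseHeapTotals` are illustrative only and not part of any tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogLine {
    // The Full GC line from the log above, used as sample input.
    static final String SAMPLE =
        "2016-11-04T03:06:31.751-0400: 1974627.198: [Full GC (System.gc()) "
        + "[PSYoungGen: 18528K->0K(2049024K)] [ParOldGen: 5750601K->25745K(6291456K)] "
        + "5769129K->25745K(8340480K), [Metaspace: 21786K->21592K(1069056K)], "
        + "0.1337369 secs] [Times: user=0.19 sys=0.00, real=0.14 secs]";

    // The whole-heap figure is the unlabeled "before->after(capacity)" triple
    // that follows the closing bracket of the last generation entry.
    private static final Pattern HEAP_TOTAL =
        Pattern.compile("\\] (\\d+)K->(\\d+)K\\((\\d+)K\\),");

    /** Returns {usedBeforeKb, usedAfterKb, capacityKb}, or null if the line does not match. */
    static long[] parseHeapTotals(String line) {
        Matcher m = HEAP_TOTAL.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        long[] t = parseHeapTotals(SAMPLE);
        System.out.printf("heap before GC: %.1f%% of capacity%n", 100.0 * t[0] / t[2]);
        System.out.printf("heap after GC:  %.1f%% of capacity%n", 100.0 * t[1] / t[2]);
    }
}
```

For this line the heap goes from roughly 69% of capacity before the collection to well under 1% after it, so almost everything that was occupying the heap was garbage.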

Questions:

  1. Is there any configuration above that prevents GC from running automatically?
  2. What is the reason for this behavior? I understand that Java will do GC when it needs to. But if it does not run GC even when the heap is 90% utilized, what should the alert threshold be (and does it even make sense to have an alert based on heap utilization)?
Yuri
Ankit
  • Just wondering: wouldn't it make sense then to have some "cron job" thread that just sits there and runs System.gc() once a day or so? Or maybe it checks utilization and runs System.gc() when you are > 80%? – GhostCat Nov 04 '16 at 10:20
  • @GhostCat: This can be done, but it is not recommended. GC can never be forced, only requested. Also, the question is not how to run GC when the JVM is not running it automatically. I wish to know why it is not running and what threshold it waits for. – Ankit Nov 04 '16 at 10:23
  • There is a reason why I put that up as comment. It is more a suggestion for an experiment. Yes, sometimes one should work hard to understand everything. But sometimes that is costly, and a cheap workaround would do instead. That is all I wanted to say ... – GhostCat Nov 04 '16 at 10:25
  • In general GC is triggered if the VM suffers an allocation failure in any of the heap spaces or an occupancy threshold is reached (in an old generation space). For CMS that threshold is [initially > 90%](https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/cms.html) of tenured heap occupancy. But it really depends on the garbage collector you are using, the VM version and the VM vendor. – Ralf Nov 04 '16 at 10:32
  • @GhostCat: I would have thought about a solution if I were trying to solve a problem. 90% heap utilization is not hurting anything right now. – Ankit Nov 04 '16 at 10:34
  • @Ralf: The garbage collector being used is Parallel GC. Is it possible to find out the threshold if the VM version and vendor are known? – Ankit Nov 04 '16 at 10:36
  • The parallel collector can fill up the old gen to nearly 100% before starting a full GC, that's not unusual (the concurrent collectors start earlier to avoid STW GCs). What you should be concerned about is the heap occupancy immediately *after* a major GC. Which is looking fine (no leak) according to your GC log. – the8472 Nov 04 '16 at 10:39
  • The Parallel Collector does not use an occupancy threshold as a trigger. Instead it tries to comply with max pause time, throughput, and footprint goals. Check out the [documentation](https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/parallel.html). – Ralf Nov 04 '16 at 10:43
  • [I believe here is answer to your question](http://stackoverflow.com/questions/1582209/java-garbage-collector-when-does-it-collect) – Bero Lomsadze Nov 04 '16 at 10:45

1 Answer

When a collection is triggered differs per garbage collector. I have not been able to find any hard promises on when your garbage collector (Parallel GC) runs. Many garbage collectors also tune themselves on several different variables, which can influence when they run.

As you have noted yourself, your application can have high heap usage and still run fine. What you are looking for in an application is that the garbage collector is still efficient, meaning it can clean up quite a lot of garbage in a single run.

Some aspects of garbage collection

Most garbage collectors have two or more strategies, one for 'young' objects and one for 'old' objects. When a young object has survived the latest (several) collections, it becomes an old object. The idea behind this is that if an object has not been collected, it probably won't be collected next time either. (Most objects live either really short or really long.) The garbage collector does a very efficient, but not perfect, cleaning of the young objects. When that doesn't free up enough memory, a more costly garbage collection is done on all (young and old) objects.

This will often generate a sawtooth pattern (taken from this site):

[Figure: heap size over time]

Here you see many small drops in heap size and a slowly growing heap. Every now and then a large collection is done, and there is a large drop. The actual 'used' memory is the amount of memory left after a large collection.

Aspects to measure

This leads to the following aspects you can look at when determining the health of your application:

  1. The amount of time your application spends garbage collecting (both in total and as a percentage of CPU time).
  2. The amount of memory available right after a garbage collection.
  3. A rapid increase in the number of large garbage collections.
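
These three figures can be read from inside the JVM through the standard `java.lang.management` beans. The following is only a rough sketch: the class name `GcHealth` is made up for illustration, the time fraction is measured against JVM uptime rather than CPU time, and the counters are totals since JVM start, so a monitoring agent would have to sample them periodically and compare.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class GcHealth {
    /** Accumulated GC time divided by JVM uptime (wall clock, not CPU time). */
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    /** Bytes in use in the heap pools as measured right after each pool's last collection. */
    static long heapUsedAfterLastGc() {
        long used = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null when unsupported
            if (afterGc != null) {
                used += afterGc.getUsed();
            }
        }
        return used;
    }

    /** Total number of collections so far, summed over all collectors. */
    static long totalGcCount() {
        long count = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 when undefined
            if (c > 0) {
                count += c;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.printf("GC time fraction: %.4f%n", gcTimeFraction());
        System.out.println("Heap used after last GC (bytes): " + heapUsedAfterLastGc());
        System.out.println("Collections so far: " + totalGcCount());
    }
}
```

The same beans are exposed over JMX, so an external monitor can read them remotely instead of running code inside the application.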

In most cases you will need to monitor the behavior of your application under load to see what good values are for you.

The parallel garbage collector uses a similar condition to determine if all is still well:

If more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, then an OutOfMemoryError is thrown.

All of these statistics can be seen nicely using VisualVM and JConsole. I am not sure which of them you can use as triggers in your monitoring tools.
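
One possible trigger is the JVM's own collection usage threshold, which fires only when a pool is still above a limit right after a collection, so it is not affected by garbage that simply has not been collected yet. Below is a minimal sketch under some assumptions: the class name `PostGcAlert` and the 80% figure are arbitrary, and matching pool names on "Old" or "Tenured" relies on HotSpot naming (for example "PS Old Gen" with Parallel GC), which is not guaranteed on other VMs. How to forward the notification to a tool such as Zabbix is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

public class PostGcAlert {
    /**
     * Sets a collection usage threshold at the given fraction of the maximum size
     * on every old-generation heap pool that supports it. Returns how many pools were armed.
     */
    static int armOldGenThresholds(double fraction) {
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: HotSpot-style pool names; adjust for other VMs.
            boolean oldGen = name.contains("Old") || name.contains("Tenured");
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (pool.getType() == MemoryType.HEAP
                    && oldGen
                    && usage != null
                    && usage.getMax() > 0
                    && pool.isCollectionUsageThresholdSupported()) {
                pool.setCollectionUsageThreshold((long) (usage.getMax() * fraction));
                armed++;
            }
        }
        return armed;
    }

    /** Registers a listener on the platform MemoryMXBean, which emits threshold notifications. */
    static void listen(NotificationListener listener) {
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(listener, null, null);
    }

    public static void main(String[] args) {
        listen((notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED
                    .equals(notification.getType())) {
                System.err.println("Old gen still above threshold after GC: "
                        + notification.getMessage());
            }
        });
        System.out.println("Pools armed: " + armOldGenThresholds(0.80));
    }
}
```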

Thirler