We are facing a strange issue in our Grails application: total memory consumption shoots up within a very short period. The application runs well for a long time, but at a certain point memory consumption rises until the heap limit (Xmx) is reached. It may go from 4 GB to 20 GB within 5 minutes. Once all the memory has been consumed, Tomcat becomes unresponsive and does not heal itself even if it is left alone. I would expect an OutOfMemoryError at some point, but that never happens, even if we leave the application untouched. I can see that the garbage collector keeps running (we are using New Relic), but the consumed memory still does not go down.
When we noticed that garbage collection was happening in one big blow, we changed the garbage collector from ConcurrentMarkSweep to G1GC, but that has not helped either. To analyse the non-responsive JVM, we took a thread dump and found that there were no deadlocks and that a lot of our threads were in a “BLOCKED” state:
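To make the symptom concrete, heap use can be sampled next to cumulative GC time from inside the JVM. The following is only a minimal sketch using the standard java.lang.management API; the class name HeapGcSample is hypothetical, and this is not how New Relic collects its numbers. If used heap stays near the maximum while GC time keeps climbing, that matches what we observe.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, written for illustration only.
public class HeapGcSample {

    /** Heap bytes currently in use, as reported by the JVM. */
    public static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Approximate cumulative time in milliseconds spent in all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** One log line with used heap, max heap and cumulative GC time. */
    public static String sample() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return "heap used=" + heap.getUsed()
                + " max=" + heap.getMax() // -1 when no maximum is defined
                + " gcMillis=" + totalGcMillis();
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```

Calling sample() from a scheduled task every few seconds would give a timeline that can be compared against the point where the application stops responding.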
Thread 6632: (state = BLOCKED)
- sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
- java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Compiled frame)
- java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Compiled frame)
- java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=442 (Compiled frame)
- org.apache.tomcat.util.threads.TaskQueue.take() @bci=36, line=104 (Interpreted frame)
- org.apache.tomcat.util.threads.TaskQueue.take() @bci=1, line=32 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor.getTask() @bci=156, line=1068 (Interpreted frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1130 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame)
- java.lang.Thread.run() @bci=11, line=724 (Interpreted frame)
Only a few of these threads were in an “IN_NATIVE” state.
We also took a memory dump using jmap. The top consuming classes look like this:
We are using Grails 2.3.8 with MongoDB Plugin 3.0.1 for GORM. We use Memcached for session sharing on Tomcat 7, and Redis for caching (spring-cache as well as native Redis).
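The same two checks (deadlock detection and a count of threads per state) can also be run in-process. This is a minimal sketch using the standard java.lang.management API, not the tool we used for the dump above; the class name ThreadStateSummary is hypothetical. The states it reports are java.lang.Thread.State values, which may not match the labels printed in the dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper, written for illustration only.
public class ThreadStateSummary {

    /** Counts the live threads of this JVM by java.lang.Thread.State. */
    public static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts =
                new EnumMap<Thread.State, Integer>(Thread.State.class);
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            Integer seen = counts.get(info.getThreadState());
            counts.put(info.getThreadState(), seen == null ? 1 : seen + 1);
        }
        return counts;
    }

    /** Number of threads deadlocked on monitors or ownable synchronizers. */
    public static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length; // null means no deadlock was found
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + deadlockedThreadCount());
        System.out.println("Threads by state: " + countByState());
    }
}
```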
Our server details are:
Server version: Apache Tomcat/7.0.35
Server built: May 24 2013 09:52:20
Server number: 7.0.35.0
OS Name: Linux
OS Version: 3.8.0-19-generic
Architecture: amd64
JVM Version: 1.7.0_25-b15
JVM Vendor: Oracle Corporation
We are really out of ideas on how to fix this. We are looking for help or pointers towards resolving it.