
I have a Java Tomcat application serving web requests. It should execute each request as fast as possible. I receive about 4 requests per second. I've done serious work optimizing the servlet, and it now generally serves each request in less than 100 ms. But sometimes a request takes 200 ms, and if I rerun these slow requests they are again faster than 100 ms. I've enabled GC logging and found that almost every "slow" request is preceded by a CMS-initial-mark that freezes the JVM:

[GC [1 CMS-initial-mark: 1411302K(1415616K)] 1912881K(2029056K), 0.3224110 secs] 
[Times: user=0.00 sys=0.00, real=0.32 secs]

In almost every case the request execution time is the same order of magnitude as the GC pause time. I use an 8-core server with 32 GB of RAM. My JVM settings are:

-Xmx2G -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails

My OS is the latest Debian with default settings. No other processes are running except mongodb. The only process shown in the top output is java, and it uses ~100% of one CPU (I have 8 CPUs, so that is only 1/8 of the total load).

How can I overcome this problem? What else can be diagnosed?

Donald_W
Stepan Yakovenko

3 Answers


That is a surprisingly long time for the initial mark; it suggests something odd is happening. One possibility is a method that is too large to compile (i.e. larger than 8 KB of bytecode).

You could have a large number of tenured objects pointing to short-lived objects, but this is unlikely to cause such a long delay, and it would place a load on all your CPUs.

What is significant is that you appear to be tying up only one CPU, which suggests a very specific problem (like one oversized method).

I would look at increasing the eden size to reduce the number of premature promotions. This might help, or it might make things worse, depending on what you are doing.

In the meantime, I would try the parallel collector. You might find it handles your use case better.
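As a sketch, the two suggestions above could look like the following JVM option lines (the young-generation size here is a guess to be tuned against your own GC logs, not a recommendation):

```
# Larger young generation to cut premature promotion, keeping CMS:
-Xmx2G -Xmn768m -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails

# Or try the throughput (parallel) collector instead of CMS:
-Xmx2G -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:+PrintGCDetails
```

Compare the GC logs before and after each change; only your own workload can tell you which wins.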

BTW, if a request takes 100 ms, you might gain from reducing the amount of garbage you produce. I would look at your allocations in a memory profiler and try to reduce them.
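As a minimal illustration of what "reducing garbage" can mean (the helper names here are hypothetical, not from the question), reusing a per-thread buffer avoids allocating a fresh `StringBuilder` and its backing array on every call:

```java
public class AllocationDemo {
    // Allocates a new StringBuilder (and backing char array) per call.
    public static String garbagey(int id) {
        return new StringBuilder().append("req-").append(id).toString();
    }

    // Reuses one buffer per thread; only the final String is allocated.
    private static final ThreadLocal<StringBuilder> BUF =
            ThreadLocal.withInitial(StringBuilder::new);

    public static String lean(int id) {
        StringBuilder sb = BUF.get();
        sb.setLength(0); // reset instead of reallocating
        return sb.append("req-").append(id).toString();
    }

    public static void main(String[] args) {
        System.out.println(garbagey(7));
        System.out.println(lean(7));
    }
}
```

A profiler will show you which of your own allocation sites dominate; this pattern only pays off on hot paths.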

Peter Lawrey
    This is obviously not a Garbage Collection issue: both user and system CPU times are zero. It probably takes too long to reach a safepoint (which can be checked with `-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1`). But more likely it is an issue with the OS, which does not give CPU time to the Java process because it is busy with something else. – apangin May 09 '14 at 00:49

Note user=0.00 sys=0.00 in your logs. This means it is not the GC that is taking so much time.
The CPU is idle, yet the application freezes, possibly because of an external activity like I/O or swapping.

You've said mongodb is running. Couldn't that be the source of disk I/O? I've experienced similar problems on Linux, with long stop-the-world pauses during intensive disk operations, until I switched to the deadline I/O scheduler.
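For reference, switching the scheduler looks roughly like this (a sketch, assuming the data disk is `/dev/sda`; check your actual device name, and note the change does not survive a reboot unless made permanent via boot parameters):

```
# Show the current scheduler; the active one is in brackets, e.g. [cfq]
cat /sys/block/sda/queue/scheduler

# Switch to deadline at runtime (root required)
echo deadline | sudo tee /sys/block/sda/queue/scheduler
```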

apangin

If anyone is interested, here is how I solved the problem. I added a bit more memory (-Xmx5G). Then I realized that the queries made to mongodb went through an extra Java layer, Morphia, which performs an extra query for each nested document annotated with @Reference. I rewrote the bottleneck mongodb requests and the problem is gone. Thanks for your ideas, anyway.
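The effect described here is the classic N+1 query pattern. This plain-Java sketch (not the Morphia API; the in-memory map stands in for MongoDB) shows why fetching each referenced document separately multiplies the round trips, while one batched lookup keeps them constant:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NPlusOneDemo {
    // Stand-in for the database: counts how many "queries" we issue.
    public static int queries = 0;
    private static final Map<Integer, String> DB = Map.of(1, "a", 2, "b", 3, "c");

    // One query per referenced id -> the N+1 pattern.
    public static List<String> loadOneByOne(List<Integer> ids) {
        List<String> out = new ArrayList<>();
        for (int id : ids) {
            queries++; // separate round trip per nested document
            out.add(DB.get(id));
        }
        return out;
    }

    // One query for all ids (e.g. an $in query in MongoDB terms).
    public static List<String> loadBatched(List<Integer> ids) {
        queries++; // a single round trip
        List<String> out = new ArrayList<>();
        for (int id : ids) out.add(DB.get(id));
        return out;
    }

    public static void main(String[] args) {
        queries = 0;
        loadOneByOne(List.of(1, 2, 3));
        System.out.println("one-by-one: " + queries + " queries"); // 3
        queries = 0;
        loadBatched(List.of(1, 2, 3));
        System.out.println("batched: " + queries + " query");      // 1
    }
}
```

Rewriting the bottleneck requests to fetch nested documents in one batch, or embedding them instead of referencing, removes those extra round trips.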

Stepan Yakovenko