3

I have a java program that downloads information and parses it, storing results in a MySQL database. It can be started multiple times, up to 3 at the same time.

I was finding that the server load was going sky high (> 10) at certain points and while I still need to tune my MySQL server, I found that the java programs were halting or GC a lot and the server was grinding to an absolute stop.

After some reading of GC configuration articles, I'm using the following parameters with the intention of reducing the number of full GC sweeps that take place (or at least stop them halting the program):

-d64 
-Xms256m 
-Xmx256m 
-XX:PermSize=128m 
-XX:MaxPermSize=128m 
-server 
-XX:+PrintGCTimeStamps 
-Xloggc:/log/gc_$1_$2.log 
-XX:+UseConcMarkSweepGC 
-verbose:gc 
-XX:-UseAdaptiveSizePolicy

However, now I'm seeing almost constant Full GC on each run. Here's an excerpt of the log:

1057.920: [Full GC 30851K->29270K(258880K), 0.1528210 secs]
1058.965: [Full GC 30303K->29297K(258880K), 0.2060580 secs]
1062.652: [Full GC 29839K->29353K(258880K), 0.2875420 secs]
1063.639: [Full GC 30894K->29356K(258880K), 0.1839510 secs]
1070.604: [Full GC 33956K->29262K(258880K), 0.3445180 secs]
1072.146: [Full GC 29770K->29262K(258880K), 0.2201660 secs]
1074.600: [Full GC 30295K->29321K(258880K), 0.1319390 secs]
1081.234: [Full GC 31373K->29293K(258880K), 0.1931140 secs]
1086.498: [Full GC 30591K->29347K(258880K), 0.2624270 secs]

There's a Full GC pretty much every second. Have I chosen the wrong parameters or is there something else I can use to stop this?

Update:

Here's the first 10mins of a run's gc log output with -XX:+PrintGCDetails. I also removed -XX:-UseAdaptiveSizePolicy

4.794: [GC 5.367: [ParNew: 34112K->4224K(38336K), 0.0814000 secs] 34112K->8605K(520064K), 0.0816920 secs] [Times: user=0.04 sys=0.03, real=0.65 secs] 
5.843: [GC 5.843: [ParNew: 38336K->4224K(38336K), 0.0371760 secs] 42717K->15987K(520064K), 0.0372770 secs] [Times: user=0.02 sys=0.01, real=0.04 secs] 
6.153: [GC 6.153: [ParNew: 38336K->4224K(38336K), 0.0333900 secs] 50099K->24554K(520064K), 0.0334970 secs] [Times: user=0.02 sys=0.02, real=0.04 secs] 
15.668: [GC 15.668: [ParNew: 38336K->4224K(38336K), 0.0471250 secs] 58666K->30766K(520064K), 0.0472230 secs] [Times: user=0.06 sys=0.00, real=0.05 secs] 
220.547: [Full GC (System) 220.548: [CMS: 26542K->31455K(481728K), 3.3996650 secs] 57025K->31455K(520064K), [CMS Perm : 20869K->20848K(131072K)], 3.3998940 secs] [Times: user=0.38 sys=0.05, real=3.40 secs] 
448.238: [Full GC (System) 448.264: [CMS: 31455K->31236K(481728K), 2.3479700 secs] 32783K->31236K(520064K), [CMS Perm : 20898K->20895K(131072K)], 2.5008180 secs] [Times: user=0.30 sys=0.03, real=2.53 secs] 
456.217: [Full GC (System) 456.217: [CMS: 31236K->31177K(481728K), 0.2216600 secs] 33856K->31177K(520064K), [CMS Perm : 20912K->20910K(131072K)], 0.2218030 secs] [Times: user=0.21 sys=0.00, real=0.22 secs] 
480.982: [Full GC (System) 480.982: [CMS: 31177K->31251K(481728K), 0.2026140 secs] 32586K->31251K(520064K), [CMS Perm : 20915K->20912K(131072K)], 0.2027550 secs] [Times: user=0.19 sys=0.00, real=0.20 secs] 
483.419: [Full GC (System) 483.419: [CMS: 31251K->31196K(481728K), 0.2436580 secs] 33189K->31196K(520064K), [CMS Perm : 20928K->20926K(131072K)], 0.2438080 secs] [Times: user=0.23 sys=0.00, real=0.24 secs] 
484.583: [Full GC (System) 484.583: [CMS: 31196K->31233K(481728K), 0.2706090 secs] 31842K->31233K(520064K), [CMS Perm : 20980K->20973K(131072K)], 0.2707580 secs] [Times: user=0.25 sys=0.01, real=0.27 secs] 
486.361: [Full GC (System) 486.424: [CMS: 31233K->31235K(481728K), 0.2614690 secs] 32866K->31235K(520064K), [CMS Perm : 20994K->20993K(131072K)], 0.2618060 secs] [Times: user=0.21 sys=0.02, real=0.33 secs] 
521.280: [Full GC (System) 521.280: [CMS: 31235K->31163K(481728K), 0.2190910 secs] 34502K->31163K(520064K), [CMS Perm : 21010K->21009K(131072K)], 0.2192190 secs] [Times: user=0.15 sys=0.00, real=0.22 secs] 
531.169: [Full GC (System) 531.169: [CMS: 31163K->29453K(481728K), 0.2923910 secs] 32492K->29453K(520064K), [CMS Perm : 21015K->20845K(131072K)], 0.2925890 secs] [Times: user=0.27 sys=0.01, real=0.29 secs] 
532.270: [Full GC (System) 532.271: [CMS: 29453K->29384K(481728K), 0.2542390 secs] 32073K->29384K(520064K), [CMS Perm : 20852K->20850K(131072K)], 0.2780520 secs] [Times: user=0.20 sys=0.01, real=0.28 secs] 
533.320: [Full GC (System) 533.321: [CMS: 29384K->29379K(481728K), 0.2429570 secs] 30030K->29379K(520064K), [CMS Perm : 20853K->20851K(131072K)], 0.2430970 secs] [Times: user=0.17 sys=0.00, real=0.24 secs] 
542.963: [Full GC (System) 542.963: [CMS: 29379K->29443K(481728K), 0.2208060 secs] 30708K->29443K(520064K), [CMS Perm : 20854K->20851K(131072K)], 0.2209180 secs] [Times: user=0.11 sys=0.01, real=0.22 secs] 
544.949: [Full GC (System) 544.949: [CMS: 29443K->29405K(481728K), 0.1397210 secs] 31382K->29405K(520064K), [CMS Perm : 20855K->20853K(131072K)], 1.0028950 secs] [Times: user=0.14 sys=0.00, real=1.00 secs] 
546.517: [Full GC (System) 546.517: [CMS: 29405K->29464K(481728K), 0.2366270 secs] 30051K->29464K(520064K), [CMS Perm : 20858K->20857K(131072K)], 0.2367790 secs] [Times: user=0.20 sys=0.00, real=0.23 secs] 
547.508: [Full GC (System) 547.508: [CMS: 29464K->29467K(481728K), 0.1974860 secs] 30756K->29467K(520064K), [CMS Perm : 20859K->20858K(131072K)], 0.1976260 secs] [Times: user=0.17 sys=0.01, real=0.19 secs] 
560.975: [Full GC (System) 560.975: [CMS: 29467K->29382K(481728K), 0.1750130 secs] 32092K->29382K(520064K), [CMS Perm : 20865K->20864K(131072K)], 0.1751500 secs] [Times: user=0.17 sys=0.00, real=0.17 secs] 
573.975: [Full GC (System) 573.975: [CMS: 29382K->29451K(481728K), 0.2862430 secs] 30711K->29451K(520064K), [CMS Perm : 20868K->20865K(131072K)], 0.2863970 secs] [Times: user=0.19 sys=0.00, real=0.28 secs] 
575.101: [Full GC (System) 575.101: [CMS: 29451K->29381K(481728K), 0.1325170 secs] 31390K->29381K(520064K), [CMS Perm : 20869K->20867K(131072K)], 0.1326370 secs] [Times: user=0.13 sys=0.00, real=0.14 secs] 
580.396: [Full GC (System) 580.396: [CMS: 29381K->29454K(481728K), 0.1913770 secs] 30709K->29454K(520064K), [CMS Perm : 20875K->20872K(131072K)], 0.1928260 secs] [Times: user=0.19 sys=0.00, real=0.19 secs] 
582.300: [Full GC (System) 582.300: [CMS: 29454K->29403K(481728K), 0.1238430 secs] 31392K->29403K(520064K), [CMS Perm : 20875K->20873K(131072K)], 0.1239580 secs] [Times: user=0.12 sys=0.00, real=0.13 secs] 
582.982: [Full GC (System) 582.982: [CMS: 29403K->29457K(481728K), 0.1238720 secs] 30049K->29457K(520064K), [CMS Perm : 20875K->20874K(131072K)], 0.1239860 secs] [Times: user=0.12 sys=0.00, real=0.13 secs] 
584.284: [Full GC (System) 584.284: [CMS: 29457K->29461K(481728K), 0.1820700 secs] 31431K->29461K(520064K), [CMS Perm : 20875K->20874K(131072K)], 0.1823370 secs] [Times: user=0.16 sys=0.01, real=0.18 secs] 
592.363: [Full GC (System) 592.364: [CMS: 29461K->29385K(481728K), 0.2151030 secs] 32081K->29385K(520064K), [CMS Perm : 20892K->20890K(131072K)], 0.2153350 secs] [Times: user=0.18 sys=0.02, real=0.22 secs] 
599.096: [Full GC (System) 599.096: [CMS: 29385K->29408K(481728K), 0.1319160 secs] 30713K->29408K(520064K), [CMS Perm : 20894K->20890K(131072K)], 0.1320350 secs] [Times: user=0.13 sys=0.00, real=0.14 secs]
Cyntech
  • 5,362
  • 6
  • 33
  • 47
  • 3
    You're assigned quite a small amount of memory to your java process (256mb max) is that intentional (hw limit?) - if not I'd increase the max heap size a bit and see if it makes it better. How much memory is used/needed for the operation (downloading content and processing sounds like it could be some bytes allocated...) – Peter Svensson Dec 10 '12 at 06:14
  • @PeterLiljenberg We have 4gb memory in the server, 3 of which is given to the MySQL server. I chose 256mb for the max heap size because I didn't think it would need that much. I'll increase it to 512mb and see if that has a positive effect. – Cyntech Dec 10 '12 at 12:22
  • Are you calling `System.gc()` somewhere in your code, because as I read it, it is full gc-ing from 30MB to 29MB. – Mark Rotteveel Dec 10 '12 at 19:49
  • @MarkRotteveel That's how I read it too. No, I don't call `System.gc()` anywhere. – Cyntech Dec 10 '12 at 22:41
  • 1
    @Cyntech Please, add `-XX:+DisableExplicitGC` just to rule out the `System.gc()` possibility. From the updated logs it really looks like somebody is calling it explicitly. – Aleš Dec 11 '12 at 03:59
  • System.gc() calls can come from anything using byte buffers to allocate out of heap memory. – Chris Lohfink Oct 16 '13 at 16:56

3 Answers3

3

First, use -XX:+PrintGCDetails to see the full set of GC stats for each collection. Then you should compute the size of Live Data Set, Allocation Rate, and Promotion Rate. This will tell you if you need a bigger Heap or if your eg. Young Gen is too small. See Advanced JVM Tuning for tips howto compute these stats. After you have these stats, you can Tune your Young generation with -XX:InitialSurvivorRatio=3 -XX:SurvivorRatio=3 -XX:TargetSurvivorRatio=90 and sizes with e.g. -Xmn -Xmx -Xms. Adjust the numbers to your Allocation/Promotion rate.

Don't use -XX:-UseAdaptiveSizePolicy just yet. This option is to disable automatic GC ergonomics so that you can tune the Heap parameters. But if you do not know the allocation behavior of your app, then this parameter won't help. Further, the Adaptive Size Policy is enabled only for the Throughput collector (-XX:+UseParallelOldGC).

Also, If you want to tune your GC for throughput, you should be using the -XX:+UseParallelOldGC and e.g. -XX:ParallelGCThreads=4. This will enable multithreaded Young/Old Generation GCs (provided that you run on multicore).

Further, setup the tenuring with -XX:MaxTenuringThreshold=15 and check with -XX:+PrintTenuringDistribution.

UPDATE:

Looking at the logs, please, could you use also -XX:+DisableExplicitGC, because Full GC (System) means that somebody is calling System.gc().

Aleš
  • 8,896
  • 8
  • 62
  • 107
  • 1
    As far as I know if you specify `-Xms` and `-Xmx` with the same value, then adaptive size policy is disabled anyway, because the JVM has no 'room' to tune its parameters. – Mark Rotteveel Dec 10 '12 at 19:45
  • 1
    I've updated the post with detailed gc stats. There's no mention of PSYoungGen GC to calculate the allocation rate, unless you're referring to the ParNew GC? The Full GC's seem to be CMS related and what confuses me is that it's not going anywhere near the upper limit of the allocated memory yet still doing a Full GC... – Cyntech Dec 11 '12 at 01:47
  • Looking at the logs, please, could you use also `-XX:+DisableExplicitGC`, because **Full GC (System)** means that somebody is calling `System.gc()`. – Aleš Dec 11 '12 at 03:26
  • @MarkRotteveel interesting comment but I think that just specifying the `-Xms` and `-Xmx` does not have the same effect as disabling the adaptive size policy explicitly. The policy is used also for Young Gen and Survivor sizes. Also, some of the parameters cannot be used with the Policy on, eg. `-XX:SurvivorRatio=`. So, although its almost the same, its not the same. :-) – Aleš Dec 11 '12 at 03:43
  • I seem to remember reading that if -Xmx and -Xms are the same, then the total of young generation + survivor spaces are also fixed in size and only their relative size will be tuned (it might be different with G1 see http://weblogs.java.net/blog/kcpeppe/archive/2011/12/19/setting-xmx-xms-still-considered-harmful ) – Mark Rotteveel Dec 11 '12 at 08:44
  • Sharakan also came up with the same conclusion (below). I'm not calling `System.gc()` but I'm using a couple of libraries (mysql connector, commons-io, oauth-signpost, slf4j/logback and Simon Tuff's One-Jar). – Cyntech Dec 12 '12 at 00:29
  • Yeah, so you should be able to use `-XX:+DisableExplicitGC` and see if that helps. – Aleš Dec 12 '12 at 05:15
1

Based on those parameters, you shouldn't be hitting a full GC based on your old generation running low on space, as the old gen should have a capacity of over 200MB, and you're only apparently using 30MB.

The CMS collector can give you more information about what's going on if you use the PrintGCDetails flag. Add this to your parameters: -XX:+PrintGCDetails.

My first thought was that you would fine that the full GC's are caused by concurrent mode failures. Some reading about those:

Edit:

Based on your detailed GC logs, and some further looking about, it seems that when the logs says Full GC (System), it's an indication of an explicit GC invocation. If you're not calling it yourself, perhaps a library you're using is?

The answer there refers to making a custom Runtime class that can help you track down who's calling GC. One way to do that in Java 6 is to follow these instructions to add things to your bootclasspath.

Community
  • 1
  • 1
sharakan
  • 6,821
  • 1
  • 34
  • 61
  • I've added detailed gc logging. There's mention of CMS GC but nothing of a concurrent mode failure. – Cyntech Dec 11 '12 at 01:48
  • Right, but now it's saying 'Full GC (System)' which implies an invocation of System.gc(). Updated my answer. – sharakan Dec 11 '12 at 14:52
  • It's entirely possible, however, writing a custom runtime class might be a bit tricky as I'm using One-Jar to package my program into an executable jar. Other libraries I'm using are mysql connector, commons-io, oauth-signpost, and slf4j/logback). – Cyntech Dec 12 '12 at 00:30
  • @Cyntech You should be able to use the bootclasspath approach even with one-jar, because you still have access to the Java command line prompt. You'd want to build the runtime jar separately, but when you run your jar, just instead of java -jar my-app.jar (from the One-Jar site) use java -Xbootclasspath:runtime.jar -jar my-app.jar. That said, -XX:+DisableExplicitGC should also do the trick. – sharakan Dec 12 '12 at 04:33
1

Memory is cheap, and a young gen is much cheaper to do than an old gen GC.

Try maybe 2 or 4 GB heap... should be much fewer full GCs (maybe 0 if your app is well behaved enough). A parnew should only take a few millis.

bwawok
  • 14,898
  • 7
  • 32
  • 43