I have a service that reads data from a source, performs some conversion to the data and then uploads the converted to a destination. When picking GC algorithm, I am looking for the one with high throughput and that was why I picked parallel GC. The part that confused me a lot is why I saw a good amount of Full GC. The nature of the service makes most of the objects short living as data comes and goes. Here is my GC config:
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -verbose:GC -XX:+UseParallelGC -XX:NewSize=21200m -XX:MaxNewSize=21200m -server -Xms31200m -Xmx31200m
Basically, I set the total heap size to 30GB and the young gen size to 20GB.
Here is a piece of the GC log:
2020-11-08T08:31:07.863+0000: 215.876: [Full GC (Ergonomics) [PSYoungGen: 1233347K->0K(18729472K)] [ParOldGen: 9065862K->6633660K(10240000K)] 10299209K->6633660K(28969472K), [Metaspace: 107588K->107588K(1144832K)], 1.1350824 secs] [Times: user=21.03 sys=0.00, real=1.14 secs]
2020-11-08T08:31:10.627+0000: 218.640: [GC (GCLocker Initiated GC)
Desired survivor size 2699034624 bytes, new threshold 1 (max 15)
[PSYoungGen: 15874560K->1274938K(19073024K)] 22513996K->7914375K(29313024K), 0.1073842 secs] [Times: user=3.10 sys=0.00, real=0.11 secs]
2020-11-08T08:31:12.319+0000: 220.331: [GC (GCLocker Initiated GC)
Desired survivor size 2587885568 bytes, new threshold 1 (max 15)
[PSYoungGen: 17602106K->1307000K(18962944K)] 24253865K->8618788K(29202944K), 0.2492961 secs] [Times: user=7.16 sys=0.00, real=0.25 secs]
2020-11-08T08:31:14.197+0000: 222.210: [GC (GCLocker Initiated GC)
Desired survivor size 2480930816 bytes, new threshold 1 (max 15)
[PSYoungGen: 17634168K->1333816K(19286016K)] 24952891K->9297010K(29526016K), 0.2524904 secs] [Times: user=7.07 sys=0.00, real=0.25 secs]
2020-11-08T08:31:16.165+0000: 224.178: [GC (GCLocker Initiated GC)
Desired survivor size 2386558976 bytes, new threshold 1 (max 15)
[PSYoungGen: 18092600K->1313137K(19181568K)] 26062932K->9992006K(29421568K), 0.2845171 secs] [Times: user=7.85 sys=0.00, real=0.29 secs]
2020-11-08T08:31:18.084+0000: 226.096: [GC (GCLocker Initiated GC)
Desired survivor size 2312110080 bytes, new threshold 1 (max 15)
[PSYoungGen: 18071921K->1242981K(19450880K)] 26751020K->10584632K(29690880K), 0.2523254 secs] [Times: user=6.79 sys=0.00, real=0.26 secs]
2020-11-08T08:31:18.336+0000: 226.349: [Full GC (Ergonomics) [PSYoungGen: 1242981K->0K(19450880K)] [ParOldGen: 9341651K->6896991K(10240000K)] 10584632K->6896991K(29690880K), [Metaspace: 107625K->107625K(1144832K)], 1.0198299 secs] [Times: user=18.34 sys=0.08, real=1.02 secs]
2020-11-08T08:31:21.049+0000: 229.062: [GC (GCLocker Initiated GC)
Desired survivor size 2221408256 bytes, new threshold 1 (max 15)
[PSYoungGen: 17120256K->1356565K(19378176K)] 24043241K->8279559K(29618176K), 0.1089915 secs] [Times: user=3.38 sys=0.00, real=0.11 secs]
2020-11-08T08:31:22.887+0000: 230.899: [GC (GCLocker Initiated GC)
Desired survivor size 2155872256 bytes, new threshold 1 (max 15)
[PSYoungGen: 18476821K->1265473K(19603456K)] 25426058K->8896652K(29843456K), 0.2524566 secs] [Times: user=7.14 sys=0.00, real=0.25 secs]
2020-11-08T08:31:24.888+0000: 232.901: [GC (GCLocker Initiated GC)
Desired survivor size 2092433408 bytes, new threshold 1 (max 15)
[PSYoungGen: 18699585K->1388375K(19539456K)] 26345045K->9562491K(29779456K), 0.2113546 secs] [Times: user=5.59 sys=0.00, real=0.21 secs]
2020-11-08T08:31:26.819+0000: 234.832: [GC (GCLocker Initiated GC)
Desired survivor size 2030043136 bytes, new threshold 1 (max 15)
[PSYoungGen: 18822487K->1308016K(19726336K)] 27003840K->10002863K(29966336K), 0.2078162 secs] [Times: user=6.10 sys=0.00, real=0.21 secs]
2020-11-08T08:31:28.868+0000: 236.881: [GC (GCLocker Initiated GC)
Desired survivor size 2030043136 bytes, new threshold 1 (max 15)
[PSYoungGen: 18990960K->1521040K(19665408K)] 27712283K->10780549K(29905408K), 0.2373748 secs] [Times: user=6.60 sys=0.00, real=0.23 secs]
2020-11-08T08:31:29.106+0000: 237.119: [Full GC (Ergonomics) [PSYoungGen: 1521040K->0K(19665408K)] [ParOldGen: 9259509K->7378423K(10240000K)] 10780549K->7378423K(29905408K), [Metaspace: 107653K->107653K(1144832K)], 1.0809680 secs] [Times: user=20.55 sys=0.00, real=1.09 secs]
There are a couple of things from the log that really confused me:
- How does JVM figure out the Desired survivor size? Why it was around 2.5 GB? Why it changed a little bit in each soft GC? Why did total old gen size never changed (10240000K) but the total young gen size changed all the time?
- Why the *new threshold was always 1? Wasn't this just way too aggressive to move things into old gen?
- After each soft GC, the young gen most likely had around 1.3GB of data and some amount of the data were moved to old gen. That caused old gen to gradually get full and Full GC eventually happened to cleanup old gen. Why a portion of data were moved to old gen in each soft GC? Looks like the survivor space is big enough.
- What I can do to avoid unnecessary Full GC so that the overall throughput can be improved?