
I've developed a web application which processes a huge amount of data and takes a long time to complete.

So now I am profiling my application, and I noticed one very bad thing about the GC: when a full GC occurs, it stops all processing for 30-40 seconds.

I wonder if there is any way to improve this. I don't want to waste that much CPU time on GC alone. Below are some details that may be useful:

  1. I am using Java 1.6.0_23.
  2. My application uses at most 20 GB of memory.
  3. A full GC occurs roughly every 14 minutes.
  4. Memory before GC is 20 GB and after GC is 7.8 GB.
  5. Memory usage as shown in Task Manager is 41 GB.
  6. After the process completes (the JVM is still running), used memory is 5 GB and free memory is 15 GB.
NIVESH SENGAR

5 Answers


There are many algorithms that modern JVMs use for garbage collection. Some algorithms, such as reference counting, are very fast, while others, such as memory copying, are quite slow. You could change your code to help the JVM use the faster algorithms most of the time.

One of the fastest algorithms is reference counting: as the name suggests, it counts the references to an object, and when the count reaches zero the object is ready for garbage collection; after that, the reference counts of the objects it referenced are decremented in turn.

To help the JVM use this algorithm, avoid circular references (object A references B, B references C, C references D ..., and Z references A again), because even when the whole object graph is unreachable, none of the objects' reference counters ever reaches zero.

You can simply break the circle when you no longer need the objects in it (by assigning null to one of the references), as sketched below.
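A minimal sketch of this idea, using a hypothetical `Node` class: the cycle is broken by nulling one link before the external references are dropped.

```java
// Hypothetical Node class used only to illustrate a reference cycle.
class Node {
    Node next;
    byte[] payload = new byte[1024];
}

public class CycleDemo {
    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        Node c = new Node();
        a.next = b;
        b.next = c;
        c.next = a;   // cycle: a -> b -> c -> a

        // ... use the nodes ...

        // Break the cycle before dropping the external references, so that
        // no object in the group keeps another one's reference count above zero.
        c.next = null;

        a = null;
        b = null;
        c = null;
        // All three Node instances are now eligible for collection.
    }
}
```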

Amir Pashazadeh
  • While it is true there are numerous GC mechanisms, and reference counting is one of them, it's long since been relegated to disuse. The current JVM uses several different techniques to bypass the kind of reference cycles you identify. The first and simplest is "live copy". This doesn't look at all allocated objects. Instead, it starts at the thread's associated objects (roots) and then begins copying all the reachable objects to a new space. And once completed, it just deallocates anything that's left. So, any sort of reference cycles which don't have a path from a thread root go away. – chaotic3quilibrium Mar 09 '12 at 20:36
  • I know that (I referred to it as memory copying), but that's one of the most time-consuming GC mechanisms, compared to reference counting. So if one helps the GC use a reference-counting mechanism instead of live copy, it can prevent the slow live-copy mechanism. – Amir Pashazadeh Mar 11 '12 at 15:59
  • I used to think what you are thinking. I would go spend some time reviewing the design documentation and discussions around the current JVM GCs. They are designed very strongly around typical garbage generation patterns. And while reference counting sounds desirable, it ends up only being so in very unique and unusual cases. Basically, I would have to pervert my design just to avoid ANY cycles in any of my reference pathways. Of course it's possible. However, it is also extremely limiting on all but the simplest of OO models (at least in my world, anyway). – chaotic3quilibrium Mar 11 '12 at 21:50

If you use a 64-bit architecture, add:

-XX:+UseCompressedOops (64-bit object references are compressed to 32-bit)

Use G1GC instead of CMS:

-XX:+UseG1GC (it collects in incremental steps)

Set the same initial and maximum heap size: -Xms5g -Xmx5g

Tune the pause-time parameters (just an example):

-XX:MaxGCPauseMillis=100 -XX:GCPauseIntervalMillis=1000

See the Java HotSpot VM Options, Performance Options section.
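Putting the flags above together, a launch command might look like the following sketch. The heap is sized to the 20 GB mentioned in the question rather than the 5 GB example above, `-verbose:gc` is added so the effect on pause times can be checked in the GC log, and `myapp.jar` is a placeholder.

```
java -Xms20g -Xmx20g \
     -XX:+UseCompressedOops \
     -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=100 \
     -XX:GCPauseIntervalMillis=1000 \
     -verbose:gc \
     -jar myapp.jar
```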

Andrzej Jozwik

Either improve the app by reusing resources, or kick in System.gc() yourself in some critical regions of the app (which is not guaranteed to help you), as in the sketch below. Most likely you have a memory leak somewhere that you have to investigate, and consequently restructure the code.
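As a rough illustration only (the method names are hypothetical), requesting a collection in a "critical region" could look like this. As the comment below points out, System.gc() is merely a request the JVM may ignore.

```java
public class BatchRunner {
    public static void main(String[] args) {
        for (int batch = 0; batch < 10; batch++) {
            processBatch(batch);
            // Request a collection at a quiet point between batches.
            // This is only a hint; the JVM is free to ignore it entirely.
            System.gc();
        }
    }

    // Hypothetical heavy work that produces a lot of short-lived garbage.
    static void processBatch(int batch) {
        byte[][] scratch = new byte[1000][];
        for (int i = 0; i < scratch.length; i++) {
            scratch[i] = new byte[64 * 1024];
        }
        System.out.println("finished batch " + batch);
        // 'scratch' becomes unreachable when this method returns.
    }
}
```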

darijan
  • Calling System.gc() yourself is pretty much always a bad idea. If you're relying on that call for performance reasons, it's a good indicator that there's something wrong with the code. Also, it's not even guaranteed to do anything, i.e. it's just a request that the JVM can ignore completely. – Black Mar 03 '14 at 23:25

The amount of time spent in GC depends on two factors:

  • How many objects are live (i.e. can still be reached from somewhere)
  • How many dead objects implement finalize()

Objects which can't be reached and which don't use finalize() cost nothing to clean up in Java, which is why Java is usually on par with other languages like C++ (and often much better, because C++ spends a lot of time deleting objects).

So what you need to do in your app is cut down on the number of objects that survive and/or cut references to objects (that you no longer need) earlier in the code. Example:

When you have a very long method, you keep all the objects referenced from its local variables alive until it returns. If you split that method into many smaller methods, the references are lost sooner and the GC won't have to deal with those objects.

If you put everything that you might need in huge hash maps, the maps will keep all those instances alive until your code completes. So even when you don't need those anymore, the GC will still have to spend time on them.
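A small sketch of both points (all names are hypothetical): evicting a cache entry as soon as it is no longer needed, and doing work in small methods so temporaries become unreachable sooner.

```java
import java.util.HashMap;
import java.util.Map;

public class ScopeDemo {

    // Hypothetical cache; anything kept in here stays live until it is removed.
    private static final Map<String, byte[]> CACHE = new HashMap<String, byte[]>();

    public static void main(String[] args) {
        loadIntoCache("report-2012");
        process("report-2012");
        // Evict the entry as soon as it is no longer needed,
        // so the GC no longer has to keep tracing the cached data.
        CACHE.remove("report-2012");
    }

    static void loadIntoCache(String key) {
        CACHE.put(key, new byte[10 * 1024 * 1024]);  // pretend this is expensive data
    }

    // Instead of one huge method whose locals pin objects until the very end,
    // do the work in small steps so temporaries become unreachable sooner.
    static void process(String key) {
        int checksum = computeChecksum(CACHE.get(key));
        // The byte[] is no longer referenced by any local variable here,
        // so once the cache entry is removed nothing keeps it alive.
        report(checksum);
    }

    static int computeChecksum(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum += b;
        }
        return sum;
    }

    static void report(int checksum) {
        System.out.println("checksum = " + checksum);
    }
}
```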

Aaron Digulla
  • Thanks Aaron, the points you mention here may be the problem in my application, because there are some huge methods and I am using many Maps and Lists to cache data. But I have to cache the data because I use it very frequently. – NIVESH SENGAR Mar 09 '12 at 10:27
  • So GC times get worse and worse the more objects you have alive, even if none are becoming dead? – weston Mar 09 '12 at 13:12
  • @weston: Well, live != dead. Dead objects don't count unless they have a `finalize()` method (which the GC has to call before it can forget about them) or they lived a long time (and were moved out of eden space). – Aaron Digulla Mar 09 '12 at 16:32
  • @weston The short simple answer is "yes". The longer answer is "depends". The GC becomes more aggressive as the body of live objects grows past 80% of total available space. This is done in an attempt to thwart a possible future out of memory problem. So, on systems which need to have long-lived objects, other kinds of caching strategies are used; ones that are much less memory expensive with the trade-off being accessing the data is much slower. – chaotic3quilibrium Mar 09 '12 at 20:31
  • @chaotic3quilibrium Well I don't know the limit of this guy's system, but his app is using 20GB, so unless he's got some 32GB monster server then I guess it will be in aggressive mode! – weston Mar 10 '12 at 10:03
  • @AaronDigulla I think I get it now: short-lived objects clean up easily. I'd like to share this link if anyone is interested in more detail on how it works: http://www.ibm.com/developerworks/java/library/j-jtp11253/ (Your JVM GC may vary). – weston Mar 10 '12 at 10:13
  • @weston Yeah. You're right. Without looking at the design, I would imagine attempting to break the application into separate processes to reduce the GC as the bottleneck. For example, caches that were static or largely static (very small change), I would isolate into another VM. That would pull down some of the memory searched by the GC in the main app. I would repeat this pattern until the main app was mostly calling other sub-apps, delegating to these "sub-apps" and then tuning those. – chaotic3quilibrium Mar 11 '12 at 21:54

The fewer things you `new`, the fewer things need to be collected.

Suppose you have class A. You can include in it a reference to another instance of class A. That way you can make a "free list" of instances of A. Whenever you need an A, just pop one off the free list. If the free list is empty, then `new` one.

When you no longer need it, push it on the free list.

This can save a lot of time.
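A minimal single-threaded sketch of such a free list (the class names are hypothetical); a real pool would also need to consider thread safety and a size bound.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FreeListDemo {

    // The pooled object; here it just wraps a reusable buffer.
    static class Buffer {
        final byte[] data = new byte[4096];
    }

    // A simple single-threaded free list of Buffer instances.
    static class BufferPool {
        private final Deque<Buffer> free = new ArrayDeque<Buffer>();

        Buffer acquire() {
            Buffer b = free.poll();
            return (b != null) ? b : new Buffer();  // allocate only when the list is empty
        }

        void release(Buffer b) {
            free.push(b);  // hand the instance back for later reuse
        }
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool();
        for (int i = 0; i < 1000000; i++) {
            Buffer b = pool.acquire();
            // ... fill and use b.data ...
            pool.release(b);  // nothing new to collect on this iteration
        }
        System.out.println("done");
    }
}
```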

Mike Dunlavey
  • I've read that "Object Pooling" can be an anti-pattern in all but cases where it is expensive to construct the object; see here: http://www.ibm.com/developerworks/java/library/j-jtp01274/index.html under "Object Pooling" – weston Mar 09 '12 at 14:23
  • Apache Commons Pool provides a simple framework for object pooling, but if the objects are simple (i.e. not expensive to construct), there is no gain. By the way, if the object graph to be created contains lots of objects and the graph structure is always the same, maybe that could help. – Amir Pashazadeh Mar 09 '12 at 15:01
  • @weston: I only do it after manual stack sampling proves allocation and deallocation takes a major percentage of time. Also, people write what they feel like writing, so if somebody says something is an anti-pattern that doesn't mean it is. It's good to reserve your own judgement. – Mike Dunlavey Mar 09 '12 at 15:09
  • `new` almost doesn't contribute to GC times. You can create a list with millions of objects (that will take some time) and then set the reference to the list to `null` - when the GC runs the next time, it will take just the same time as if the list was never created. Unreachable objects don't contribute to GC times. – Aaron Digulla Mar 09 '12 at 16:34
  • @AaronDigulla: Why is that? You mean I can create a massive memory leak, and it won't bother the GC even a little? I guess that's amazing :) – Mike Dunlavey Mar 09 '12 at 17:31
  • Fair enough, and the guy didn't say it was a bad idea, but that it should be reserved for those times where it does have an actual benefit, which is what you have now said. PS, glad you and @AaronDigulla are discussing this, because you seem to have completely opposing views on this. One side says more objects = slower GC, and the other side says keep objects around to improve GC! – weston Mar 10 '12 at 10:00
  • PS: I was paraphrasing when I said it was an anti-pattern. The guy just said "...in many cases these techniques can do more harm than good to your program's performance." – weston Mar 10 '12 at 10:09
  • @weston: Look, here's what I do. I take samples manually. (I could explain why that is effective.) If I take, say, 10 samples, and 2 or more of them are inside `new`, then if I could pool the objects, I would save 20% of time, give or take. `new` of non-stack objects takes hundreds of instructions, GC or no GC. – Mike Dunlavey Mar 10 '12 at 14:15
  • How do you take a stack sample? Just pause the program and see where it is? – weston Mar 10 '12 at 19:32
  • @weston: Sure. If something would save you 20% of time if you fixed it, you will see it on 20% of samples, more or less, and the difference between that and profiling is no profiler can match the human eye for recognizing useful patterns. And the interesting thing is, you don't need a lot of samples, because if you see something you can fix on as few as 2 samples, you've got a live one. *[More on that.](http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343)* – Mike Dunlavey Mar 11 '12 at 01:35