I tried to increase my heap memory like this:

-Xmx9g -Xms8g

To be honest, just because I can.

Now I'm wondering why the JVM doesn't use more of it and schedule the GC less frequently.

[Image: heap usage graph]

System:
JVM: Java HotSpot(TM) 64-Bit Server VM (24.51-b03, mixed mode)
Java: version 1.7.0_51, vendor Oracle Corporation

Edit:
I want to improve my configuration for a modelling process (throughput beats responsiveness).

Franz Ebner
  • What happens when every application made has the mindset - because I can, I take large heaps? Not too much heap left. – ChiefTwoPencils Jun 10 '14 at 09:24
  • Do you have any other GC settings on your VM? Is there a reason you want to use all of the heap before GC? Many application specifics affect GC, such as lots of short-lived objects, which can be collected often without a full pause. Why wait until the bath is overflowing before turning the taps off? – James Jun 10 '14 at 09:32
  • Please provide any GC specific settings that you might have. And are you seeing any significant performance impact with GC being run like this? – Praba Jun 10 '14 at 09:48

2 Answers

The HotSpot JVM in Java 1.7 separates the heap into a few spaces, and the relevant ones for this discussion are:

  • Eden, where new objects go
  • Survivor, where objects go if they're needed after Eden is GC'ed

When you allocate a new object, it's simply appended into the Eden space. Once Eden is full, it's cleaned in what's known as a "minor collection." Objects in Eden that are still reachable are copied to Survivor, and then Eden is wiped (thus collecting any objects that weren't copied over).
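To see these spaces on your own JVM, here is a minimal sketch that lists the heap memory pools through the standard java.lang.management API. The class name HeapPools and the method describeHeapPools are made up for illustration. Pool names depend on the collector in use (with the parallel collector they typically look like "PS Eden Space", "PS Survivor Space" and "PS Old Gen"), so treat the exact output as JVM-specific.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original answer.
class HeapPools {

  // Returns one line per heap memory pool: name, used bytes, max bytes.
  static List<String> describeHeapPools() {
    List<String> lines = new ArrayList<String>();
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if (pool.getType() != MemoryType.HEAP) {
        continue; // skip non-heap pools such as the code cache
      }
      MemoryUsage usage = pool.getUsage();
      if (usage == null) {
        continue; // the pool is no longer valid
      }
      // getMax() returns -1 when the maximum is undefined.
      lines.add(String.format("%-25s used=%,d max=%,d",
          pool.getName(), usage.getUsed(), usage.getMax()));
    }
    return lines;
  }

  public static void main(String[] args) {
    for (String line : describeHeapPools()) {
      System.out.println(line);
    }
  }
}
```

Running it with and without explicit sizing flags should show how small Eden is relative to the whole heap by default.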

What you want is to fill up Eden, not the heap as a whole.

For example, take this simple app:

public class Heaps {
  public static void main(String[] args) {
    Object probe = new Object();
    for (;;) {
      Object o = new Object();
      if (o.hashCode() == probe.hashCode()) {
        System.out.print(".");
      }
    }
  }
}

The probe stuff is just there to make sure the JVM can't optimize away the loop; the repeated new Object() is really what we're after. If you run this with the default JVM options, you'll get a graph like the one you saw. Objects are allocated in Eden, which is just a small fraction of the overall heap. Once Eden is full, it triggers a minor collection, which wipes out all those new objects and brings the heap usage down to its "baseline," close to 0.
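If you'd rather count collections than read them off a graph, the following sketch runs a bounded version of the same allocation loop and reports how many collections the JVM performed meanwhile. The class GcCounter and its methods are hypothetical names; the counts come from the standard GarbageCollectorMXBean API, and the number you see will depend on your heap settings and collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the original answer.
class GcCounter {

  // Sums the collection counts reported by all collectors of this JVM.
  static long totalCollections() {
    long total = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      long count = gc.getCollectionCount(); // -1 if the count is undefined
      if (count > 0) {
        total += count;
      }
    }
    return total;
  }

  // Allocates n short-lived objects; the hashCode comparison keeps the
  // allocation from being optimized away, as in the loop above.
  static long allocate(int n) {
    Object probe = new Object();
    long matches = 0;
    for (int i = 0; i < n; i++) {
      Object o = new Object();
      if (o.hashCode() == probe.hashCode()) {
        matches++;
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    long before = totalCollections();
    allocate(200_000_000);
    long after = totalCollections();
    System.out.println("collections during loop: " + (after - before));
  }
}
```

Comparing the printed count between a default run and a run with a larger young generation is one way to check whether a setting actually reduced GC frequency.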

So, how do you fill up the whole heap? Set Eden to be very big! Oracle publishes its heap tuning parameters, and the two that are relevant here are -XX:NewSize and -XX:MaxNewSize, which set the initial and maximum size of the young generation (Eden plus the Survivor spaces). When I ran the above program with -Xms9g -XX:NewSize=8g -XX:MaxNewSize=8g, I got something closer to what you expected.

[Image: heap usage graph with the enlarged Eden]

In one run, this used up nearly all of the heap, and all of the Eden space I specified; subsequent runs only took up a fraction of the Eden I specified, as you can see here. I'm not quite sure why this is.

VisualVM has a plugin called Visual GC that lets you see more details about your heap. Here's a screenshot from mine, taken at just the right moment to show Eden nearly full while the old space is pretty much empty (since none of those new Object()s in the loop survive Eden collections).

[Image: Visual GC screenshot showing Eden nearly full and the old space nearly empty]

yshavit
  • Thanks for the description... I was aware of Spaces, but what I still don't get: wouldn't it be more performant to reduce the GC frequency? In certain circumstances? – Franz Ebner Jun 10 '14 at 10:36
  • In certain circumstances, sure. That's why the people who designed the JVM provided and documented their heap tuning parameters. The defaults were picked to work against a broad set of general apps, and if you need to fine-tune them (e.g. to reduce GC frequency at the cost of more RAM), you can. – yshavit Jun 10 '14 at 10:42
  • That depends; if you reduce the frequency, you get fewer collections, each of which is a lot more CPU intensive and thus more likely to interfere with application response time. I've seen cases where the garbage collector is allowed to run only once per day, and that run can take a full hour to complete. – Gimby Jun 10 '14 at 10:44
  • @yshavit I've probably just missed something in the answer (or the question!), but why is it that "What you want is to fill up Eden, not the heap as a whole"? – matt freake Jun 10 '14 at 11:16
  • @Disco3 because the minor GC will happen when Eden is full, regardless of what the heap as a whole is doing. If Eden is 5% of your 100 MB heap (it's more by default), and all your objects are short-lived (as in my example), you'll hit a GC after 5 MB of allocations, and the other 95 MB of heap will never be used -- precisely the sort of situation the OP's question seems to describe. – yshavit Jun 10 '14 at 11:30

(I'll try to answer the question of "why" from a different angle here.)

Normally you want to balance two things with your GC settings: throughput and responsiveness.

Throughput is determined by how much time is spent doing GC overall; responsiveness is determined by the lengths of the individual GC runs. Default GC settings were determined to give you a reasonable compromise between the two.

High throughput means that measured over a long period of time the GC overhead will be less. High responsiveness on the other hand will make it more likely that a short piece of code will run in more or less the same time and won't be held up for very long by GC.
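As a rough illustration of the throughput side, GC overhead can be estimated as accumulated GC time divided by JVM uptime. The sketch below does that with the standard management beans; GcOverhead and its method names are made up for this example, and getCollectionTime() is only an approximate figure, so read the result as an estimate rather than an exact measurement.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the original answer.
class GcOverhead {

  // Sums the approximate accumulated collection time of all collectors.
  static long totalGcMillis() {
    long total = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      long millis = gc.getCollectionTime(); // -1 if the time is undefined
      if (millis > 0) {
        total += millis;
      }
    }
    return total;
  }

  // Fraction of elapsed time spent in GC; lower means higher throughput.
  static double overheadFraction(long gcMillis, long uptimeMillis) {
    if (uptimeMillis <= 0) {
      return 0.0;
    }
    return (double) gcMillis / uptimeMillis;
  }

  public static void main(String[] args) {
    long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
    long gc = totalGcMillis();
    System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
        gc, uptime, 100.0 * overheadFraction(gc, uptime));
  }
}
```

This says nothing about responsiveness: two runs with the same overall overhead can differ widely in how long each individual pause lasts.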

If you tune your GC parameters to allow the filling of all 9 GB of heap, what you'll find is that the throughput might have increased (although I'm not certain that it always will), but when the GC does eventually run, your application freezes for several seconds. This might be acceptable for a process that runs a single, long-running calculation, but not for an HTTP server and even less so for a desktop application.

The moral of the story is: you can tune your GC to do whatever you want but unless you've got a specific problem that you diagnosed (correctly), you're likely to end up worse than with the default settings.

Update: Since it seems you want high throughput but aren't bothered about pauses, your best option is to use the throughput collector (-XX:+UseParallelGC). I obviously can't give you the exact parameters; you have to tune them using this guide, observing the effects of each change you make. I probably don't need to tell you this, but my advice is to always change one parameter at a time and then check how it affects performance.
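When changing one parameter at a time, it helps to confirm that the JVM actually picked up the flag. A minimal sketch, assuming you can run a small class inside the same JVM: it prints the startup arguments and the names of the active collectors. ActiveGc, collectorNames and hasFlag are hypothetical names; on HotSpot with the throughput collector the collector names are typically "PS Scavenge" and "PS MarkSweep", but that may vary by version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original answer.
class ActiveGc {

  // Names of the collectors this JVM is running with.
  static List<String> collectorNames() {
    List<String> names = new ArrayList<String>();
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
      names.add(gc.getName());
    }
    return names;
  }

  // True if the exact flag string appears among the JVM arguments.
  static boolean hasFlag(List<String> jvmArgs, String flag) {
    return jvmArgs.contains(flag);
  }

  public static void main(String[] args) {
    List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
    System.out.println("JVM arguments: " + jvmArgs);
    System.out.println("Collectors:    " + collectorNames());
    System.out.println("UseParallelGC passed explicitly: "
        + hasFlag(jvmArgs, "-XX:+UseParallelGC"));
  }
}
```

Note that hasFlag only detects flags given on the command line; a collector selected by default will show up in the collector names but not in the arguments.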

biziclop