I'm trying to write code that has a minimal impact on resources, and I have come across GC behavior I don't understand.

  1. Apparently Strings are not cleared from the memory immediately even though they are not in use anymore.

    for(int i = 0; i < 999999999; i++)
        System.out.println("Test");
    

Memory usage graph

According to the graph, I assume that a new String object is created on every run of the loop but it is not cleared automatically on the next run of the loop. If that is the case, I would like to know why it is happening; if I'm misreading the situation, I would like to know what is really happening behind the scenes.

  2. When I add Sleep to the code I presented above the graph becomes stable, what is the reason for that?

    for(int i = 0; i < 999999999; i++){
    
        System.out.println("Test");
    
        try{
            Thread.sleep(1);
        }
        catch(Exception e){}
    }
    

Stable graph

I also have a few questions about the given case:

  • Can GC be forced to be more aggressive? I mean shortening the object lifetime, not reducing the memory allocated by the JVM.

  • If I plug in a null value to the variable will it affect the time until it's cleared by the GC?

  • What is the correct way to work with Strings when I need to run a large number of regex matches on them?

  • What is the best way to declare a String object "obsolete" so the GC will clear it?

  • Does the above situation occur because Java does an automatic intern for Strings and if so is there a way to cancel it?

Thank you very much!

  • You're completely wrong. There's only one String created in your example, no matter how many times you loop. – Kayaman Dec 08 '16 at 11:50
  • "Test" will be [interned](http://stackoverflow.com/questions/10578984/what-is-string-interning); there is only ever going to be one instance of it. – Alex K. Dec 08 '16 at 11:50
  • @Kayaman can you please be more specific? What is the reason for the high memory usage? – user3625158 Dec 08 '16 at 11:57
  • I assume the problem is that you call System.out.println("Test") in a loop 999999999 times. You are writing it to the screen, and that uses memory. The final memory used is 999999999 * 4 bit. This is equivalent to using a big array of strings. Try replacing System.out.println("Test") with, for example, String test = "test". – toto Dec 08 '16 at 12:17
  • @Kayaman, I get the same result in my project, so I created the simplest code to show the issue. Can someone tell me why the memory usage is so high? 50MB of RAM for code with only a print and a for loop does not make sense. Thank you very much! – user3625158 Dec 08 '16 at 12:18
  • The sawtooth pattern is normal. It's what you usually see when profiling any application. You're assuming that the program is only looping, but you've got a profiler connected to it, so there's a lot more going on, resulting in the additional memory usage. – Kayaman Dec 08 '16 at 12:28
  • I remember seeing an almost identical, now deleted, question ([link for high-rep users](http://stackoverflow.com/q/40981939/2711488)) of a different user two days ago. As already said there, if you tell the JVM to use a certain amount of memory, it will use that amount of memory, instead of wasting CPU time trying to use less than specified. If you don’t like it, you can assign less memory to the JVM, but of course, that implies that you will potentially slow down your application. – Holger Dec 08 '16 at 12:48

2 Answers

The garbage collector collects when it's time to collect, more or less.

  • Yes, depending on which collector you are using. There are literally dozens of VM properties you can set, some of them influencing each other.
  • I don't think it does in newer JDKs.
  • Normally you do not need to care. When it comes to GC, it's more about not loading gigabytes of data into memory. Strings are special in that they can be interned, but Strings are garbage-collected like other objects, too.
  • When there is no reference to the string or its interned instance anymore (for example, when you exit the enclosing braces).
  • No, the situation occurs because this is how Java's garbage collectors work.

I can explain the GC effects based on CMS/ParNew (since I know this combination best). It works like this: the heap is split into two regions, Young and Old (I exclude PermGen for now). Young is in turn split into 'eden' and 'copy' (or survivor) spaces. When you create a new object, it goes into Young->Eden. At some point Eden reaches its maximum size; then unused objects are removed, and objects that are still referenced are copied to Young->Copy.

As the program keeps running, Young->Copy will reach its maximum size as well. Its surviving objects are then copied into the other Young->Copy memory space.

At some point it can't do that anymore, so some objects are moved from Young->Copy to Old, depending on a copy counter (I think). The same story applies to the old heap.
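To see these regions on a running JVM, a minimal sketch like the following lists the heap memory pools through the standard `java.lang.management` API. The class name `HeapPoolsDemo` is made up for illustration, and the pool names printed depend on the collector in use (they will not literally be "Young->Eden" or "Young->Copy").

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original answer.
final class HeapPoolsDemo {

    // Builds one description line per heap pool (eden, survivor, old, ...).
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when no maximum is defined for the pool.
            lines.add(pool.getName() + ": used=" + usage.getUsed()
                    + " bytes, max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```

Running it with different collectors selected should show differently named pools for the same young/old layout.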

So what can you tune? First of all, you normally have to choose between throughput (batch processing) and low latency (web pages); the ParNew/CMS combo was used for low latency.

Since I know ParNew/CMS best, I'll explain what you can consider tuning first:

  • You can tune max memory (more memory means more to manage; the less memory an application needs to run, the better, in general)
  • You can tune the heap ratio between young and old
  • You can tune the ratios between eden and copy within young
  • You can tune when CMS starts its collection cycle

And then there's a lot more. From my personal experience, for large applications we generally used the following settings:

  • Fix min and max memory to the same size (so the heap size never changes)
  • A New-to-Old ratio of about 1:4 to 1:7
  • Disable System.gc()
  • Log a lot of GC information
  • Put an alert on OutOfMemory
  • Do a weekly analysis of the logs and decide on tuning parameters (only one parameter at a time ;)

If you really want to know what's behind everything, I'd recommend reading a book, because there's really, really, really a lot going on.

slowy

I assume that a new String object is created on every run of the loop

No, if it were creating a new String on each iteration, you would get far more garbage.

At this garbage rate, it could be the profiler that is allocating some objects.

A String literal is created only once per JVM.
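A small sketch can confirm this: every evaluation of the literal `"Test"` yields the same object, so a reference comparison (`!=`, used here on purpose) never finds a difference. The class and method names are made up for the example.

```java
// Hypothetical demo class, not part of the original question or answer.
final class LiteralIdentityDemo {

    // Every call evaluates the same string literal.
    static String literal() {
        return "Test";
    }

    // True if each evaluation of the literal returns the very same object.
    static boolean sameObjectEveryTime(int iterations) {
        String first = literal();
        for (int i = 0; i < iterations; i++) {
            if (literal() != first) { // reference comparison, not equals()
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameObjectEveryTime(1_000_000)); // prints "true"
    }
}
```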

but it is not cleared automatically on the next run of the loop

Correct. Even if it were created on each iteration, the GC only runs when it needs to; running it on each iteration would be insanely expensive.
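One way to observe this is to count collections around a loop that deliberately allocates. The sketch below uses the standard `GarbageCollectorMXBean` API; the class name, method names and iteration count are assumptions for illustration. The exact numbers depend on the collector and heap size, so no output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class, not part of the original answer.
final class GcCountDemo {

    // Sums the collection counts reported by all collectors of this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Allocates a short-lived String per iteration and returns a checksum
    // so the work cannot be optimized away entirely.
    static long allocateGarbage(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            String s = new String("Test") + i; // fresh garbage each time
            checksum += s.length();
        }
        return checksum;
    }

    public static void main(String[] args) {
        int iterations = 50_000_000;
        long before = totalCollections();
        long checksum = allocateGarbage(iterations);
        long after = totalCollections();
        System.out.println("iterations:  " + iterations);
        System.out.println("collections: " + (after - before));
        System.out.println("checksum:    " + checksum);
    }
}
```

The number of collections should come out far smaller than the number of iterations, which is the sawtooth seen in the profiler: memory fills up, then one collection frees it.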

When I add Sleep to the code I presented above the graph becomes stable, what is the reason for that?

You have dramatically slowed down your application.

Can GC be forced to be more aggressive?

You can make the Eden space much smaller, but this would slow down your application.

If I plug in a null value to the variable will it affect the time until it's cleared by the GC?

No, this rarely does anything.

What is the correct way to work with Strings when I need to run a large number of regex matches on them

Regexes create a lot of garbage. If you want to reduce allocations and speed up your application, avoid using regexes.

I recently sped up an application by 3x by replacing some commonly used regexes with direct String handling.
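As an illustration of what "direct String handling" can look like, here is a sketch with three equivalent checks for "the string consists only of ASCII digits". The digit check itself is an assumed example, not the code from the application mentioned above. `String.matches` compiles the pattern on every call, a precompiled `Pattern` avoids that but still allocates a `Matcher`, and the hand-written loop allocates nothing.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the digit check is an assumed example.
final class DigitCheckDemo {

    // Compiled once and reused across calls.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Compiles the regex again on every call.
    static boolean viaStringMatches(String s) {
        return s.matches("\\d+");
    }

    // Reuses the compiled pattern, but still creates a Matcher per call.
    static boolean viaPrecompiled(String s) {
        return DIGITS.matcher(s).matches();
    }

    // Direct String handling: no regex objects are created at all.
    static boolean viaDirect(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String input : new String[] {"12345", "12a45", ""}) {
            System.out.println("\"" + input + "\": "
                    + viaStringMatches(input) + " "
                    + viaPrecompiled(input) + " "
                    + viaDirect(input));
        }
    }
}
```

Whether the direct version is worth the extra code depends on how hot the call site is; measuring before and after is the safest guide.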

What is the best way to declare a String object "obsolete" so the GC will clear it?

Use it in a limited scope. When the scope ends so does the reference to it and it can be GCed.

Does the above situation occur because Java does an automatic intern

Once a String is interned it is not recreated.

for Strings and if so is there a way to cancel it?

Sure, force it to create a new String each time. This of course creates more garbage and is much slower (and the code is longer), but you can do it if you want.
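A minimal sketch of forcing a new String, assuming `new String("Test")` as the mechanism (the class and method names are made up): each `new` yields a distinct object with equal contents, and `intern()` maps it back to the single literal instance.

```java
// Hypothetical demo class, not part of the original answer.
final class FreshStringDemo {

    // Two explicit copies are different objects with equal contents.
    static boolean copiesAreDistinct() {
        String a = new String("Test");
        String b = new String("Test");
        return a != b && a.equals(b);
    }

    // intern() returns the canonical instance, which is the literal itself.
    static boolean internReturnsLiteral() {
        return new String("Test").intern() == "Test";
    }

    public static void main(String[] args) {
        System.out.println(copiesAreDistinct());    // prints "true"
        System.out.println(internReturnsLiteral()); // prints "true"
    }
}
```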

Peter Lawrey