24

I was reading Java Platform Performance (sadly the link seems to have disappeared from the internet since I originally posed this question) and section A.3.3 worried me.

I had been working on the assumption that a variable that dropped out of scope would no longer be considered a GC root, but this paper appears to contradict that.

Do recent JVMs, in particular Sun's 1.6.0_07 version, still have this limitation? If so, then I have a lot of code to analyse...

I ask the question because the paper is from 1999 - sometimes things change, particularly in the world of GC.


As the paper is no longer available, I'd like to paraphrase the concern. The paper implied that a variable defined inside a method would be considered a GC root until the method exited, not merely until its enclosing code block ended. Setting the variable to null was therefore necessary to allow the referenced object to be garbage collected.

This meant that a local variable defined in a conditional block in the main() method (or a similar method containing an infinite loop) would cause a one-off memory leak unless you nulled the variable just before it dropped out of scope.

The code from the chosen answer illustrates the issue well. On the version of the JVM referenced in the document, the foo object cannot be garbage collected when it drops out of scope at the end of the try block. Instead, the JVM holds on to the reference until the end of the main() method, even though nothing can use that reference any more.

This appears to be the origin of the idea that nulling a variable reference would help the garbage collector out, even if the variable was just about to drop out of scope.
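
The idiom in question can be sketched as follows. This is only an illustration, not code from the paper: the class name `NullBeforeScopeExit`, the helper `checksumOfTemporary`, and the bounded loop standing in for the infinite one are all assumptions made for the example.

```java
public class NullBeforeScopeExit {

    // Hypothetical helper: uses a temporary buffer inside a nested block,
    // then keeps running after the buffer has gone out of scope.
    static int checksumOfTemporary(int size) {
        int checksum = 0;
        int remaining = 1_000; // declared up front so no new local appears later
        {
            byte[] big = new byte[size]; // only needed inside this block
            for (int i = 0; i < big.length; i++) {
                big[i] = (byte) i;
                checksum += big[i];
            }
            // The defensive idiom: clear the reference just before it leaves
            // scope, so its stack slot cannot keep the array reachable.
            big = null;
        }
        // Stand-in for the long-running part of the method (bounded here so
        // the sketch terminates). No new local variable is declared after the
        // block, so nothing else would overwrite the slot that `big` used.
        while (remaining-- > 0) {
            Thread.yield();
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(checksumOfTemporary(4));
    }
}
```

Whether the `big = null` assignment changes anything depends on the JVM and on whether the method runs interpreted or compiled, which is exactly what the question asks.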

Bill Michell
  • Good question. I thought I knew the internals of Java pretty well, but this one has got me. – Steve McLeod Nov 07 '08 at 09:41
  • Define "appears to contradict". What does it actually say? Providing a link that doesn't work and referring vaguely to an apparent contradiction without quoting it explicitly isn't much use to anyone. – user207421 Jul 04 '13 at 23:15
  • I don't have the original document to reference, so I can't quote it verbatim :-( – Bill Michell Jul 05 '13 at 10:28
  • Then you don't have a real question that anybody can answer. – user207421 Jul 06 '13 at 01:13
  • I've tried to reconstruct the meat of the article as best I can. I got some very good answers when I first asked this back in 2008, and other answers on this site now refer to this one... – Bill Michell Jul 08 '13 at 08:36

4 Answers

6

This code should clear it up:

public class TestInvisibleObject{
  public static class PrintWhenFinalized{
    private String s;
    public PrintWhenFinalized(String s){
      System.out.println("Constructing from "+s);
      this.s = s;
    }
    protected void finalize() throws Throwable {
      System.out.println("Finalizing from "+s);
    }   
  }
  public static void main(String[] args) {
    try {
        PrintWhenFinalized foo = new PrintWhenFinalized("main");
    } catch (Exception e) {
        // whatever
    }
    while (true) {
      // Provoke garbage-collection by allocating lots of memory
      byte[] o = new byte[1024];
    } 
  }
}

On my machine (jdk1.6.0_05) it prints:

Constructing from main

Finalizing from main

So it looks like the problem has been fixed.

Note that using System.gc() instead of the loop does not cause the object to be collected for some reason.

Rasmus Faber
  • 2
    In answer to your final statement on System.gc(): Calling System.gc() does not guarantee an immediate collection - it's just a request. It may be completely ignored. – Leigh Nov 07 '08 at 11:47
  • Indeed, the -XX:+DisableExplicitGC option has been added to the JVM to turn the call into a NO-OP. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6200079 makes it clear this option will never be turned on by default, sadly... – Bill Michell Nov 07 '08 at 12:00
  • In an earlier iteration of the example I constructed a PrintWhenFinalized-object that was completely unreachable (it was created and abandoned in another method). That one was collected when calling System.gc(), so it was not because the call was ignored. – Rasmus Faber Nov 07 '08 at 12:49
  • Even more interesting: Calling System.gc() prevents the invisible object from ever being collected! Adding a System.gc() call right before the while-loop, I never see the finalizing message. – Rasmus Faber Nov 07 '08 at 12:54
  • Rasmus, that's not completely true :) The code above has to be modified a little after the catch block: `System.gc(); List l = new LinkedList(); System.out.println("starting the loop"); while (true) { byte[] o = new byte[1024]; l.add(o); }` Voila - it works :) – Lazarin Nov 07 '08 at 14:23
  • 1
    The problem with this example is that it introduces a new local variable after the old one went out of scope, so `byte[] o` reuses the storage of `PrintWhenFinalized foo` in the stack frame. This code therefore cannot demonstrate whether dangling references are still a problem, as it has none. It does demonstrate that the actual issue is rarely relevant in practice, since more often than not other variables take the older variables’ place, as in this example. Further, the optimizer does eliminate unused variables, hidden or not, for hot code. – Holger Oct 11 '21 at 13:21
3

The problem is still there. I tested it with Java 8 and could prove it.

You should note the following things:

  1. The only way to force a guaranteed garbage collection is to attempt an allocation that ends in an OutOfMemoryError, as the JVM is required to try freeing unused objects before throwing. This does not hold, however, if the requested amount is too large to ever succeed, i.e. exceeds the address space. Raising the allocation size until you get an OOME is a good strategy.

  2. The guaranteed GC described in point 1 does not guarantee finalization. The time at which finalize() methods are invoked is not specified; they might never be called at all. Adding a finalize() method to a class might therefore prevent its instances from being collected, so finalize is not a good choice for analysing GC behavior.

  3. Creating a new local variable after another local variable has gone out of scope will reuse its place in the stack frame. In the following example, object a will be collected because its place in the stack frame is taken by the local variable b. But b lasts until the end of the main method, as there is no other local variable to occupy its place.

    import java.lang.ref.*;
    
    public class Test {
        static final ReferenceQueue<Object> RQ=new ReferenceQueue<>();
        static Reference<Object> A, B;
        public static void main(String[] s) {
            {
                Object a=new Object();
                A=new PhantomReference<>(a, RQ);
            }
            {
                Object b=new Object();
                B=new PhantomReference<>(b, RQ);
            }
            forceGC();
            checkGC();
        }
    
        private static void forceGC() {
            try {
                for(int i=100000;;i+=i) {
                  byte[] b=new byte[i];
                }
            } catch(OutOfMemoryError err){ err.printStackTrace();}
        }
    
        private static void checkGC() {
            for(;;) {
                Reference<?> r=RQ.poll();
                if(r==null) break;
                if(r==A) System.out.println("Object a collected");
                if(r==B) System.out.println("Object b collected");
            }
        }
    }
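
Following point 2, a reference object can also be polled directly instead of relying on finalize(). The sketch below is an assumption-laden illustration rather than part of the test above: the class `WeakProbe` and its two helpers are hypothetical names, and it uses System.gc(), which is only a request, so the "cleared" result is best effort and not guaranteed.

```java
import java.lang.ref.WeakReference;

public class WeakProbe {

    // Creates an object in a helper frame, so no local variable slot in the
    // caller ever refers to it; only the weak reference remains.
    static WeakReference<Object> allocateAndAbandon() {
        return new WeakReference<>(new Object());
    }

    // Deterministic part: an object that is still strongly referenced (and
    // used afterwards) must not be cleared, however often GC runs.
    static boolean heldObjectSurvivesGc() {
        Object held = new Object();
        WeakReference<Object> ref = new WeakReference<>(held);
        System.gc(); // only a request; harmless if ignored
        return ref.get() == held; // this use keeps `held` live
    }

    public static void main(String[] args) throws InterruptedException {
        WeakReference<Object> abandoned = allocateAndAbandon();
        // Best effort only: System.gc() may be ignored, so poll a few times.
        for (int i = 0; i < 10 && abandoned.get() != null; i++) {
            System.gc();
            Thread.sleep(50);
        }
        System.out.println("abandoned object cleared: " + (abandoned.get() == null));
        System.out.println("held object survived: " + heldObjectSurvivesGc());
    }
}
```

Unlike a finalize() method, the weak reference does not change how the probed object itself is treated by the collector, which keeps the observation from disturbing the result.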
    
Holger
  • 100,000 bytes isn't very many these days. Are you sure you are forcing a GC here? – Bill Michell Aug 23 '13 at 16:19
  • 1
    Look closer at the loop. It’s doubling the size until it gets its OOME. Starting with a higher size raises the risk of being too high for the (unknown) configured heap space. This is not very efficient but forcing GC remains a demonstration/testing issue and should never be part of production code anyway. – Holger Aug 23 '13 at 17:22
2

The article states that:

... an efficient implementation of the JVM is unlikely to zero the reference when it goes out of scope

I think this happens because of situations like this:

public void doSomething() {  
    for(int i = 0; i < 10 ; i++) {
       String s = new String("boo");
       System.out.println(s);
    }
}

Here, the "efficient JVM" reuses the same reference slot for each declaration of String s, but there will be 10 new Strings on the heap if the GC doesn't kick in.

In the article's example, I think the reference to foo stays on the stack because the "efficient JVM" assumes it is very likely that another foo object will be created and, if so, it will reuse the same reference. Thoughts?

public void run() {
    try {
        Object foo = new Object();
        foo.toString(); // stand-in for real work with foo
    } catch (Exception e) {
        // whatever
    }
    while (true) {
        // do stuff; loop forever
    }
}

I also ran the following test under a profiler:

public class A {

    public static void main(String[] args) {
        A a = new A();  
        a.test4();
    }

    public void test1() {  
        for(int i = 0; i < 10 ; i++) {
           B b = new B();
           System.out.println(b.toString());
        }
        System.out.println("b is collected");
    }

    public void test2() {
        try {
            B b = new B();
            System.out.println(b.toString());
        } catch (Exception e) {
        }
        System.out.println("b is invisible");
    }

    public void test3() {
        if (true) {
            B b = new B();
            System.out.println(b.toString());
        }
        System.out.println("b is invisible");
    }

    public void test4() {
        int i = 0;
        while (i < 10) {
            B b = new B();
            System.out.println(b.toString());
            i++;
        }
        System.out.println("b is collected");
    }

    public A() {
    }

    class B {
        public B() {
        }

        @Override
        public String toString() {
            return "I'm B.";
        }
    }
}

and came to these conclusions:

test1 -> b is collected

test2 -> b is invisible

test3 -> b is invisible

test4 -> b is collected

... so I think that, in loops, the JVM doesn't create invisible variables when the loop ends because it's unlikely they will be declared again outside the loop.

Any Thoughts??

bruno conde
1

Would you really have that much code to analyse? Basically I can only see this being a significant problem for very long-running methods - which are typically just the ones at the top of each thread's stack.

I wouldn't be at all surprised if it's unfixed at the moment, but I don't think it's likely to be as significant as you seem to fear.
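
For those few long-running methods, one possible way to sidestep the issue is to move the short-lived work into a helper method, so that its locals live in a frame that is popped before the loop starts. This is a minimal sketch under my own assumptions: the names `ExtractSetup`, `loadConfig` and `serve` are hypothetical, and the loop is bounded only so that the example terminates.

```java
public class ExtractSetup {

    // Hypothetical setup step: the large temporary array is referenced only
    // from this method's frame, which is gone once the method returns.
    static int loadConfig() {
        byte[] scratch = new byte[1 << 20];
        scratch[0] = 42;
        return scratch[0];
    }

    // Stand-in for the long-running loop at the top of a thread's stack.
    static long serve(int config, int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += config;
        }
        return total;
    }

    public static void main(String[] args) {
        // main() never holds a reference to the scratch array at all.
        int config = loadConfig();
        System.out.println(serve(config, 3));
    }
}
```

With this shape, there is no local variable in the long-running method whose stale slot could keep the temporary reachable, so no nulling is needed.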

Jon Skeet