31

I've been looking into a bug in my code that seems to be caused by some "ugly" finalizer code. The code looks roughly like this:

public class A {
   public B b = new B();
   @Override public void finalize() {
     b.close();
   }
}

public class B {
   public void close() { /* do clean up our resources. */ }
   public void doSomething() { /* do something that requires us not to be closed */ } 
}

void main() {
   A a = new A();
   B b = a.b;
   for(/*lots of time*/) {
     b.doSomething();
   }
}

What I think is happening is that a is detected as having no references after the second line of main(), and is then GC'd and finalized by the finalizer thread while the for loop is still running and using b, even though a is still "in scope".

Is this plausible? Is Java allowed to GC an object before it goes out of scope?

Note: I know that doing anything inside finalizers is bad. This is code I've inherited and am intending to fix - the question is whether I'm understanding the root issue correctly. If this is impossible then something more subtle must be the root of my bug.

Hearen
  • 7,420
  • 4
  • 53
  • 63
Michael Anderson
  • 70,661
  • 7
  • 134
  • 187
  • From the little I know, if a garbage collector GCs an in-scope object then there's a *serious* bug with it, because it's collecting objects that aren't garbage. So I'd guess it's something else, but I know comparatively little about these kinds of things, so it's entirely possible that I'm missing something... – awksp Jun 24 '14 at 00:55
  • You have named these classes Outer and Inner. Is Inner truly an inner class of Outer? Or are you simply indicating that Outer has a reference to Inner (which is what the code example seems to show)? This is an important distinction, because if it is a non-static inner class, Inner would have an implicit reference to Outer. – Brett Okken Jun 24 '14 at 00:57
  • The actual situation is more like a decorator - but the naming there is purely for containment - it's not an "inner class" of Outer. Probably should have used a different name. – Michael Anderson Jun 24 '14 at 00:59
  • It seems possible that an optimizing compiler can "tell" that `outer` is no longer needed, but I don't know what the exact language rules are. If you add `if (2+2==5) System.out.println(outer);` to the end of `main`, does it still exhibit the same behavior? Or you may need to change `2+2==5` to some other condition that you know must be false but the compiler may not be able to figure out. This may help rule out some possibilities. – ajb Jun 24 '14 at 01:00
  • @user3580294 While the object is in scope - there are no more references to it. So it _may_ be allowed for the GC to collect it. I'd not be surprised if the spec rules either way. – Michael Anderson Jun 24 '14 at 01:01
  • Sorry, seems I'm confusing some concepts here... Wouldn't an object be out of scope when there are no more references to it, by definition? – awksp Jun 24 '14 at 01:03
  • In your real situation, does `Inner` have a reference to `Outer`? – Sotirios Delimanolis Jun 24 '14 at 01:07
  • @SotiriosDelimanolis No Inner does not have a reference to Outer, only Outer has a reference to Inner. – Michael Anderson Jun 24 '14 at 01:09
  • I've renamed the classes `Outer` to `A` and `Inner` to `B`. Since the way they were named suggested that `B` was an inner class of `A`, which was not my intent. – Michael Anderson Jun 24 '14 at 01:42
  • 4
    A bit of terminology clarification: **scope** is a lexical construct of the language. It is related to, but ultimately distinct from, the **lifetime** of an object. The lifetime is governed by **reachability**: if no code path can reach the object then it is eligible to be collected. Often an object that is referenced by a variable that is in the current scope is considered reachable, but it doesn't have to be. – Daniel Pryden Jun 24 '14 at 01:57
  • @DanielPryden Ah, that makes sense. Thanks for clearing things up! – awksp Jun 24 '14 at 09:17
  • 4
    BTW: with Java 9 there is an explicit method to keep objects reachable until the end of a scope: http://download.java.net/java/jdk9/docs/api/java/lang/ref/Reference.html#reachabilityFence-java.lang.Object- `Reference.reachabilityFence(Object)` – eckes Nov 20 '16 at 21:10
  • 2
    @awksp no, the lexical scope of the language/compiler has nothing to do with actual reachability. An object might not be used anymore even while it is in scope and can therefore be collected; on the other hand, objects typically stay reachable much longer than their scope (by sticking around in stack slots). But the latter is an implementation detail that one should not depend on (inlining and EA will change the reachability effects of code). – eckes Nov 20 '16 at 21:12

2 Answers

36

Can Java finalize an object when it is still in scope?

Yes.

However, I'm being pedantic here. Scope is a language concept that determines the validity of names. Whether an object can be garbage collected (and therefore finalized) depends on whether it is reachable.

The answer from ajb almost had it (+1) by citing a significant passage from the JLS. However I don't think it's directly applicable to the situation. JLS §12.6.1 also says:

A reachable object is any object that can be accessed in any potential continuing computation from any live thread.

Now consider this applied to the following code:

class A {
    @Override protected void finalize() {
        System.out.println(this + " was finalized!");
    }

    public static void main(String[] args) {
        A a = new A();
        System.out.println("Created " + a);
        for (int i = 0; i < 1_000_000_000; i++) {
            if (i % 1_000_000 == 0)
                System.gc();
        }
        // System.out.println(a + " was still alive.");
    }
}

On JDK 8 GA, this will finalize a every single time. If you uncomment the println at the end, a will never be finalized.

With the println commented out, one can see how the reachability rule applies. When the code reaches the loop, there is no possible way that the thread can have any access to a. Thus it is unreachable and is therefore subject to finalization and garbage collection.

Note that the name a is still in scope because one can use a anywhere within the enclosing block -- in this case the main method body -- from its declaration to the end of the block. The exact scope rules are covered in JLS §6.3. But really, as you can see, scope has nothing to do with reachability or garbage collection.

To prevent the object from being garbage collected, you can store a reference to it in a static field, or if you don't want to do that, you can keep it reachable by using it later on in the same method after the time-consuming loop. It should be sufficient to call an innocuous method like toString on it.
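
On Java 9 and later, Reference.reachabilityFence gives a stronger guarantee than an innocuous method call. The following is a minimal sketch rather than the asker's real code: the names FenceSketch, run, and the closed flag are made up for illustration, and A and B are simplified stand-ins for the classes in the question.

```java
import java.lang.ref.Reference;

class FenceSketch {
    /** Hypothetical stand-in for B: usable until close() is called. */
    static final class B {
        private volatile boolean closed;
        void close() { closed = true; }
        boolean doSomething() { return !closed; } // true while b is still open
    }

    /** Hypothetical stand-in for A: its finalizer closes the B it owns. */
    static final class A {
        final B b = new B();
        @SuppressWarnings({"deprecation", "removal"}) // finalize() is deprecated in newer JDKs
        @Override protected void finalize() { b.close(); }
    }

    /** Returns how many calls to doSomething() found b still open. */
    static int run(int iterations) {
        A a = new A();
        B b = a.b;
        int stillOpen = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                if (b.doSomething()) stillOpen++;
            }
        } finally {
            // Java 9+: a stays strongly reachable at least until this call,
            // so its finalizer cannot close b while the loop is running.
            Reference.reachabilityFence(a);
        }
        return stillOpen;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000));
    }
}
```

The try/finally shape follows the pattern suggested in the reachabilityFence documentation, so the fence is still reached if the loop body throws.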

Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
  • 3
    The pedantry aligns exactly with what I was asking - your usage of scope matches mine, and my "`a` is getting detected as having no references" matches your "reachable". – Michael Anderson Jun 24 '14 at 07:21
  • 6
    OK, if pedantry matches, so much the better. It pays to be pedantic in this business. :-) Some of the commenters (with the notable exception of Daniel Pryden) mix up scope with reachability, or assume that being in-scope implies reachability. I've even heard people familiar with this phenomenon say that a local variable goes out of scope after its last use in a block, which is certainly not true. – Stuart Marks Jun 24 '14 at 07:45
  • "Some of the commenters" would primarily be me, I think. Just want to say thank you for clearing things up for me; seems I wasn't pedantic enough in keeping those concepts distinct. Great answer, too! Looks like I have a long way in understanding everything that's going on... – awksp Jun 24 '14 at 09:19
  • +1, this is a great answer. I was on my phone when I commented earlier, or else I would have written a more comprehensive answer. Glad to see it got written anyway. :-) Related reading (about .NET, not Java, although the result is similar): [When does an object become eligible for garbage collection?](http://blogs.msdn.com/b/oldnewthing/archive/2010/08/10/10048149.aspx) – Daniel Pryden Jun 24 '14 at 20:57
  • 1
    @DanielPryden Thanks. Actually I hadn't seen your comment (it was hidden beneath "show more comments") at the time I wrote my answer. But you know what they say, you schmooze, you lose! :-) – Stuart Marks Jun 25 '14 at 00:30
  • 3
    The `toString()` statement raises my doubts. Since `toString()` implementations are usually side-effect-free, the call can be removed entirely if its result is not used, and JLS §12.6.1 suggests that such optimizations are allowed to cause earlier collection. Note further that, in the absence of synchronization, even using the result as in the example's print statement does not guarantee a longer lifetime. Without *happens-before* relationships, the finalizer thread might see a different ordering of the main thread's actions, i.e. completion of the `toString()` call before the loop's completion. – Holger Sep 14 '16 at 16:42
  • 4
    @Holger Strictly speaking you're correct, `toString()` doesn't provide full guarantees preventing GC. However, it should be good enough for most purposes. In Java 9 there is a new API that does provide such a guarantee: [`Reference.reachabilityFence`](http://download.java.net/java/jdk9/docs/api/java/lang/ref/Reference.html#reachabilityFence-java.lang.Object-). – Stuart Marks Jul 05 '17 at 21:59
  • "But really, as you can see, scope has nothing to do with reachability or garbage collection." One would hope/assume that, at least, a variable that goes out of scope becomes unreachable, and it seems regrettable that even this is not guaranteed as late as Java 10 (see [JDK-8175883](https://bugs.openjdk.java.net/browse/JDK-8175883)). – RFST May 20 '18 at 03:11
  • 2
    @RFST it never was and still isn’t guaranteed. All that the specific bug report addressed, are the invisible variables of a for-each loop, holding iterators which may prevent the collection from being garbage collected. As this answer demonstrates, this usually does not apply to optimized code. Actually, I consider it strange, that now `javac` inserts extra instructions into every code, to address that single, very specific corner case, when the general behavior (dangling local variables still may exist) didn’t change. – Holger Sep 18 '18 at 06:15
11

JLS §12.6.1:

Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.

So yes, I think it's allowable for a compiler to add hidden code to set a to null, thus allowing it to be garbage-collected. If this is what's happening, you may not be able to tell from the bytecode (see @user2357112's comment).
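
To make the quoted passage concrete, here is a hand-written illustration of the kind of transformation it permits. This is only a sketch: EarlyNullingSketch and both method names are hypothetical, and a real JIT does not rewrite source code. It simply stops treating a as live after its last use, which has the same effect on reachability as the explicit assignment shown in the second method.

```java
class EarlyNullingSketch {
    /** Simplified stand-in for B; counts calls so the result can be checked. */
    static final class B {
        int calls;
        void doSomething() { calls++; }
    }

    /** Simplified stand-in for A (no finalizer here, to keep the sketch deterministic). */
    static final class A {
        final B b = new B();
    }

    /** The code as the programmer wrote it: a is in scope for the whole method. */
    static int asWritten(int iterations) {
        A a = new A();
        B b = a.b; // last use of a
        for (int i = 0; i < iterations; i++) {
            b.doSomething();
        }
        return b.calls;
    }

    /** What JLS 12.6.1 allows the runtime to behave like. */
    static int asIfOptimized(int iterations) {
        A a = new A();
        B b = a.b;
        a = null; // stands in for the optimizer dropping the dead reference
        for (int i = 0; i < iterations; i++) {
            b.doSomething(); // the A instance may already be finalized by now
        }
        return b.calls;
    }

    public static void main(String[] args) {
        System.out.println(asWritten(5) + " " + asIfOptimized(5));
    }
}
```

Both methods compute the same result, which is why the transformation is legal. The only observable difference would come from a finalizer on A, as in the question.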

Possible (ugly) workaround: Add public static boolean alwaysFalse = false; to the main class or some other classes, and then at the end of main(), add if (alwaysFalse) System.out.println(a); or something else that references a. I don't think an optimizer can ever determine with certainty that alwaysFalse is never set (since some class could always use reflection to set it); therefore, it won't be able to tell that a is no longer needed. At the least, this kind of "workaround" could be used to determine whether this is indeed the problem.
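
If that diagnostic confirms early finalization is the cause, one way to do the cleanup the question mentions is to stop relying on finalize() and close the resource explicitly. The sketch below assumes A can be changed to implement AutoCloseable. The names CloseableSketch, run, and isClosed are made up for illustration, and the real classes will certainly differ.

```java
class CloseableSketch {
    /** Simplified stand-in for B: throws if used after close(). */
    static final class B {
        private boolean closed;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
        void doSomething() {
            if (closed) throw new IllegalStateException("B used after close()");
        }
    }

    /** A owns a B and releases it in close() instead of in a finalizer. */
    static final class A implements AutoCloseable {
        final B b = new B();
        @Override public void close() { b.close(); }
    }

    /** Runs the loop and returns b so the caller can inspect its state afterwards. */
    static B run(int iterations) {
        try (A a = new A()) {
            B b = a.b;
            for (int i = 0; i < iterations; i++) {
                b.doSomething();
            }
            return b;
        } // a.close() is called here, after the loop, on the thread that ran it
    }

    public static void main(String[] args) {
        System.out.println("closed after loop: " + run(1_000).isClosed());
    }
}
```

With try-with-resources the cleanup happens at a known point in program order, so it no longer depends on when the collector decides that a is unreachable.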

ajb
  • 31,309
  • 3
  • 58
  • 84
  • 4
    You can't tell from the bytecode, because the transformation likely happens in the JIT optimization. – user2357112 Jun 24 '14 at 01:09
  • 2
    side note: mind that it is not `public static **final** boolean alwaysFalse` otherwise it would be a _compile-time constant_ and therefore inlined at compile time (the compiler would emit no code for the `if` block, per the JLS) – ignis Jul 06 '15 at 07:15
  • 3
    An optimizer doesn’t need to recognize that `alwaysFalse` is *never* set, all it needs to know is that it is not set between the construction of `A` and the test, in program order. Since the variable is not declared `volatile`, the optimized code is not required to notice updates made by other threads, whether directly or via Reflection. But even if it was declared `volatile` we should never write code settling on an assumed inability of the optimizer. – Holger Sep 14 '16 at 16:19
  • 1
    @Holger yeah, that's why I included the words "possible" and "ugly" in my answer... – ajb Sep 15 '16 at 03:17
  • 5
    It is reliable to use `Reference.reachabilityFence(a)` in Java 9 at the end of main, but it is still ugly if something like this is needed. – eckes Nov 20 '16 at 21:16