Here is the code before my question. First there is an interface:

public interface SomeAction {
    public void doAction();
}

Then there are two classes:

public class SomeSubscriber {
    public static int Count;

    public SomeSubscriber(SomePublisher publisher) {
        publisher.subscribe(this);
    }

    public SomeAction getAction() {
        final SomeSubscriber me = this;
        class Action implements SomeAction {

            @Override
            public void doAction() {
               me.doSomething();
            }
        }

        return new Action();
    }

    // specify what to do just before it is garbage collected
    @Override
    protected void finalize() throws Throwable {
        SomeSubscriber.Count++;
        System.out.format("SomeSubscriber count: %s %n", SomeSubscriber.Count);
    }

    private void doSomething() {
        // TODO: something
    }
}

The second class:

import java.util.ArrayList;
import java.util.List;

public class SomePublisher {
    private List<SomeAction> actions = new ArrayList<SomeAction>();

    public void subscribe(SomeSubscriber subscriber) {
        actions.add(subscriber.getAction());
    }
}

This is the code that is used to test the two classes:

public class Test {
    // Output: "the answer is: 0" on the 1st run after compilation (running attemptCleanUp() first); stays 0 on repeated runs
    public static void main(String[] args) {
        System.out.println("am in main()");
        SomePublisher publisher = new SomePublisher();
        for (int i = 0; i < 10; i++) {
            SomeSubscriber subscriber = new SomeSubscriber(publisher);
            subscriber = null;
        }
        attemptCleanUp();
    }

    // Output: "the answer is: 0" on the 1st run after compilation (running attemptCleanUp() first); rises to 10, 20, 30 ... on repeated runs
    public static void answerIsNot0() {
        System.out.println("am in answerIsNot0()");
        SomePublisher publisher = new SomePublisher();
        for (int i = 0; i < 10; i++) {
            SomeSubscriber subscriber = new SomeSubscriber(publisher);
            subscriber = null;
        }
        attemptCleanUp();
    }

    private static void attemptCleanUp() {
        threadMessage("Before the gc attempt, the answer is: " + SomeSubscriber.Count);
        System.gc();
        System.runFinalization();
        threadMessage("After the gc attempt, the answer is: " + SomeSubscriber.Count);
    }

    private static void threadMessage(String message) {
        String threadName = Thread.currentThread().getName();
        System.out.format("%s: %s%n", threadName, message);
    }
}

The printout from main() shows SomeSubscriber.Count rising from 1 to 10, yet the final line reports "the answer is: 0", as shown below:

am in main()
main: Before the gc attempt, the answer is: 0
SomeSubscriber count: 1 
SomeSubscriber count: 2 
SomeSubscriber count: 3 
SomeSubscriber count: 4 
SomeSubscriber count: 5 
SomeSubscriber count: 6 
SomeSubscriber count: 7 
SomeSubscriber count: 8 
SomeSubscriber count: 9 
SomeSubscriber count: 10 
main: After the gc attempt, the answer is: 0

For answerIsNot0(), by contrast, the number in "the answer is: <num>" always matches the last number in the "SomeSubscriber count:" series.

My questions are: First, do the non-zero values show that garbage collection indeed happened 10 times? This contradicts the notion that the 10 subscribers are all still referenced by the instances of the local class Action in the publisher instance, and are therefore not subject to garbage collection. Second, why does SomeSubscriber.Count read 0 at the final statement in main(), but not in answerIsNot0()? In other words, why does the same code have a different effect on SomeSubscriber.Count when placed in main() as opposed to answerIsNot0()?

Treefish Zhang

2 Answers

First, there is a significant difference between garbage collection and finalization. Both may have implementation-dependent behavior, which is intentionally unspecified, but there is at least a guarantee that the virtual machine will perform garbage collection in an attempt to reclaim memory before an OutOfMemoryError is thrown.

Finalizers, on the other hand, are not guaranteed to run at all. Technically, finalization can only run after the garbage collector has determined that the objects are unreachable and has enqueued them.

This implies that finalize() methods are not suitable for telling you whether the objects would get garbage collected under normal circumstances, i.e. if the class did not have a custom finalize() method.

Still, you seem to have hit on something with your test that raises the issue of reachability:

JLS, §12.6.1. Implementing Finalization

… A reachable object is any object that can be accessed in any potential continuing computation from any live thread.

It should be obvious that if there is no variable holding a reference to an object, no “potential continuing computation” can access it. That’s the easiest way to check this. But in your example, no potential continuing computation can access the publisher object either, because no code accesses the variable after the loop. This is harder to detect and therefore doesn’t happen until the code gets optimized by the JVM anyway. §12.6.1 states explicitly:

Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.

See also “Can java finalize an object when it is still in scope?”

This seems to be your issue. In a short-running program that doesn’t get maximally optimized, some unused objects referenced by local variables may not get reclaimed immediately, whereas the same code might reclaim them earlier once it has been optimized more deeply after multiple runs. It’s not so important whether it is the main method or another method; what matters is how often it is invoked or how long it runs (to be considered a hot spot), or more precisely, to what degree it gets optimized during the JVM’s lifetime.
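
If the test is meant to observe the objects while the publisher variable is still in scope, one option (Java 9 or later) is java.lang.ref.Reference.reachabilityFence, which keeps an object strongly reachable up to the point of the call regardless of optimization. The following is only a sketch under that assumption: the class name ReachabilitySketch and the method stillReachableDuringGc are hypothetical, and a plain Object stands in for SomePublisher.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

public class ReachabilitySketch {

    // Returns whether the object was still present after a gc request
    // that happens before the reachability fence.
    static boolean stillReachableDuringGc() {
        Object publisher = new Object();            // stand-in for SomePublisher (assumption)
        WeakReference<Object> probe = new WeakReference<>(publisher);

        System.gc();                                // only a request, as in the original test
        boolean alive = probe.get() != null;

        // Without this call, an optimizing JVM may treat 'publisher' as
        // unreachable right after its last real use above.
        Reference.reachabilityFence(publisher);
        return alive;
    }

    public static void main(String[] args) {
        System.out.println("alive during gc: " + stillReachableDuringGc());
    }
}
```

Removing the fence does not guarantee that the object is collected earlier; it only permits the JVM to do so, which is why the outcome of the original test varies with the degree of optimization.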

Another issue with your code is related to the following:

JLS, §12.6. Finalization of Class Instances

The Java programming language does not specify which thread will invoke the finalizer for any given object.

It is important to note that many finalizer threads may be active (this is sometimes needed on large shared memory multiprocessors), and that if a large connected data structure becomes garbage, all of the finalize methods for every object in that data structure could be invoked at the same time, each finalizer invocation running in a different thread.

The Java programming language imposes no ordering on finalize method calls. Finalizers may be called in any order, or even concurrently.

As an example, if a circularly linked group of unfinalized objects becomes unreachable (or finalizer-reachable), then all the objects may become finalizable together. Eventually, the finalizers for these objects may be invoked, in any order, or even concurrently using multiple threads. If the automatic storage manager later finds that the objects are unreachable, then their storage can be reclaimed.

Since you are not taking any measures to ensure thread-safe access to the variable SomeSubscriber.Count, a lot of inconsistencies can show up. Seeing a zero from the main thread even when it has been changed in a finalizer thread is only one of them. You’ve been lucky to see ascending numbers from one to ten; apparently there was only one finalizer thread in your JRE. Due to the lack of thread safety, you could have seen numbers in arbitrary order, but also some numbers occurring multiple times and others missing, not necessarily arriving at ten after the finalization of ten objects at all.
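
A minimal sketch of the thread-safety point, assuming the counter is changed from a plain static int to a java.util.concurrent.atomic.AtomicInteger. The class name FinalizeCounterSketch and the helper incrementFromThreads are hypothetical, and ordinary threads stand in for finalizer threads, since finalizer scheduling cannot be controlled from application code.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FinalizeCounterSketch {

    // Starts several threads that each increment a shared counter,
    // waits for all of them, and returns the final value.
    static int incrementFromThreads(int threads, int perThread) {
        AtomicInteger count = new AtomicInteger();  // replaces the plain static int
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    count.incrementAndGet();        // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();                      // join makes the workers' updates visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(incrementFromThreads(4, 1000)); // prints 4000
    }
}
```

With a plain int and Count++, concurrent increments could be lost. Note also that the original test has no equivalent of join() between the finalizer thread and the main thread, so even an atomic counter may still be read before all finalizers have run.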

Holger
  • Thank you for explaining to this beginner. If a lot of inconsistencies can arise, why does running main() invariably give 0? Why have I never seen a case where the number is something other than 0 with main()? By '[I]t’s not so important whether it is the main method or another method,...to which degree it will get optimized during the JVM’s lifetime.', do you mean that main() has access to `better optimization` (i.e., some extra cleaning up)? Is that a known fact? – Treefish Zhang Feb 14 '17 at 02:23
  • Is the failure to ensure thread-safe access due to the fact that SomeSubscriber.Count was [declared as static](https://www.ibm.com/support/knowledgecenter/en/ssw_aix_61/com.ibm.aix.genprogc/writing_reentrant_thread_safe_code.htm)? – Treefish Zhang Feb 14 '17 at 02:32
  • No, this has nothing to do with `static`. The problem is that it is accessed by multiple threads; you declared it `static` because you *want* to access it from multiple objects. You have to either guard every access to it with `synchronized` blocks or use an `AtomicInteger` (declaring it `volatile` is not sufficient here, as you want to *increment* it). The results depend on the environment, e.g. I don’t get the same result, but it is perfectly plausible that `attemptCleanUp()` reads `SomeSubscriber.Count` before the `finalize()` methods are executed in the other thread. – Holger Feb 14 '17 at 11:01
  • And no, the `main` method does not get any special treatment. For *typical use cases*, the results are usually the opposite: since the `main` method is invoked only once, it doesn’t get optimized at all, but even if it runs long enough to get optimized, it only benefits if the single execution gets transferred to the new code (“On Stack Replacement”). This isn’t as efficient as optimizing an often invoked method, where it is sufficient to let the next invocation end up in fully optimized code. But as already said, that depends on too many environmental factors to predict an outcome. – Holger Feb 14 '17 at 11:19
  • I am reading up on things like stacks and frames to process your answer. But in the meantime, can I take home the message that my perceived pattern (i.e. main() always produced 0 at the end and answerIsNot0 did not) is not worth contemplating because of "too many environmental factors to predict an outcome"? Although I was just running that one program: what other programs were in the 'environment'? – Treefish Zhang Feb 15 '17 at 15:38
  • Yes, you should only keep thinking about it if you like puzzles that are unlikely to be helpful in any other case. Some factors are the particular JVM version, which might have some changes in the implementation; subtle timing aspects influencing the built-in profiler (driving optimization decisions); the options provided to select a garbage collection algorithm and specific gc options; thread scheduling and the number of cores (determining the number of gc and finalizer threads and their execution timing); any other running application that may compete for CPU cores; the amount of available memory; etc. – Holger Feb 15 '17 at 17:10

The local class Action keeps a reference to SomeSubscriber (so you can call doSomething() on it). The instances of Action are reachable through SomePublisher. So the instances of SomeSubscriber are still reachable at the end of the main method.

That makes them not eligible for garbage collection. So they are not collected.
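
The reference chain publisher → Action → subscriber can be made visible without involving the garbage collector at all. This is a simplified sketch, not the code from the question: the names CaptureSketch, Publisher, Subscriber, fireAll, calls and run are hypothetical, and the local class uses the implicit reference to its enclosing instance instead of the explicit `me` variable.

```java
import java.util.ArrayList;
import java.util.List;

public class CaptureSketch {
    interface SomeAction { void doAction(); }

    static class Publisher {
        private final List<SomeAction> actions = new ArrayList<>();

        void subscribe(Subscriber subscriber) { actions.add(subscriber.getAction()); }

        // Hypothetical helper (not in the question's code): invoke every stored action.
        void fireAll() { for (SomeAction action : actions) action.doAction(); }
    }

    static class Subscriber {
        static int calls;                     // single-threaded here, so a plain int is enough

        Subscriber(Publisher publisher) { publisher.subscribe(this); }

        SomeAction getAction() {
            class Action implements SomeAction {
                @Override
                public void doAction() { doSomething(); } // goes through the implicit Subscriber.this
            }
            return new Action();
        }

        private void doSomething() { calls++; }
    }

    static int run(int n) {
        Subscriber.calls = 0;
        Publisher publisher = new Publisher();
        for (int i = 0; i < n; i++) {
            new Subscriber(publisher);        // no local variable keeps the subscriber
        }
        publisher.fireAll();                  // still reaches every subscriber via its Action
        return Subscriber.calls;
    }

    public static void main(String[] args) {
        System.out.println(run(10));          // prints 10
    }
}
```

Every subscriber is still called even though no variable refers to it, so each one remains reachable for as long as the publisher itself is reachable.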

For the two different results, I assume you ran both methods one after another. The answer 10 you are getting comes from the 10 instances collected from the first version that ran. (As soon as the method ended, SomePublisher was out of scope and could be collected, and with it all references to Action and SomeSubscriber.)

Additionally System.gc(); is just a hint that garbage collection should run, there is no guarantee that everything is collected after the method has run.
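
Because System.gc() is only a request, a test that wants to see whether an object was collected usually has to poll instead of checking once. One possible sketch uses a WeakReference and a timeout; the class name GcHintSketch and the helper collectedWithin are hypothetical, and the result for an unreferenced object still depends on the JVM and its settings.

```java
import java.lang.ref.WeakReference;

public class GcHintSketch {

    // Repeatedly requests a collection until the referent is cleared
    // or the timeout elapses. Returns true if the referent was cleared.
    static boolean collectedWithin(WeakReference<?> ref, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (ref.get() != null && System.nanoTime() < deadline) {
            System.gc();                          // a hint, not a command
            try {
                Thread.sleep(10);                 // give the collector a chance to run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        WeakReference<Object> ref = new WeakReference<>(new Object());
        // Often true on common JVMs, but not guaranteed.
        System.out.println("collected: " + collectedWithin(ref, 1000));
    }
}
```

A false result after the timeout does not prove that the object is still reachable; it only shows that the JVM had not cleared the reference by then.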

k5_
  • I just want to mention that garbage collection is not guaranteed after a call to System.gc(); so you might get different behaviour every time – Méhdi Màick Feb 11 '17 at 18:48
  • @Méhdi Màick: Yes, running both methods produces different behavior; however, the results are predictable: running either main() or answerIsNot0() follows two rules: 1. it does not collect the garbage from the current run, 2. it always collects from the previous run. – Treefish Zhang Feb 11 '17 at 20:57
  • @TreefishZhang It can only collect when `SomePublisher` gets out of scope. That's why it never collects on the "current run" but always for the "previous run" – k5_ Feb 11 '17 at 21:03
  • @k5_: I indeed run each method after fresh compilation when `SomeSubscriber.Count` is always '0' , and the difference is reproducible: that `answerIsNot0()` retains its `SomeSubscriber.Count` till the end, whereas `main()` always has `0` for `SomeSubscriber.Count` at the end. – Treefish Zhang Feb 11 '17 at 21:06
  • @TreefishZhang how do you execute `answerIsNot0()`? You need some `main(String[])` to trigger it. – k5_ Feb 11 '17 at 21:12
  • @k5_: Indeed I call answerIsNot0() directly, I guess because it is a static method. – Treefish Zhang Feb 11 '17 at 21:20