Is GC smart enough to remove objects that are referenced but no longer used?

Question

Let's say I have an object called "master" which owns 100 objects called "slave0" through "slave99" respectively. (This is not an array. I have 100 fields in my "master" class called slave0 to slave99 respectively.) Now, let's say my program first reads in a file which contains the serialized version of a "master" object. But let's say my program never uses objects slave50 through slave99. What will happen? (My guess is that the Java program will first read all 100 slave objects as part of the deserialization process, and only after reading all 100 slave objects in might it choose to do a GC, at which point objects slave50 through slave99 will get removed by the GC and the memory reclaimed. Is this right? NOTE: Object "master" is still being used, so technically, objects slave50 through slave99 are still being referenced by the parent object, master, and the parent object master is still being actively used.)

Follow-up question

So let's say my guess above is correct regarding how the GC works; what then happens if my long-running program spends, say, a few minutes processing objects slave0 through slave50, but then gets into another final (long-running) procedure named "X" that ONLY processes objects slave0 through slave25? Would the GC then realize that, even though objects slave25 through slave50 are still being referenced by the parent object master, and even though object master is still being used, it can get rid of objects slave25 through slave50, since no one is ever going to use them from "procedure X" onwards?

As long as there is a reference, your object will not be picked up by GC. You can however set it to `null`. — Glains, Aug 01 '18 at 14:47
The GC doesn't analyze your program's logic. It only checks whether objects are reachable through hard references from anything in scope in any running thread of your program. — Ted Hopp, Aug 01 '18 at 14:51
The JVM can't know that an object which is not currently used might not be used in the future. — Peter Lawrey, Aug 01 '18 at 14:52
@Glains So: I can set it to null (and I presume if the object goes out of scope, e.g. it's only declared within a method and the method finishes and never returns the object, that's the same thing as explicitly setting the object to null, right?). If that's true, then my current practice of declaring objects "final" (because I think that it's a safer way to code) is NOT always the most performant, since I should (if I want an object to be GCed within a method) set it to null when I'm done with it, right? (I never heard about that advice in Java, hence the question.) — Jonathan Sylvester, Aug 01 '18 at 15:14
@JonathanSylvester Not really, the GC normally takes care of everything. If a local variable goes out of scope, the object it referred to is eligible for GC. The same goes for objects with no more references. Setting a reference to null is just a possibility to explicitly make an object eligible for GC. You can have a final variable in a class: if an object of that class is not referenced anymore, the value of that variable will get deleted as well. — Glains, Aug 01 '18 at 16:57

Karol Dowbecki · Answer 1 · 2018-08-01T14:55:50.670

In Java a GC won't remove a live object. In the logic of a tracing GC, an object is considered live when it's reachable from an active thread (unless we are considering more exotic reference types, e.g. WeakReference). In your simplistic example all fields in the master object are reachable, since the master object itself is reachable from the main thread.
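The reachability rule can be checked with a small experiment. The following is only a minimal sketch and not part of the original answer: the classes `Master` and `Slave` and the two fields are hypothetical stand-ins for the 100-field class from the question, and the `WeakReference` serves purely as a probe that does not keep its target alive. Note that `System.gc()` is only a request to the JVM.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    // Hypothetical stand-ins for the classes described in the question.
    static class Slave {
        final byte[] payload = new byte[1024];
    }

    static class Master {
        final Slave slave0 = new Slave();
        final Slave slave99 = new Slave(); // never used by the program
    }

    // A static field keeps the master reachable for as long as the class is loaded.
    static Master master;

    static boolean unusedSlaveSurvivesGc() {
        master = new Master();
        // The weak reference observes slave99 without keeping it alive.
        WeakReference<Slave> probe = new WeakReference<>(master.slave99);
        System.gc(); // a hint only; the JVM may ignore it
        // slave99 is still referenced from a field of a reachable object,
        // so the collector is not allowed to clear the probe.
        return probe.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("slave99 still reachable: " + unusedSlaveSurvivesGc());
    }
}
```

Even though nothing ever reads `slave99`, the probe stays set, because the object is referenced from a heap field of a reachable `Master`.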

There are various articles you can read on tracing GC:

score 3 · Accepted Answer · answered Aug 02 '18 at 13:37

There is no simple answer to this. You say “Object ‘master’ is still being used”, but not in which way. In principle, reading and writing the fields of an object, and even invoking methods on it, can be optimized so that the object’s memory is no longer required.

Or, as the specification puts it:

Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.

Another example of this occurs if the values in an object's fields are stored in registers. The program may then access the registers instead of the object, and never access the object again. This would imply that the object is garbage.

As discussed in “finalize() called on strongly reachable object in Java 8”, this is more than a theoretical issue.

But the specification also says:

Note that this sort of optimization is only allowed if references are on the stack, not stored in the heap.

… (within the example) the inner class object should be reachable for as long as the outer class object is reachable.

This implies that as long as your “master” object has references to “slave50” through “slave99”, they must be considered reachable for as long as the “master” object is considered reachable, but in principle, it is allowed to collect them all together. According to the same rules, even the still-in-use “slave0” through “slave25” could get collected then, if the optimized code is capable of running without accessing their memory again.

Note that since optimized code is intended to behave just like the original code, your program won’t notice the difference.
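The one situation in which this early unreachability can matter is cleanup code such as `finalize()`, as in the linked question. Since Java 9, `java.lang.ref.Reference.reachabilityFence` exists to keep an object reachable up to a given point. The following is a minimal sketch of that usage pattern, assuming a hypothetical `Handle` class whose method only needs a field value, not the object itself; it is an illustration, not code from the answer.

```java
import java.lang.ref.Reference;

class FenceDemo {
    // Hypothetical class: imagine that cleanup code would invalidate `data`
    // once a Handle instance becomes unreachable.
    static class Handle {
        private final long[] data = {1, 2, 3};

        long sum() {
            // After this read, optimized code no longer needs `this`,
            // so the object could be treated as unreachable mid-method.
            long[] local = data;
            try {
                long total = 0;
                for (long value : local) {
                    total += value;
                }
                return total;
            } finally {
                // Keeps `this` strongly reachable until the loop has finished.
                Reference.reachabilityFence(this);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(new Handle().sum()); // prints 6
    }
}
```

For ordinary objects without cleanup actions the fence is unnecessary, which matches the point that the program won’t notice the difference.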

So there are capabilities to detect unused objects, even if they “would naively be considered reachable”, but they usually depend on the optimization state of the method’s code, because it is the JVM’s optimizer that analyzes code, not the garbage collector. In that regard, local variable scope is purely a compile-time thing. In unoptimized code, the garbage collector may sometimes consider a reference to still exist although the local variable is already out of scope from the source code’s perspective. More often, it happens the other way round: unused local variables disappear in optimized code even while they are still in scope from the source code’s perspective. In either case, returning from a method destroys the entire stack frame, including all local variables, so you never need to set local variables to null before returning.

The best strategy is to never insert any explicit action to “help the garbage collector”, unless you encounter an actual problem in a scenario that the JVM can’t handle sufficiently. Such scenarios are really rare.
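If you do hit one of those rare cases, for example a long-lived object that keeps large, no longer needed data alive through its fields, the option mentioned in the comments is to drop the heap reference yourself. The following is a minimal sketch with hypothetical names (`Master`, `Slave`, `releaseUnusedSlaves`); the fields must be non-final for this to work, and whether the probe reports a collection depends on the JVM honoring the `System.gc()` hint.

```java
import java.lang.ref.WeakReference;

class ReleaseDemo {
    // Hypothetical stand-ins for the classes described in the question.
    static class Slave {
        final byte[] payload = new byte[1024];
    }

    static class Master {
        Slave slave0 = new Slave();
        Slave slave50 = new Slave();

        // Called once the program knows it will not need slave50 again.
        void releaseUnusedSlaves() {
            slave50 = null;
        }
    }

    public static void main(String[] args) {
        Master master = new Master();
        // The weak reference observes slave50 without keeping it alive.
        WeakReference<Slave> probe = new WeakReference<>(master.slave50);

        master.releaseUnusedSlaves();
        System.gc(); // a hint only; collection is not guaranteed

        System.out.println("slave50 collected: " + (probe.get() == null));
        System.out.println("slave0 still present: " + (master.slave0 != null));
    }
}
```

After `releaseUnusedSlaves()` nothing in the heap refers to the old `slave50` object, so it becomes eligible for collection while `master` and `slave0` stay in use.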
