
Garbage collection in the young generation is fast because of one fact: most objects in the young generation die soon. So when a young generation collection happens, the few objects that are still alive are moved into the old generation. From then on, all the data in the young generation can be treated as useless (part of it has been moved, the rest is unreachable), and this piece of memory can be reused without further scanning. Unreachable objects are never scanned, which saves a lot of time.

But I have a question: Java has a method finalize() (though it is deprecated as of Java 9). If a collector guarantees that finalize() will get called, it needs to scan unreachable objects, not just go through the living objects, and the speed advantage seems to disappear.

So,

  1. Am I right? (finalize() makes young generation collections much slower and the speed advantage disappears.)
  2. If not, how does the JVM handle this problem? (For example, by ignoring finalize()?)
  3. Besides the problem above, does the young generation have other advantages?

Edit: I'm writing a GC for a language with a finalize() feature, and I just can't make the collection fast in this situation.

  • `finalize()` is invoked on an object before it is collected. It slows down the collection of every generation, not just the young generation. – Maurice Perry Nov 07 '19 at 08:37
  • @MauricePerry, so are you suggesting that the young generation has other advantages? Could you explain more? –  Nov 07 '19 at 08:51
  • @ElliottFrisch the *name* garbage collector suggests that it doesn’t collect living objects, but the *definition* requires it to scan the live objects, as the only way to recognize a dead object is by proving that it is not live, by traversing the graph of reachable objects. And generational garbage collectors *only* look at the live objects, as they relocate them and consider their original space (Eden/TLAB or survivor space) as empty, regardless of how many dead objects have been there previously. – Holger Nov 07 '19 at 09:14
  • You are talking about a _generational_ GC (`CMS` or `G1`, for example), but not _all_ GC algorithms are like that; _Shenandoah_ is not, for example. – Eugene Nov 22 '19 at 20:44
  • Another important advantage is that generational GC reduces the number of objects that need to be scanned. Under ideal conditions young objects die fast but most objects are old. A traditional mark-and-sweep GC has to scan the full heap, while a generational GC can focus on objects that are likely to be garbage. – Moritz Jan 26 '21 at 22:28

1 Answer


Basically, you got it right. But survivor objects are not immediately copied to the old generation. Instead, they are copied to the Survivor space, where they have to survive a configurable number of garbage collections before they are promoted to the old generation.
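The aging rule described above can be sketched as a toy model. This is only an illustration, not HotSpot's implementation: the class `SurvivorAging`, its `Obj` type and the `minorCollect` method are hypothetical names, and the collection of reachable objects is assumed to come from a tracing step that is not shown. In HotSpot the threshold is influenced by the `-XX:MaxTenuringThreshold` option, and the real promotion policy has more rules than this sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Toy model of survivor aging; all names here are hypothetical.
final class SurvivorAging {
    static final class Obj {
        final String name;
        int age;          // number of young collections survived so far
        boolean tenured;  // true once promoted to the old generation
        Obj(String name) { this.name = name; }
    }

    private final int tenuringThreshold;
    List<Obj> survivor = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    SurvivorAging(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    // reachable: objects found by tracing from the roots (tracing not shown).
    // Only these are visited; dead objects are never looked at.
    void minorCollect(Collection<Obj> reachable) {
        List<Obj> toSpace = new ArrayList<>();
        for (Obj o : reachable) {
            if (o.tenured) continue;          // old objects are not moved here
            o.age++;
            if (o.age >= tenuringThreshold) {
                o.tenured = true;
                old.add(o);                   // promoted to the old generation
            } else {
                toSpace.add(o);               // copied to the survivor space
            }
        }
        survivor = toSpace;                   // the previous young space is now free
    }
}
```

With a threshold of 2, an object passed to `minorCollect` twice ends up in `old`, while an object that is reachable only during the first collection simply disappears from `survivor` without ever being inspected again.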

But the fundamental assumption is correct. Efficiency is reduced when objects survive longer than necessary, and having to invoke finalize() extends an object’s lifetime.

The fundamental fix is to make this an exceptional case. This is even addressed in the specification:

For efficiency, an implementation may keep track of classes that do not override the finalize method of class Object, or override it in a trivial way.

For example:

protected void finalize() throws Throwable {
    super.finalize();
}

We encourage implementations to treat such objects as having a finalizer that is not overridden, and to finalize them more efficiently, as described in §12.6.1.

The HotSpot JVM recognizes when the method inherited from Object has not been overridden or when it has been overridden with an empty method. As far as I know, a sole super call is not always recognized, but for your own language, it might be possible to recognize it.

So for the majority of objects in Java, finalize() has no effect and is therefore never invoked. This solves the problem of extended lifetime, as it now only applies to a few objects.
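For a language runtime written in Java, the "is finalize() overridden at all?" part of this check could be approximated with reflection, as in the following sketch. It is not how HotSpot does it, and `FinalizerCheck`, `Plain`, `WithFinalizer` and `Child` are hypothetical names. It only detects that some class below Object declares finalize(); recognizing an empty or super-only body would require inspecting the method's bytecode, which is not shown.

```java
import java.lang.reflect.Method;

// Hypothetical helper: decides per class whether instances need
// finalization tracking at all. The result could be cached per class.
final class FinalizerCheck {
    private FinalizerCheck() {}

    // True if any class between 'type' and Object (exclusive) declares finalize().
    static boolean overridesFinalize(Class<?> type) {
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Method m : c.getDeclaredMethods()) {
                if (m.getName().equals("finalize") && m.getParameterCount() == 0) {
                    return true;
                }
            }
        }
        return false;
    }
}

class Plain {}

class WithFinalizer {
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        // non-trivial cleanup would go here
    }
}

class Child extends WithFinalizer {}
```

Here `overridesFinalize(Plain.class)` is false, while it is true for both `WithFinalizer.class` and `Child.class`, because the override is inherited.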

The need to scan dead objects is eliminated by creating a special reference object only for those objects that have a non-trivial finalizer; that reference object is itself kept reachable. So it’s still only the reachable objects that have to be scanned. More details can be found in this answer.
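A minimal sketch of that idea, for a GC you write yourself, might look like the following. It is a toy model under loud assumptions: `ToyYoungGen`, `Obj`, `finalizerRefs` and `pendingFinalization` are hypothetical names, the "heap" is just Java objects, nothing is actually copied, and the registry list stands in for the special reference objects. The point it illustrates is that the collector visits the live graph plus the registry, never the dead objects.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model; every name here is hypothetical and nothing is actually copied.
final class ToyYoungGen {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    final List<Obj> roots = new ArrayList<>();
    // Stands in for the special reference objects: one entry per object
    // with a non-trivial finalizer, registered at allocation time.
    private final List<Obj> finalizerRefs = new ArrayList<>();
    final Deque<Obj> pendingFinalization = new ArrayDeque<>();
    int visited; // objects touched by the most recent collection

    Obj allocate(String name, boolean hasNonTrivialFinalizer) {
        Obj o = new Obj(name);
        if (hasNonTrivialFinalizer) finalizerRefs.add(o);
        return o;
    }

    Set<Obj> collect() {
        visited = 0;
        Set<Obj> live = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Obj root : roots) trace(root, live);

        // Walk the registry, not the heap: its length is the number of
        // finalizable objects, independent of how much garbage exists.
        List<Obj> stillReachable = new ArrayList<>();
        List<Obj> toFinalize = new ArrayList<>();
        for (Obj o : finalizerRefs) {
            (live.contains(o) ? stillReachable : toFinalize).add(o);
        }
        for (Obj o : toFinalize) {
            trace(o, live);               // keep it and its referents alive once more
            pendingFinalization.add(o);   // a finalizer thread would drain this queue
        }
        finalizerRefs.clear();
        finalizerRefs.addAll(stillReachable);
        return live;
    }

    private void trace(Obj start, Set<Obj> live) {
        Deque<Obj> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            Obj o = stack.pop();
            if (!live.add(o)) continue;   // already visited
            visited++;
            for (Obj f : o.fields) stack.push(f);
        }
    }
}
```

An unreachable object with a finalizer survives exactly one extra collection in this model: the first `collect()` moves it to `pendingFinalization` and drops its registry entry, and the next one no longer sees it unless the finalizer made it reachable again. Objects without a finalizer cost nothing once they are dead.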

Holger