What is resurrection in Garbage Collection? How Do Phantom References Solve It? Is there any practical example of it?

Question

I came across this term (resurrection) while studying the different types of references. One area that confuses me is PhantomReference. So far I have never come across a real use case where, in hindsight, I felt I should have used a phantom reference.

While searching for use cases, I found that phantom references prevent objects from being resurrected.

To be clear, I understand the definitions of both object "resurrection" in finalize and phantom references.

Where I am having trouble is finding a "real" use case:

When to use object resurrection?
When to use a phantom reference?
How does a phantom reference solve inadvertent object resurrection?

I would really appreciate a discussion around these topics. These are a few areas that are still hazy to me.

Thanks, Abhijit

http://stackoverflow.com/questions/1002567/reference-to-object-during-finalize — NPE, Mar 25 '13 at 06:42
http://stackoverflow.com/questions/9826741/when-to-use-phantom-references-in-java — NPE, Mar 25 '13 at 06:44
http://stackoverflow.com/questions/1599069/have-you-ever-used-phantom-reference-in-any-project — NPE, Mar 25 '13 at 06:45

score 0 · Answer 1 · answered Mar 26 '19 at 18:57

Java has two different mechanisms for reacting to the deallocation of an object. The older mechanism, using finalize, runs a particular method just before the object is deallocated. The newer mechanism, using PhantomReference, instead allows you to run a particular method just after the object is deallocated.¹

The finalize technique is more powerful, because you have access to the object's this at the time it's deallocated; but it is also much more dangerous, because it's possible to (intentionally or accidentally) create a new reference to the object in a finalizer. This can be done either directly (e.g. by assigning this to a static field) or indirectly (e.g. two objects become unreachable at the same time, one references the other, and thus an already finalized object ends up being accessed via a field of the other object while that object's finalizer runs). This situation, where an object ends up finalized but nonetheless reachable from somewhere, is known as an object resurrection; and although it has defined semantics², they tend to be fairly problematic semantics in practice, and are normally treated as equivalent to undefined behaviour.
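As a minimal sketch of the direct case, the hypothetical Zombie class below (the class and field names are illustrative, not from any library) resurrects itself by storing this in a static field from its finalizer. Whether and when the finalizer runs depends on the JVM: System.gc() is only a hint, finalize is deprecated, and recent JDKs can disable finalization entirely, so the demo only reports what it observes.

```java
// Hypothetical class illustrating direct resurrection; names are illustrative only.
class Zombie {
    // A static field is reachable from a GC root, so anything stored here is alive.
    static Zombie resurrected;

    final String name;

    Zombie(String name) {
        this.name = name;
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        // Storing `this` makes the object strongly reachable again,
        // even though it has already been finalized.
        resurrected = this;
    }
}

class ResurrectionDemo {
    public static void main(String[] args) throws InterruptedException {
        new Zombie("z1"); // no reference is kept, so it is eligible for collection

        // System.gc() is only a request; the finalizer may run late or never.
        for (int i = 0; i < 50 && Zombie.resurrected == null; i++) {
            System.gc();
            Thread.sleep(20);
        }

        if (Zombie.resurrected == null) {
            System.out.println("finalizer has not run (yet)");
        } else {
            System.out.println("resurrected: " + Zombie.resurrected.name);
        }
    }
}
```

If the finalizer does run, Zombie.resurrected points at an object whose finalizer will not be invoked a second time, which is one reason code that relies on resurrection is fragile.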

The PhantomReference method of reacting to an object deallocation is basically a constrained form of finalization that prevents you from making mistakes by not giving you the tools with which to make them: the object is already (effectively) deallocated by the time you react to the deallocation, so you have no chance to accidentally resurrect it or any other objects that were deallocated at the same time. (In particular, the PhantomReference has no access to the object's this pointer; PhantomReference#get always returns null.) Phantom references also have other advantages, e.g. the API allows you precise control over what thread the finalizer runs on and what it will be doing at the time.
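To illustrate the point about PhantomReference#get, here is a small sketch contrasting it with WeakReference#get while the referent is still strongly reachable. The GetDemo class and its method name are made up for this example.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical demo class; only the java.lang.ref types are real API.
class GetDemo {
    // Returns { visibleViaWeak, visibleViaPhantom } for a live referent.
    static boolean[] referentVisible() {
        Object target = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        WeakReference<Object> weak = new WeakReference<>(target);
        PhantomReference<Object> phantom = new PhantomReference<>(target, queue);

        // The phantom reference never hands the referent back, alive or not.
        boolean viaPhantom = phantom.get() != null;
        // The weak reference does, as long as the referent has not been collected;
        // `target` is still in use here, so it is strongly reachable.
        boolean viaWeak = weak.get() == target;

        return new boolean[] { viaWeak, viaPhantom };
    }

    public static void main(String[] args) {
        boolean[] result = referentVisible();
        System.out.println("weak sees referent: " + result[0]);
        System.out.println("phantom sees referent: " + result[1]);
    }
}
```

Because there is no way to obtain the referent from the phantom reference, code reacting to it cannot store the object anywhere, which is what rules out resurrection.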

So why would you use a phantom reference? Basically, any situation where you'd want to react to an object's deallocation should use PhantomReference if it can, because it uses an API that prevents a wide range of common mistakes with finalizers. finalize (which is now deprecated) should be reserved only for situations where there's really no other option.

Unfortunately, the API for PhantomReference, despite being much harder to misuse than that for finalize, is also much harder to use in general:

You need an object to hold the PhantomReference itself; the PhantomReference will only fire if this object is still alive. For example, if you want to remove metadata about an object from a map when that object dies, it would make sense to store the PhantomReference as/within a separate field in the same object that implements the map (WeakHashMap in Java works in a similar way, although it is built on weak references rather than phantom references). If you're using finalizers to manage global resources like filehandles, the PhantomReference will thus need to be held alive by some global structure (e.g. a collection in a static field of some class).
You need a ReferenceQueue that handles scheduling of the finalizers.
You need a method that does your finalization work – something that will run when the object you're monitoring is deallocated. PhantomReference doesn't have direct provision for one of these; the usual technique is to extend PhantomReference and give the resulting derived class the method in question.
You need to poll the reference queue; this is the operation that specifies which thread the finalizers run on and what it's doing at the time. Possibilities include using a separate thread for reference queue polling, or using the main thread of your program at a time when it's not doing anything important (e.g. just before it starts blocking on input).
Polling the reference queue doesn't actually run the finalizer (after all, PhantomReference doesn't have direct provision for a method that's run upon finalization). Rather, polling the reference queue just gives you the PhantomReference object that saw a deallocation. As the PhantomReference class itself has no useful methods for reacting to this, you'll need to cast it to the appropriate class, then run the method you created.
Polling the reference queue also doesn't deallocate the PhantomReference object (which you have to be holding alive in some other object; otherwise it wouldn't work). So when you see a phantom reference dequeue, if you want to avoid a memory leak, you have to manually remove it from whatever was holding it alive (normally a collection).
If you need further information about the object that was deallocated (e.g. in the case of a map like WeakHashMap, this would be a reference to the map entry that needs removal), you'll have to store it somewhere, because it won't otherwise be available when the object is deallocated. Typically you'd store the data in the PhantomReference itself (seeing as you're using a derived class of it anyway, you can create fields in the derived class to store the data). Bear in mind that this data must not reference the object whose deallocation you want to react to, because otherwise you'll end up holding the object alive and it'll never get deallocated.
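The requirements above can be sketched roughly as follows. ResourceTracker, CleanupRef, and the method names are hypothetical; only the java.lang.ref and java.util classes are real API. The sketch assumes cleanup only needs a resource name, and it polls the queue on whatever thread calls drain().

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker tying the pieces together; all names are illustrative.
class ResourceTracker {
    // Derived reference that carries the cleanup data and the cleanup method.
    // It must not hold a strong reference to the tracked object itself.
    static final class CleanupRef extends PhantomReference<Object> {
        private final String resourceName;

        CleanupRef(Object referent, ReferenceQueue<Object> queue, String resourceName) {
            super(referent, queue);
            this.resourceName = resourceName;
        }

        String cleanUp() {
            return "released " + resourceName;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Holds the references themselves alive until they have been processed.
    private final Set<CleanupRef> pending = ConcurrentHashMap.newKeySet();

    CleanupRef track(Object owner, String resourceName) {
        CleanupRef ref = new CleanupRef(owner, queue, resourceName);
        pending.add(ref);
        return ref;
    }

    // Non-blocking poll; cleanup runs on the calling thread.
    // Returns how many references were processed.
    int drain() {
        int processed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            CleanupRef cleanup = (CleanupRef) ref; // cast back to the derived class
            System.out.println(cleanup.cleanUp());
            pending.remove(cleanup);               // avoid leaking the reference object
            processed++;
        }
        return processed;
    }

    int pendingCount() {
        return pending.size();
    }
}

class TrackerDemo {
    public static void main(String[] args) throws InterruptedException {
        ResourceTracker tracker = new ResourceTracker();
        tracker.track(new Object(), "handle-1"); // owner is unreachable right away
        System.gc();                             // only a hint, nothing is guaranteed
        Thread.sleep(100);
        System.out.println("processed " + tracker.drain()
                + ", pending " + tracker.pendingCount());
    }
}
```

A dedicated daemon thread could call queue.remove() in a loop instead of poll(); the choice between the two is what decides which thread the cleanup runs on.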

Although it's possible to deal with all this complexity yourself, and is occasionally necessary if you want to do something unusual with phantom references, it would be far more common to use a pre-written library that wraps up the required operations into a more convenient API. For example, java.lang.ref.Cleaner uses phantom references internally to provide an API more similar to that of finalize, but (because it's based on phantom references) one that's safe against accidental resurrections and similar issues. As such, although phantom references tend to be quite useful in general for reacting to objects becoming unreachable, it's rare for programmers to actually deal with them directly; using a library that uses them internally would be much more common.
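As a rough illustration of the library route, the sketch below uses java.lang.ref.Cleaner (available since Java 9). NativeBuffer and State are hypothetical names, and the "resource" is just a flag standing in for something like a native handle. The cleanup state lives in a static nested class so that it cannot accidentally capture the NativeBuffer instance, which would keep it reachable forever.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical resource wrapper; only Cleaner and Cleaner.Cleanable are real API.
class NativeBuffer implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Cleanup action and its data. Static, so it holds no reference
    // back to the NativeBuffer it cleans up after.
    static final class State implements Runnable {
        final AtomicBoolean released = new AtomicBoolean(false);

        @Override
        public void run() {
            // Stand-in for releasing a real resource.
            released.set(true);
        }
    }

    final State state = new State();
    private final Cleaner.Cleanable cleanable;

    NativeBuffer() {
        // The Cleaner invokes state.run() after this object becomes phantom
        // reachable, unless clean() has already been called explicitly.
        this.cleanable = CLEANER.register(this, state);
    }

    @Override
    public void close() {
        cleanable.clean(); // runs the action at most once
    }
}

class CleanerDemo {
    public static void main(String[] args) {
        try (NativeBuffer buffer = new NativeBuffer()) {
            System.out.println("released before close: " + buffer.state.released.get());
        }
        // If close() is never called, the Cleaner acts as a safety net once the
        // object is collected; the timing of that is up to the JVM.
    }
}
```

Explicit close() stays the primary path here; the Cleaner only covers the case where the caller forgets.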

¹ In older versions of Java, the phantom reference technique actually held the memory for the object allocated until after the phantom reference was cleared; but this was just an implementation detail, because there was no way to access the memory in question, and objects should be treated as though they were already deallocated from a phantom reference handler, as you can't do anything with them anyway.

² A resurrected object remains allocated until it becomes unreachable, at which point it's deallocated without running its finalizer, unless it gets resurrected a second time by the finalizer of some other object that got deallocated at the same time. Code that relies on this behaviour is probably broken.
