One of the many issues with `finalize` methods in Java is the "object resurrection" issue (explained in this question): if an object is finalized, and it saves a copy of `this` somewhere globally reachable, the reference to the object "escapes" and you end up with a finalized but living object (that won't be finalized again, and otherwise is something of a problem).
In order to avoid the creation of resurrected objects, the normal advice (as, e.g., seen in this answer) is to create a fresh instance of the object, rather than save the object itself; this would typically be accomplished by copying all the object's fields into a fresh object. In most cases, this achieves the goal of allowing the original object to be deallocated, rather than resurrected.
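As a minimal sketch of that advice (the `Session` class, its fields, and the `SAVED` list are hypothetical names invented for illustration, not taken from any real code), the finalizer builds a fresh instance from the fields and stores that, instead of storing `this`:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class used only for illustration.
class Session {
    // Globally reachable store; anything placed here stays alive.
    static final List<Session> SAVED = new ArrayList<>();

    final String user;
    final int requestCount;

    Session(String user, int requestCount) {
        this.user = user;
        this.requestCount = requestCount;
    }

    // Copies every field into a fresh instance and saves that instance,
    // so that no reference to `this` escapes.
    Session saveCopy() {
        Session copy = new Session(user, requestCount);
        synchronized (SAVED) {
            SAVED.add(copy);
        }
        return copy;
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        // Writing SAVED.add(this) here would resurrect the object instead.
        saveCopy();
    }
}
```

This only works cleanly because both fields are a `String` and an `int`, neither of which can lead back to the `Session` being finalized.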
However, the Java garbage collector supports garbage collection of reference cycles; this means that an object can be finalized while (directly or indirectly) containing a reference to itself, and two objects can be finalized while (directly or indirectly) containing references to each other. In this case, the "copy all the fields into a new object" advice doesn't actually solve the problem; although we discard the `this` reference once the finalizer finishes running, the partially finalized object will be resurrected via the reference from the field. So we end up with the object being resurrected anyway.
In the case where the object indirectly holds a reference to itself, it's possible to recursively look through all the fields of the object until we find the self-reference (in which case we can replace it with a reference to the new object we're constructing), thus preventing the resurrection. So that solves the issue in that case.
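To make the self-reference case concrete, here is a small sketch with a hypothetical `Node` class (all names are made up for illustration). It only handles a direct self-reference in a single field; the indirect case would need the recursive walk described above:

```java
// Hypothetical node type used only for illustration.
class Node {
    final String label;
    Node next; // may point back at this same node

    Node(String label) {
        this.label = label;
    }

    // Field-by-field copy: the copy still points at whatever the original
    // pointed at, including the original itself.
    Node naiveCopy() {
        Node copy = new Node(label);
        copy.next = next;
        return copy;
    }

    // Same copy, but a direct self-reference is redirected to the copy,
    // so the copy no longer keeps the original reachable.
    Node copyRedirectingSelf() {
        Node copy = new Node(label);
        copy.next = (next == this) ? copy : next;
        return copy;
    }
}
```

With `original.next = original`, the result of `naiveCopy()` has a `next` field that still points at the original, whereas the result of `copyRedirectingSelf()` points at itself.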
However, if two objects hold references to each other (and thus both get deallocated at the same time), and we're creating a new instance of each, then each of the new objects will be holding a reference to the old, finalized object (rather than the new object that's been constructed as a replacement). This is obviously an undesirable state of affairs, so one thing I've been looking into is attempting to use the same solution as in the single-object case: recursively scanning the fields of the (living, newly constructed) objects looking for finalized objects, and replacing them with the corresponding replacement objects.
The problem is: how can I recognise a finalized/resurrected object, when I'm doing this? The obvious way to do this is to somehow record the identity of the finalized object in the finalizer, and then compare all the objects we find during the recursive scan with a list of finalized objects. The problem is, there doesn't seem to be a valid way to record the identity of the object in question:
- A regular (strong) reference would hold the object alive, effectively resurrecting it automatically, and gives no method via which to determine that the object is not in fact referenced. This would solve the problem of identifying the resurrected objects, but comes with a problem of its own: although the resurrected objects would never be used, except for their identities, there would be no means via which to deallocate them (e.g. you can't use a `PhantomReference` to detect that the object is now truly dead, like you normally would in Java, because the object is now strongly reachable and thus the phantom reference never clears). So this would effectively mean that the objects in question stay allocated forever, causing a memory leak.
- Using a weak reference was my first idea, but has the problem that at the time we construct the `WeakReference` object, the referenced object is not in fact strongly, softly, nor weakly reachable. As such, as soon as we store the `WeakReference` anywhere that's strongly reachable (to prevent the `WeakReference` itself being deallocated), the `WeakReference`'s target becomes weakly reachable and the reference automatically clears. So we can't store any information that way.
- Using a phantom reference has the problem that there's no way to compare a phantom reference with an object to see if that reference references that object. (Maybe there should be: unlike `get()`, which can resurrect an object, there's never any danger in this operation because we clearly have a reference to the object anyway. But it doesn't exist in the Java API. Likewise, `.equals()` on `PhantomReference` objects is `==`, not value equality, so you can't use it to determine whether two phantom references reference the same thing.)
- Using `System.identityHashCode()` to record a number corresponding to the object's identity almost works: deallocation of the object won't change the recorded number, the number won't prevent the object's deallocation, and resurrecting an object leaves the value the same. Unfortunately, being a `hashCode`, it's subject to collisions, so might have false positives in which an object appears to be resurrected when it isn't.
- One final possibility is to modify the object itself to mark it as finalized (and track the location of its replacement), meaning that observing this mark on a strongly reachable object would reveal it as a resurrected object, but this requires adding an additional field to any object that might be involved in a reference cycle.
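For reference, the last option in the list (marking the object itself) might look roughly like the following. This is only a sketch under assumptions: `Peer`, `replacement`, `replaceSelf`, `fixUp`, and `LIVE` are hypothetical names, only one field is scanned rather than a whole object graph, and thread-safety between the finalizer thread and the code calling `fixUp` is not addressed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical type used only for illustration.
class Peer {
    // Globally reachable store for the replacement copies.
    static final List<Peer> LIVE = Collections.synchronizedList(new ArrayList<>());

    final String label;
    Peer other;        // may form a cycle with another Peer
    Peer replacement;  // the extra field: non-null once this object was finalized

    Peer(String label) {
        this.label = label;
    }

    // What the finalizer does: build the copy, save it, and leave a
    // forwarding mark on the old object.
    Peer replaceSelf() {
        Peer copy = new Peer(label);
        copy.other = other; // may still point at an old, finalized object
        replacement = copy;
        LIVE.add(copy);
        return copy;
    }

    // Called on a live copy: if its field points at a marked (finalized)
    // object, redirect it to that object's replacement.
    void fixUp() {
        if (other != null && other.replacement != null) {
            other = other.replacement;
        }
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        replaceSelf();
    }
}
```

Note that `fixUp()` can only redirect a reference once the object it points at has itself been finalized and marked, so for a two-object cycle it has to run after both finalizers have completed. The extra `replacement` field is exactly the cost mentioned in the last list item.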
In summary, my underlying problem is "given an object that's currently being finalized, safely create a copy of it, without accidentally resurrecting any objects that may be in a reference cycle with it in the process". The approach I've been trying to use is "when an object that might potentially be involved in a cycle is finalized, keep track of that object's identity so that it can subsequently be replaced with its copy if it turns out to be reachable from another finalized object"; but none of the five approaches mentioned above seems satisfactory.
Is there some other way to keep track of finalized objects, so that they can be recognised if accidentally resurrected? Is there an entirely different solution to the original problem, of safely making a copy of an object during its finalization?