
One of the many issues with finalize methods in Java is the "object resurrection" issue (explained in this question): if an object is finalized, and it saves a copy of this somewhere globally reachable, the reference to the object "escapes" and you end up with a finalized but living object (that won't be finalized again, and otherwise is something of a problem).
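To make the escape concrete, here is a minimal sketch; the class name `Zombie` and the static field `escaped` are hypothetical. Whether and when the garbage collector actually invokes `finalize()` is not guaranteed (and the method is deprecated since Java 9), so the sketch only shows the assignment that lets the reference escape.

```java
// Hypothetical class used only to illustrate the escape of 'this'.
class Zombie {
    // Globally reachable slot; storing 'this' here is what lets the reference escape.
    static volatile Zombie escaped;

    final String label;

    Zombie(String label) {
        this.label = label;
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        // After this assignment the finalized object is strongly reachable again.
        escaped = this;
    }
}
```

Once the collector has run this finalizer, `Zombie.escaped` refers to an object whose `finalize()` will not be invoked a second time.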

In order to avoid the creation of resurrected objects, the normal advice (as, e.g., seen in this answer) is to create a fresh instance of the object, rather than save the object itself; this would typically be accomplished by copying all the object's fields into a fresh object. In most cases, this achieves the goal of allowing the original object to be deallocated, rather than resurrected.

However, the Java garbage collector supports garbage collection of reference cycles; this means that an object can be finalized while (directly or indirectly) containing a reference to itself, and two objects can be finalized while (directly or indirectly) containing references to each other. In this case, the "copy all the fields into a new object" advice doesn't actually solve the problem; although we discard the this reference once the finalizer finishes running, the partially finalized object will be resurrected via the reference from the field. So we end up with the object being resurrected anyway.

In the case where the object indirectly holds a reference to itself, it's possible to recursively look through all the fields of the object until we find the self-reference (in which case we can replace it with a reference to the new object we're constructing), thus preventing the resurrection. So that solves the issue in that case.

However, if two objects hold references to each other (and thus both get deallocated at the same time), and we're creating a new instance of each, then each of the new objects will be holding a reference to the old, finalized object (rather than the new object that's been constructed as a replacement). This is obviously an undesirable state of affairs, so one thing I've been looking into is attempting to use the same solution as in the single-object case: recursively scanning the fields of the (living, newly constructed) objects looking for finalized objects, and replacing them with the corresponding replacement objects.
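The redirection step described above amounts to a cycle-aware copy. A minimal sketch, assuming a hypothetical `Node` class with a single `peer` field and a hypothetical `CycleCopier` helper (neither is from my real code), keeps an identity-based map from each old object to its replacement so that references inside the copies point at the new objects. It leaves out the finalizers entirely and only shows the redirection logic.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical node type: one payload and one reference that may form a cycle.
class Node {
    String name;
    Node peer;

    Node(String name) {
        this.name = name;
    }
}

class CycleCopier {
    // Maps each original object (by identity, not equals) to its replacement.
    private final Map<Node, Node> replacements = new IdentityHashMap<>();

    // Returns the replacement for 'original', creating it on first use.
    Node copy(Node original) {
        if (original == null) {
            return null;
        }
        Node existing = replacements.get(original);
        if (existing != null) {
            return existing;
        }
        Node fresh = new Node(original.name);
        // Register before recursing so that cycles terminate.
        replacements.put(original, fresh);
        fresh.peer = copy(original.peer);
        return fresh;
    }
}
```

Note that the map holds strong references to the originals, so on its own this only illustrates the replacement logic; it does not answer the question of how to record an object's identity without keeping it alive.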

The problem is: how can I recognise a finalized/resurrected object, when I'm doing this? The obvious way to do this is to somehow record the identity of the finalized object in the finalizer, and then compare all the objects we find during the recursive scan with a list of finalized objects. The problem is, there doesn't seem to be a valid way to record the identity of the object in question:

  • A regular (strong) reference would hold the object alive, effectively resurrecting it automatically, and gives no method via which to determine that the object is not in fact referenced. This would solve the problem of identifying the resurrected objects, but comes with a problem of its own: although the resurrected objects would never be used, except for their identities, there would be no means via which to deallocate them (e.g. you can't use a PhantomReference to detect that the object is now truly dead, like you normally would in Java, because the object is now strongly reachable and thus the phantom reference never clears). So this would effectively mean that the objects in question stay allocated forever, causing a memory leak.
  • Using a weak reference was my first idea, but has the problem that at the time we construct the WeakReference object, the referenced object is not in fact strongly, softly, nor weakly reachable. As such, as soon as we store the WeakReference anywhere that's strongly reachable (to prevent the WeakReference itself being deallocated), the WeakReference's target becomes weakly reachable and the reference automatically clears. So we can't store any information that way.
  • Using a phantom reference has the problem that there's no way to compare a phantom reference with an object to see if that reference references that object. (Maybe there should be – unlike get(), which can resurrect an object, there's never any danger in this operation because we clearly have a reference to the object anyway – but it doesn't exist in the Java API. Likewise, .equals() on PhantomReference objects is ==, not value equality, so you can't use it to determine whether two phantom references reference the same thing.)
  • Using System.identityHashCode() to record a number corresponding to the object's identity almost works – deallocation of the object won't change the recorded number, the number won't prevent the object's deallocation, and resurrecting an object leaves the value the same – but unfortunately, being a hashCode, it's subject to collisions, so might have false positives in which an object appears to be resurrected when it isn't.
  • One final possibility is to modify the object itself to mark it as finalized (and track the location of its replacement), meaning that observing this mark on a strongly reachable object would reveal it as a resurrected object, but this requires adding an additional field to any object that might be involved in a reference cycle.
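For illustration, the last approach (marking the object itself) could look roughly like the following sketch. The class `Marked`, its `replacement` field and the `MarkScanner` helper are hypothetical names, and a real scan would have to visit every field of every reachable object rather than a single `other` field.

```java
// Hypothetical class that might take part in a reference cycle.
class Marked {
    String data;
    Marked other; // may point back into the cycle

    // Extra field required by this approach: null while the object is ordinary,
    // set to the copy once this instance has been finalized and replaced.
    Marked replacement;

    Marked(String data) {
        this.data = data;
    }
}

class MarkScanner {
    // Follows replacement marks until an unmarked (never finalized) object is found.
    static Marked resolve(Marked ref) {
        while (ref != null && ref.replacement != null) {
            ref = ref.replacement;
        }
        return ref;
    }

    // Rewrites the field of a live object so that it no longer points at a marked object.
    static void redirect(Marked live) {
        live.other = resolve(live.other);
    }
}
```

Each finalizer would create the copy and set `replacement` on the old object; a later scan calling `redirect` on the copies then swaps any marked object for its replacement.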

As a summary, my underlying problem is "given an object that's currently being finalized, safely create a copy of it, without accidentally resurrecting any objects that may be in a reference cycle of it in the process". The approach I've been trying to use is "when an object that might potentially be involved in a cycle is finalized, keep track of that object's identity so that it can subsequently be replaced with its copy if it turns out to be reachable from another finalized object"; but none of the five approaches mentioned above seems satisfactory.

Is there some other way to keep track of finalized objects, so that they can be recognised if accidentally resurrected? Is there an entirely different solution to the original problem, of safely making a copy of an object during its finalization?

smithaiw
  • *"In order to avoid the creation of resurrected objects, the normal advice [...]"* the normal advice is to not use finalizers. and if you must use finalizers to not resurrect objects. – the8472 Mar 26 '19 at 02:05
  • @the8472: This is the normal advice on *how* to not resurrect objects if you must use finalizers. Unfortunately, it doesn't quite cover all cases; this question is, in effect, asking whether the situation is fixable, or whether it's unsalvageable. (Note that the resulting functionality, "save a copy of an object as it looked when it's deallocated", is impossible using any means other than finalizers; if it's impossible using finalizers too, that would be somewhat problematic.) – smithaiw Mar 26 '19 at 02:08
  • No, you don't need to do any special contortions to avoid resurrecting objects. Just don't assign them to something reachable in the object graph that lives longer than the finalizer in the finalizer method and don't call a function that would do so. The linked answer is only for cases where you intentionally resurrect one object and want to avoid unintentionally resurrecting others. – the8472 Mar 26 '19 at 02:15
  • @the8472: That's incorrect; an object can get resurrected in a *different object's* finalizer, because it may be stored (directly or indirectly) in a field of that object; and that can happen even if the original object's finalizer just prints a string, or something similarly innocuous. Java will attempt to free (and thus finalize) reference cycles, and so some of the fields of an object that's currently being finalized may be other objects that are currently being finalized. This question is about how to deal with that case. – smithaiw Mar 26 '19 at 02:18
  • I edited the title of the question to make it clearer what the potentially buggy situation is. – smithaiw Mar 26 '19 at 02:27
  • @smithaiw given that an object has a finalizer, it can only be resurrected by its own finalizer. The reachability decision is atomic. The whole object cycle is either live, dead or resurrected. – Alexey Ragozin Mar 26 '19 at 07:51
  • @AlexeyRagozin and that whole object graph may consist of multiple objects having finalizers, in which case, it is entirely unspecified in which order the finalizers will run. They may even run concurrently. – Holger Mar 26 '19 at 08:34
  • @Holger the finalizer is just a method called after the fact. Resurrection happens during the GC pause and has atomic semantics with regard to all types of references (weak, soft, phantom). – Alexey Ragozin Mar 26 '19 at 15:04
  • @AlexeyRagozin in your [previous comment](https://stackoverflow.com/questions/55348631/is-it-possible-to-keep-track-an-object-from-its-finalizer-to-detect-accidental#comment97428134_55348631) you correctly used the term “resurrection” as something that could happen within the finalizer. I don’t know what you are talking about now, when you treat the same term as something that “happens during GC pause and … has atomic semantic”. Resurrection means storing the reference to a finalizer reachable object (which doesn’t have to be `this`) into a globally visible variable, e.g. a `static` field. – Holger Mar 26 '19 at 15:14
  • @Holger resurrection happens during GC. Assigning `this` to global variable (or letting it escape any other way). Once the finalizer is called, `this` is already strongly reachable. Resurrection happens during GC if and only if an unreachable object is still reachable through one or more final references. GC resurrects that object by modifying the statically reachable finalizer queue and the reference to it. – Alexey Ragozin Mar 26 '19 at 20:57
  • @AlexeyRagozin “resurrection” is not a well defined term. So you are talking about the change from a finalizer-reachable object (graph) to strongly reachable and in this regard, you are right. But the question is about the finalizer’s activity of storing object references into global variables. Thanks for pointing at this difference. It helps understanding that any attempt to detect the reachability state is insufficient for detecting a finalizer’s wrong activity. – Holger Mar 27 '19 at 07:46

2 Answers

3

> In order to avoid the creation of resurrected objects, the normal advice (as, e.g., seen in this answer) is to create a fresh instance of the object, rather than save the object itself; this would typically be accomplished by copying all the object's fields into a fresh object.

This is not the “normal advice”; not even the linked answer claims that. The linked answer starts with “If you absolutely must resurrect objects, …” which makes it pretty clear that this is not advice on how “to avoid the creation of resurrected objects”.

The approach described in that answer is an object resurrection and, ironically, it’s precisely the scenario you describe as the problem you want to solve: a resurrection of objects (those referenced via the copied fields) by another object’s finalizer.

This keeps all but one of the problems associated with finalizers and with object resurrection. The only problem it solves is that a finalized object won’t get finalized again, which is the smallest problem.

When an application abandons an object, it doesn’t have to be in a valid state. Objects only need to be kept in a valid state when they are intended to be used again. E.g. it is normal for an application to invoke close() on objects representing resources when done with them. But it’s also reasonable to abandon an object in the middle of an operation when an error occurs. The erroneous result state can be represented by a different object and the other, now-inconsistent object is not used.

A finalizer would have to deal with all these possible object states and even worse, with unusable object states caused by finalizers. As you recognized yourself, object graphs may get collected as a whole and all their finalizers get executed in an arbitrary order or even concurrently. So it doesn’t need loops and it doesn’t need resurrection attempts to get into trouble. When object A has a reference to object B and both have finalizers, an attempt of cleaning up A may fail when needing B in the process, as B may be already finalized or even in the middle of a concurrent finalization.

In short, finalization is not even suitable for the cleanup it was originally intended for. That’s why the finalize() method has been deprecated with Java 9.

Your attempt to reuse field values of an object under finalization is just adding fuel to the flames. Just think about the A→B scenario above. When A’s finalizer copies the field values to another object, it implies copying the reference to B and it doesn’t need an attempt by B’s finalizer to do the same. It’s already enough if B’s finalizer does what it is intended for, cleaning up associated resources, thus leaving B in an unusable state.

> As a summary, my underlying problem is "given an object that's currently being finalized, safely create a copy of it, without accidentally resurrecting any objects that may be in a reference cycle of it in the process".

As explained, “an object that’s currently being finalized” and “safely” is a contradiction in itself. It doesn’t need mutual attempts of reuse to break it. Even when looking only at your original narrow problem statement, all of your approaches have the problem that they do not even attempt to prevent the problem. They all only try to detect the problem at some arbitrary later time after the fact.

That said, there is no problem in comparing the referent of a WeakReference with some other strong reference, like weakReference.get() == someStrongReference. A weak reference only gets cleared when the referent has been garbage collected, which implies that it is impossible for the strong reference to point to it, so the answer false for comparing a null reference with someStrongReference would be the right answer then.
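A minimal sketch of that comparison follows; the helper class `WeakIdentityCheck` and its method `pointsTo` are hypothetical names. The explicit `null` guard is an addition so that a cleared reference never appears to match a `null` candidate.

```java
import java.lang.ref.WeakReference;

class WeakIdentityCheck {
    // True only if the weak reference still refers to exactly 'candidate'.
    static boolean pointsTo(WeakReference<?> ref, Object candidate) {
        // A cleared reference returns null from get(), so it can never match a live object.
        return candidate != null && ref.get() == candidate;
    }
}
```

On Java 16 and later, `Reference.refersTo(obj)` offers the same identity check as part of the API and also works for phantom references.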

Holger

As the other answers indicate, trying to solve the underlying problem in this way is something that can't be accomplished, and something of a wider rethink is needed when trying to solve this sort of problem. This post describes the solution that I used to my problem, and how I got there.

Assuming that the goal is "keep track of what an object looked like at the time it became unreferenced", this can only safely be accomplished when the object itself has no finalizer (otherwise, there are a number of hard-to-solve problems, as described in the question, its comments, and the other answer). The only reason we actually need a finalizer here is that we can't otherwise get at the object after it's become unreferenced.

It's clearly a bad idea to allow the object to become unreferenced and then revive it from its finalizer. However, "reviving" an object with no finalizer is much less of a problem (as this is equivalent to the object never being deallocated at all – it doesn't end up "partially finalized" like an object with a finalizer would). This can be accomplished via creating a separate object with a finalizer, and intentionally creating a reference loop between the original object and the separate, finalizer-bearing object (which has just a finalizer and a reference to the original object, nothing else); when the object becomes otherwise unreferenced, the finalizer on the new object will run, but the original object won't be deallocated and won't end up in any awkward finalization-related state.

The finalizer will, of course, have to break the loop (removing itself from the original object), in order to avoid resurrecting itself; if a new strong reference to the original object is created during finalization (cancelling its deallocation), the finalization object will therefore have to replace itself with a new finalization object (but this is easy to do, because it doesn't carry state, there's only one reference to it, and we know where that object is).
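A rough sketch of this arrangement is shown below. The names `Subject`, `Guardian`, `SNAPSHOTS` and `recordAndDetach` are hypothetical, the "state" is reduced to a single `int`, and nothing here guarantees when (or whether) the collector runs the finalizer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical object of interest: it declares no finalizer of its own.
class Subject {
    int value;
    Guardian guardian; // closes the reference loop with the guardian

    Subject(int value) {
        this.value = value;
        this.guardian = new Guardian(this);
    }
}

// Carries nothing but the back-reference; only this class has a finalizer.
class Guardian {
    // Hypothetical sink for the recorded states (finalizers run on another thread).
    static final List<Integer> SNAPSHOTS = Collections.synchronizedList(new ArrayList<>());

    private Subject target;

    Guardian(Subject target) {
        this.target = target;
    }

    // Records the subject's state and breaks the loop so the guardian is not resurrected.
    void recordAndDetach() {
        Subject s = target;
        if (s == null) {
            return; // already detached
        }
        target = null;
        s.guardian = null;
        SNAPSHOTS.add(s.value);
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        recordAndDetach();
    }
}
```

If the finalizer stored the `Subject` itself somewhere reachable instead of a snapshot of its state, it would assign a fresh `new Guardian(s)` to `s.guardian` rather than `null`, as described above.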

In conclusion: there is no safe way to keep an object alive during its own finalization, not even if you copy all its fields elsewhere. Instead, you need to ensure that the object has no finalizer, and keep it alive using some other object's finalization.

smithaiw
  • As I tried to explain in my answer, being “partially finalized” is the least problem. In fact, objects without a finalizer (99.9% of all objects) are in that “partially finalized” state during their entire lifetime, as they can get garbage collected without a finalizer being invoked, just like objects whose finalizer already ran. Disallowing a finalizer for the resurrected object means you’re simply forbidding the scenario of your question. If you can control the implementation to that degree, you can also ensure that no other (third) object exists trying the same resurrection trick. – Holger Mar 27 '19 at 08:03
  • But another thing to consider is why you think you need to resurrect an object at all. Whatever advantage you hope to gain, it may never materialize as there can be an arbitrary time span between an object’s last use and the execution of its finalizer, if the finalizer ever runs at all. Plus, special care is needed as elaborated in [Can java finalize an object when it is still in scope?](https://stackoverflow.com/q/24376768/2711488), with [dramatic consequences when failing](https://stackoverflow.com/q/26642153/2711488). – Holger Mar 27 '19 at 08:19
  • @Holger: I'm trying to see what an object looked like at the time it became unreferenced. As such, it doesn't really matter how much time happened since the object's last use, because the object's not going to change while it's unreferenced. – smithaiw Mar 27 '19 at 17:26
  • That sounds like a different scenario than in the question. The question looked like you were truly considering reusing the objects; for merely debugging “what it looked like”, there would be no serious problems if there are similar debugging attempts in a cyclic graph. They would generate redundant reports, but not cause a real problem. But then, there’s also no reason to create a new object referencing the abandoned object. – Holger Mar 27 '19 at 17:36