7

For a library that involves asynchronous operations, I have to keep a reference to an object alive until a certain condition is met.

(I know, that sounds unusual. So here is some context, although it may not strictly be relevant: The object may be considered to be a direct ByteBuffer which is used in JNI operations. The JNI operations will fetch the address of the buffer. At this point, this address is only a "pointer" that is not considered as a reference to the byte buffer. The address may be used asynchronously, later in time. Thus, the buffer has to be prevented from being garbage collected until the JNI operation is finished.)

To achieve this, I implemented a method that is basically equivalent to this:

private static void keepReference(final Object object)
{
    Runnable runnable = new Runnable()
    {
        @SuppressWarnings("unused")
        private Object localObject = object;

        public void run()
        {
            // Do something that does NOT involve the "localObject" ...
            waitUntilCertainCondition();

            // When this is done, the localObject may be garbage collected
        }
    };
    someExecutor.execute(runnable);
}

The idea is to create a Runnable instance that has the required object as a field, throw this runnable into an executor, and let the runnable wait until the condition is met. The executor will keep a reference to the runnable instance until it is finshed. The runnable is supposed to keep a reference to the required object. So only after the condition is met, the runnable will be released by the executor, and thus, the local object will become eligible for garbage collection.

The localObject field is not used in the body of the run() method. May the compiler (or more precisely: the runtime) detect this, and decide to remove this unused reference, and thus allow the object to be garbage collected too early?

(I considered workarounds for this. For example, using the object in a "dummy statement" like logger.log(FINEST, localObject);. But even then, one could not be sure that a "smart" optimizer wouldn't do some inlining and still detect that the object is not really used)


Update: As pointed out in the comments: Whether this can work at all might depend on the exact Executor implementation (although I'd have to analyze this more carefully). In the given case, the executor will be a ThreadPoolExecutor.

This may be one step towards the answer:

The ThreadPoolExecutor has an afterExecute method. One could override this method and then use a sledgehammer of reflection to dive into the Runnable instance that is given there as an argument. Now, one could simply use reflection hacks to walk to this reference, and use runnable.getClass().getDeclaredFields() to fetch the fields (namely, the localObject field), and then fetch the value of this field. And I think that it should not be allowed to observe a value there that is different from the one that it originally had.

Another comment pointed out that the default implementation of afterExecute is empty, but I'm not sure whether this fact can affect the question of whether the field may be removed or not.

Right now, I strongly assume that the field may not be removed. But some definite reference (or at least more convincing arguments) would be nice.


Update 2: Based on the comments and the answer by Holger, I think that not the removal of "the field itself" may be a problem, but rather the GC of the surrounding Runnable instance. So right now, I assume that one could try something like this:

private static long dummyCounter = 0;
private static Executor executor = new ThreadPoolExecutor(...) {
    @Override
    public void afterExecute(Runnable r, Throwable t) {
        if (r != null) dummyCounter++;
        if (dummyCounter == Long.MAX_VALUE) {
            System.out.println("This will never happen", r);
        }
    }
}

to make sure that the localObject in the runnable really lives as long as it should. But I can hardly remember ever having been forced to write something that screamed "crude hack" as loud as these few lines of code...

Community
  • 1
  • 1
Marco13
  • 53,703
  • 9
  • 80
  • 159
  • 3
    If the JNI code uses that buffer, it should keep a reference to that buffer. Everything else is a kludge. – Holger Dec 14 '16 at 16:16
  • Besides, “the runnable will be released by the executor” makes no sense. Executors don’t lock runnables. If you think of references, the executor doesn’t need to hold a reference to the `Runnable`, after the execution of the `run()` method has started. – Holger Dec 14 '16 at 16:19
  • @Holger Keeping a reference to tbe buffer was the goal of all this. The question is whether the sketched `keepReference` method is sufficient to achieve this. Regarding the second comment: I'd have to think twice about whether this is true or not, but here one can assume that it is a `ThreadPoolExecutor` which in any case will have to keep some sort of reference until its `afterExecute` method is called. However, this may be one step towards the answer - I'll add an EDIT – Marco13 Dec 14 '16 at 16:31
  • The default `afterExecute` implementation is a no-op, so that doesn’t guaranty a longer lifetime of that the `Runnable`. – Holger Dec 14 '16 at 16:34
  • @Holger I added the update. I'm not sure whether the fact that `afterExecute` is empty may affect the behavior of the JVM in this regard. But maybe someone can point out some other evidence or convincing arguments. – Marco13 Dec 14 '16 at 16:44
  • Storing an object's reference value in a variable does not prevent the object from being garbage collected. The JVM looks for accesses in potential continuing computation from all threads. If the JVM can detect that the object referenced by `localObject` won't be accessed further in `run` (and assuming the caller of `keepReference` doesn't use it either), then it will be considered eligible for garbage collection. – Sotirios Delimanolis Dec 14 '16 at 16:54
  • @SotiriosDelimanolis Does this even apply for the obscure case where the reference is accessed using reflection hacks in an overridden method in an "unrelated" class (i.e. `afterExecute` of the executor) ? This does not involve a "hard" reference (only indirectly via the reflection `Field`), and I can hardly imagine that the runtime can actually detect whether "something like this" will happen "somewhere". Can you give any further references supporting this statement? – Marco13 Dec 14 '16 at 17:10
  • @Sotirios Delimanolis: it’s not so easy. For heap variables like `localObject`, there are restrictions, i.e. it must not be collected before the `Runnable` instance itself. But even the runnable itself can be collected if not touched subsequently, so in that case, both might be collected together. – Holger Dec 14 '16 at 17:14
  • @Holger I see, I had not made that distinction. – Sotirios Delimanolis Dec 14 '16 at 17:39
  • 2
    I'm more familiar with .NET than Java - but don't some Java GCs implement compaction? So even if the object reference is *live* and prevented from being collected, isn't there still a risk that it could be *moved*? (And thus invalidate the pointer) – Damien_The_Unbeliever Dec 15 '16 at 07:06
  • @Hulk - but in this context, they're talking about something else having what I perceived as a pointer/untracked reference to the object (if it had some kind of tracked reference that the GC was aware of, there wouldn't be a question here, would there?) – Damien_The_Unbeliever Dec 15 '16 at 09:09
  • 2
    @Damien_The_Unbeliever: this question is about a *direct* `ByteBuffer`, which encapsulates a region of non-movable memory, which is especially designed to be accessed by `native` code, usually I/O operations. For ordinary objects, you are right, depending on the chosen GC algorithm, there might be compaction or copying. Or even both within the same JVM, depending on the generation, to which the object belongs. – Holger Dec 15 '16 at 19:22

2 Answers2

4

If JNI code fetches the address of a direct buffer, it should be the responsibility of the JNI code itself, to hold a reference to the direct buffer object as long as the JNI code holds the pointer, e.g. using NewGlobalRef and DeleteGlobalRef.

Regarding your specific question, this is addressed directly in JLS §12.6.1. Implementing Finalization:

Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. …

Another example of this occurs if the values in an object's fields are stored in registers. … Note that this sort of optimization is only allowed if references are on the stack, not stored in the heap.

(the last sentence matters)

It is illustrated in that chapter by an example not too different to yours. To make things short, the localObject reference within the Runnable instance will keep the life time of the referenced object at least as long as the life time of the Runnable instance.

That said, the critical point here is the actual life time of the Runnable instance. It will be considered definitely alive, i.e. immune to optimizations, due to the rule specified above, if it is also referred by an object that is immune to optimizations, but even an Executor isn’t necessarily a globally visible object.

That said, method inlining is one of the simplest optimizations, after which a JVM would detect that the afterExecute of a ThreadPoolExecutor is a no-op. By the way, the Runnable passed to it is the Runnable passed to execute, but it wouldn’t be the same as passed to submit, if you use that method, as (only) in the latter case, it’s wrapped in a RunnableFuture.

Note that even the ongoing execution of the run() method does not prevent the collection of the Runnable implementation’s instance, as illustrated in “finalize() called on strongly reachable object in Java 8”.

The bottom line is that you will be walking on thin ice when you try to fight the garbage collector. As the first sentence of the cite above states: “Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable.” Whereas we all may find ourselves being thinking too naively…

As said at the beginning, you may rethink the responsibilities. It’s worth noting that when your class has a close() method which has to be invoked to release the resource after all threads have finished their work, this required explicit action is already sufficient to prevent the early collection of the resource (assuming that the method is indeed called at the right point)…

Community
  • 1
  • 1
Holger
  • 285,553
  • 42
  • 434
  • 765
  • "*responsibility of the JNI code*": This is not possible. The JNI code only passes the address to a native library (OpenCL, particularly), and *this* one is doing the async work. "*life time of the ... instance*": The executor is `static final`, so this is fine. "*execute/submit*" : I mixed this up, you're right. "*ongoing execution*": The linked thread is very interesting in this regard! The responsibilities are a bit difficult. Maybe I can do some trickery with overriding `afterExecute`, but it's thin ice, indeed.... – Marco13 Dec 14 '16 at 17:21
  • After reading the linked specification again, you may get away with adding an `@Override protected synchronized void finalize() {}` to your `Runnable` and a `synchronized(r) { }` to `afterExecute` (`r` being the runnable parameter) to enforce an ordering between `afterExecute` and the collection of the runnable. I’m not sure whether empty `synchronized` blocks can be elided, but you can insert an `if(r.localObject==null) throw new AssertionError();` But really…shouldn’t there be an operation waiting for the completion of the asynchronous native operation? – Holger Dec 14 '16 at 17:29
  • Just to be sure, `waitUntilCertainCondition()` implies that you *can* wait for the completion, so you have a JNI function which triggers the asynchronous operation (which could invoke `NewGlobalRef`) and another JNI function which will wait for the completion of the native operation (which could invoke `DeleteGlobalRef`)? It must be that way as otherwise, you wouldn’t be so confident that garbage collection after that point would be ok… – Holger Dec 14 '16 at 17:34
  • (I still have to re-read the pointer that you gave to the spec, so can't say anything about the `finalize/synchronized` points yet). (BTW: I added another "Update" in the question, sketching a crude hack that might work as well). Regarding the waiting: The address is passed to the OpenCL implementation, and this manages and schedules the operations on this. (cont'd...) – Marco13 Dec 14 '16 at 17:38
  • ... There *are* mechanisms for notifications about the state of these operations (`cl_event` objects). And they are used in the sketched `waitUntil...` method. But it would be *very* fiddly and possibly fragile to use them for the object lifetime management: Sneaking the required `New/DeleteGlobalRef` calls into the JNI layer would not be entirely trivial, but of course, something that I consider, as the last resort, if there is no other ("simple") way to keep an object alive. – Marco13 Dec 14 '16 at 17:41
  • *finalize() called on strongly reachable object in Java 8* - that's got to be a bug, let's not sugar coat it. – ZhongYu Dec 14 '16 at 17:42
  • @Zhong Yu: don’t judge by the title. It all boils down to the definition of “strong reference” and the questioner had a different idea than the JVM developers. Of course, it was *not* a strongly reachable object that had been collected. – Holger Dec 14 '16 at 17:45
  • @Marco13: well, the whole code assumes that the `waitUntil…` method is sufficient as otherwise the life time of the runnable might be too short anyway. If that `waitUntil…` is a native method and an instance method of the runnable or receives the runnable as a parameter, you wouldn’t even need additional efforts as that would keep a reference until `waitUntil…` returns. – Holger Dec 14 '16 at 17:48
  • @Holger - it doesn't matter; there is no way, e.g. a local FileInputStream can be finalized while it is being read. Any parsing of words or agressive optimization that lead to that possibility is unacceptable. – ZhongYu Dec 14 '16 at 17:49
  • While VM can, and should, aggressively GC ordinary objects, it must be careful with objects with finalize() or phantom references. The timing cannot be messed up or all bets are off. Who knows. Can we still trust Oracle's VM? – ZhongYu Dec 14 '16 at 17:52
  • From the section that you pointed to: "For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner." - **this** looks like an answer to the original question, but as we see, the problem is broader. The `waitUntil...` involves native calls, but not related to the runnable. BTW: If you're interested, some context: https://raw.githubusercontent.com/gpu/JOCL/master/src/main/java/org/jocl/CL.java (warning: large), method `scheduleReferenceRelease` – Marco13 Dec 14 '16 at 17:54
  • @Zhong Yu: it can’t happen for an ordinary `FileInputStream` as it uses proper synchronization. It was the improper finalizer of *a different class* that called `close()` on the `InputStream` explicitly. The `InputStream` can’t do anything against explicit `close` calls. In fact, the `InputStream` wasn’t finalized at this point. It’s important to understand that finalizers can be called by arbitrary threads (that was always specified that way), so proper synchronization is unavoidable. – Holger Dec 14 '16 at 17:54
  • @Marco13: this is only an example to the general statement I have cited. Usually, the code generator won’t set local variables to `null`, but rather generate code that doesn’t have the unused variables in the first place. The variables and the variable being `null` are only mental models for us programmers. By the way, your `afterExecute` work-around is unlikely to help. It’s very likely that the optimizer can prove that the `ThreadPoolExecutor` will never pass a `null` runnable to `afterExecute`, so `r!=null` can get optimized away for this code path. – Holger Dec 14 '16 at 18:03
  • I was using FileInputStream as a standalone example, unrelated to the original question. Forget it, new example -- a simple local code block, allocate a direct ByteBuffer, and read/write to it. Obviously, this ByteBuffer object must not be GC-ed while read/write is happening. VM has to (if it behaves correctly) notice that there is a side effect associated with GCing the object. -- On the other hand, a non-direct ByteBuffer, just containing a `byte[]` field, can be optimized away by VM. – ZhongYu Dec 14 '16 at 18:06
  • The workaround can also be changed to a simple `if (dummyCounter++ == Long.MAX_VALUE) throw new AssertionError("Here: "+r);`. This was only a sketch of the contortions that may be necessary. – Marco13 Dec 14 '16 at 18:10
  • @Marco13 - I believe you are correct in OP - it should prevent GC. But if there is any doubt, you can retain the object in a static pool -- no optimizer, however smart or stupid it is, can GC that! – ZhongYu Dec 14 '16 at 18:11
  • @Zhong Yu: direct buffers are very special. First, their formal declaration consists of `native` method invocations for every access, second, the JVM will handle them as intrinsic ops instead, most likely. Besides, I don’t think that it is useful to discuss that other question in comments to *this* question. If you have an opinion contradicting the answers there, why bury it in the comments *here*? – Holger Dec 14 '16 at 18:14
  • @Marco13: it might have been a sketch only, but it shows, how fragile the whole thing is. You have to pray that not the next developer, not understanding these subtil aspects, changes something. – Holger Dec 14 '16 at 18:16
  • it's not that special; the cleanup is done by PhantomReference (though with some VM specific shortcut). In any case, as long as it is strongly reachable, it cannot be deallocated. The developer must be able to reason about it through superfacial analysis of strong-reachability, because we cannot reason about deep, complex, idiosyncratic optimization steps. – ZhongYu Dec 14 '16 at 18:22
  • @Zhong Yu: maybe they use a `PhantomReference`, because a naive `finalize()` method wouldn’t work (without making every op expensively `synchronized`)? But I don’t get, what you are trying to reason here. Apparently, `InputStream`s and `ByteBuffer`s work correctly, as long as the term *using them* matches what even the naivest programmer would assume. Here, the question is about the behavior about an *actually unused* object which is actually *accessed by native code concurrently*. And, as said, you should discuss you opinion about the other question’s issue that on the other question instead. – Holger Dec 14 '16 at 18:32
  • well, developers cannot analyze what is *actually unused*; they can analyze strong-reachability, and that is what VM should honor. BTW, I don't think `finalize()` requires synchronization. JMM ensures that there is happens-before from constructor to finalize(), which is enough for the purpose of finalize() accessing resources allocated in constructor. – ZhongYu Dec 14 '16 at 18:38
  • The `DirectByteBuffer` internally uses some rather complex magic around the `sun.misc.Cleaner` (extending `PhantomReference`) class to actually deallocate the native memory, but the relation to this question isn't entirely clear for me either. In any case, I consider my current solution as flawed, and try to find a solution that works more reliably (or at least, one of which one could argue that it *should* work more reliably - even if it looks horribly hackish...) – Marco13 Dec 14 '16 at 18:44
  • @Zhong Yu: you nailed it. There is a happens-before between *constructor* and `finalize()`, but not between any other action after the constructor and the `finalize()`. Unless you make the other actions *and* the `finalize()` method `synchronized` (or use any other synchronization technique). But as Marco13 correctly points out, the relation to this question is hard to see, as said multiple times, you are discussing at the wrong place. But maybe [this documentation](http://download.java.net/java/jdk9/docs/api/java/lang/ref/Reference.html#reachabilityFence-java.lang.Object-) helps you… – Holger Dec 14 '16 at 18:56
4

Execution of Runnable in a thread pool is not enough to keep an object from being garbage collected. Even "this" can be collected! See JDK-8055183.

The following example shows that keepReference does not really keep it. Though the problem does not happen with vanilla JDK (because the compiler is not smart enough), it can be reproduced when a call to ThreadPoolExecutor.afterExecute is commented out. It is absolutely possible optimization, because afterExecute is no-op in the default ThreadPoolExecutor implementation.

import java.lang.ref.WeakReference;
import java.util.concurrent.*;

public class StrangeGC {
    private static final ExecutorService someExecutor =
        Executors.newSingleThreadExecutor();

    private static void keepReference(final Object object) {
        Runnable runnable = new Runnable() {
            @SuppressWarnings("unused")
            private Object localObject = object;

            public void run() {
                WeakReference<?> ref = new WeakReference<>(object);
                if (ThreadLocalRandom.current().nextInt(1024) == 0) {
                    System.gc();
                }
                if (ref.get() == null) {
                    System.out.println("Object is garbage collected");
                    System.exit(0);
                }
            }
        };
        someExecutor.execute(runnable);
    }

    public static void main(String[] args) throws Exception {
        while (true) {
            keepReference(new Object());
        }
    }
}

Your hack with overriding afterExecute will work though.
You've basically invented a kind of Reachability Fence, see JDK-8133348.

The problem you've faced is known. It will be addressed in Java 9 as a part of JEP 193. There will be a standard API to explicitly mark objects as reachable: Reference.reachabilityFence(obj).

Update

Javadoc comments to Reference.reachabilityFence suggest synchronized block as an alternative construction to ensure reachability.

apangin
  • 92,924
  • 10
  • 193
  • 247
  • Thanks for the example and particularly the pointers. The `reachabilityFence` seems to be exactly what I'm looking for. I'll have a look at how it is actually implemented, but in any case, it's good to know that although the solution that I'll have to implement now (likely based on the overridden `afterExecute`) will be hacky, it may 1. work, 2. be a legitimate workaround for a known glitch and 3. be replaced with a "proper" solution in the near future. – Marco13 Dec 14 '16 at 20:28
  • @Marco13 JDK-specific implementation of [reachabilityFence](http://hg.openjdk.java.net/jdk9/jdk9/jdk/file/1edd10226c97/src/java.base/share/classes/java/lang/ref/Reference.java#l416) is an empty method with `@DontInline` annotation. However, there is an interesting comment in the code telling about `synchronized` block as an alternative to this construction. – apangin Dec 14 '16 at 21:25
  • I may have to churn this method comment a bit more, but the `synchronized` alternative sounds a bit fragile and implicit, and I think there may be cases where it is not as trivially applicable as in the `this` case that is handled there in particular. For example, when a reference is thrown around (like the `localObject` or the `runnable` in my case), then there is no proper place for such a block. – Marco13 Dec 14 '16 at 23:24
  • @Marco13 Well, in your case `waitUntilCertainCondition()` can be wrapped in `synchronized (object)` and this will work. However, I agree, this does not look great either. – apangin Dec 15 '16 at 01:14
  • Well, using `synchronized` is what I already suggested in [this comment](http://stackoverflow.com/questions/41147212/#comment69499706_41148320). I suggested it, because it was explicitly mentioned in JLS§12.6.1. Note that it also explicitly stated there: “*Note that this does not prevent synchronization elimination: synchronization only keeps an object alive if a finalizer might synchronize on it.*” That’s why my suggestion also includes adding a synchronized `finalize()` method. Note that the documentation of `reachabilityFence` also says `synchronized(this)` having to be in `finalize()`. – Holger Dec 15 '16 at 19:00