Does a managed pointer to a value-type field keep its containing GC instance alive?

Question

[Author's edit:] Indeed, this appears similar to a post from 3 months ago. Both remain valuable, however, since they show different examples, and this question in particular generated an important discussion containing valuable expert information in the comments section.

With ref locals and ref returns in C# 7, it's now possible to retain a managed pointer to the interior of a reference-type instance indefinitely.^1 Since the interior reference could point to a value type, it feels like this could leave the managed pointer orphaned, in a sense, should the host itself get collected.

Here's a complete and self-contained example. Discussion continues in the comments within, and below.

class outer { public int m_i; };  // outer reference type containing a value-type field

static class _
{
    // the lone reference to a single global instance of 'outer' in the GC heap
    static outer host_inst = new outer();

    // (function here for clarity, and to prevent distraction over 'keep-alive' concerns)
    static ref int GetInteriorPointer()
    {
        ref int pi = ref host_inst.m_i;   // managed pointer to interior value-type
        host_inst = null;                 // abandon the heap object itself
        return ref pi;                    // but return the interior pointer
    }

    static _()
    {
        ref int pi = ref GetInteriorPointer();

        // At this point, the 'outer' container instance itself is not reachable, since
        // its only reference was nulled-out. Normally, when the last reference to an
        // object goes away, the object becomes eligible for garbage collection.

        // However, in this case we still have access to a managed pointer to a field
        // inside the object.
        
        // Is 'pi' sufficient to prevent the collection of host_inst here?   <-- Question

        pi++;       // (possible keep-alive... for an inaccessible GC object?)  
    }
};

The question is pretty straightforward:

If you have a managed pointer to an interior value-type field of a reference-type instance, but there are no references to the outer instance itself, is the existence of the managed pointer sufficient for keeping the host alive?

The reason one might suspect "no" is that, from IL, it seems clear that a managed pointer is IntPtr-sized, that is, a 32- or 64-bit value just like an object reference, but in this case without the benefit of any "object header" or other obvious metadata support.^2 From this, managed pointers seem too raw or "stripped-down" to participate efficiently in the GC reachability graph. So if a managed ref represents the last remaining way to access (some interior portion of) an object, but such ref pointers are not tracked in the GC graph, the object would be subject to collection.

So if the answer is "yes," how does this work? Are there any situations where the CLR has to recover a containing object given only a managed pointer, and if so, how would this be done and is it efficient?
If the answer is "no" and the host does happen to get collected, what is the fate of the managed pointer? Does it then point to garbage and corrupt the CLR upon use, or does it somehow handle failure more gracefully?

Note that the fix would be simple but does affect overall design: it just becomes the responsibility of the developer to make sure the program independently keeps a full-blown reference to the outer instance somewhere else, and to ensure that this reference outlives every use of the managed pointer.
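As a minimal sketch of that defensive pattern, the code below keeps an ordinary reference to the host in scope for as long as the managed pointer is used. The names `Outer`, `Value`, `KeepAliveSketch` and `IncrementViaRef` are hypothetical and chosen only for illustration. `GC.KeepAlive` is the standard way to tell the runtime that a reference must be treated as live up to that call. Whether this precaution is actually necessary is exactly what the question asks.

```csharp
using System;

// Hypothetical host type: a reference type containing a value-type field.
class Outer { public int Value; }

static class KeepAliveSketch
{
    // Increments the field through a managed pointer while an ordinary
    // reference to the host is still held by the caller and by this method.
    public static int IncrementViaRef(Outer host)
    {
        ref int pi = ref host.Value;   // managed pointer to the interior field
        pi++;                          // use the pointer
        GC.KeepAlive(host);            // host is treated as reachable up to this call
        return host.Value;
    }

    static void Main()
    {
        var host = new Outer();
        Console.WriteLine(IncrementViaRef(host));   // prints 1
    }
}
```

The design cost is visible even in this small sketch: the host reference has to travel alongside the `ref int`, which defeats much of the point of handing out a bare interior pointer.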

[1.] "Indefinitely" meaning, at least as far as the code that originally issues the managed pointer is concerned, since the client can use or abandon the pointer at will, without ever having to report back.

[2.] Lacking metadata, since the IL instructions which manipulate managed pointers require a TypeRef to be supplied as "burned-in" to the instruction stream.

Related: Recover containing GC object from managed 'ref' interior pointer
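One way to probe the question empirically, rather than by applying memory pressure in a loop, is to watch the host through a `WeakReference` while only the interior pointer remains. The following is a rough sketch under stated assumptions: the names `Outer`, `Value`, `InteriorPointerProbe` and `HostSurvivesCollection` are hypothetical, and a single forced collection is evidence about one runtime and build configuration, not a proof.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical host type: a reference type containing a value-type field.
class Outer { public int Value; }

static class InteriorPointerProbe
{
    static Outer hostInst;       // the lone strong reference to the host
    static WeakReference weak;   // observes the host without keeping it alive

    // NoInlining so that no stray copy of the host reference lingers in the
    // caller's frame after this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static ref int GetInteriorPointer()
    {
        hostInst = new Outer();
        weak = new WeakReference(hostInst);
        ref int pi = ref hostInst.Value;   // managed pointer to the interior field
        hostInst = null;                   // abandon the strong reference
        return ref pi;                     // only the interior pointer escapes
    }

    // Reports whether the host was still alive after a forced collection
    // performed while only the interior pointer was in use.
    public static bool HostSurvivesCollection()
    {
        ref int pi = ref GetInteriorPointer();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        bool alive = weak.IsAlive;
        pi++;   // later use, so the pointer is live across the collections
        return alive;
    }

    static void Main()
    {
        Console.WriteLine(HostSurvivesCollection());
    }
}
```

If the managed pointer is tracked by the collector, as the comments below assert, the probe should report that the host survived.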

@ThomasWeller Is that an `SOS` command? I haven't been able to get `SOS` working in the VS2017 debugger (if that's even possible) and I don't really have any experience with the kernel debugger. It's a great idea--I thought I'd have to batter looped iterations with "memory pressure" and see if it fails, and dread that it's basically inconclusive if it doesn't. On the other hand, the example code is complete and works if you have a chance to drop it into a console app yourself. — Glenn Slayden, Jan 16 '18 at 17:59
Yes, that's a command from SOS. You can use it in WinDbg. I'll check if I can try. — Thomas Weller, Jan 16 '18 at 18:33
The question is full of misunderstandings. "It is now possible" -- no, it was always possible in IL; the feature is now exposed from C#, but you could have written a program in IL that takes an interior pointer at any time. "indefinitely" -- what does "indefinitely" mean? You cannot put a managed interior pointer into a field or array, so you can only retain one *during the activation of a method*, which is the definition of "short lived" in .NET. — Eric Lippert, Jan 16 '18 at 18:41
The answer to the specific question is: of course having a living interior pointer to the inside of an object keeps the object alive. How does it work? Extremely well. How is it made efficient? The GC guys are pretty smart and know how to design a data structure that can track interior pointers. How exactly it is done is an implementation detail. — Eric Lippert, Jan 16 '18 at 18:42
I'll point out also that even since C# 1.0 the compiler took advantage of the ability to take an interior pointer; it just didn't expose that capability directly to developers. Multi-dimensional array element assignment uses interior pointers for efficiency. — Eric Lippert, Jan 16 '18 at 18:47
And finally, you could ever since C# 1.0 do the *moral* equivalent of making a managed pointer via this technique: https://stackoverflow.com/questions/4764573/why-is-typedreference-behind-the-scenes-its-so-fast-and-safe-almost-magical -- which we did not document because it is horrid and useful only to a small number of interop library scenarios. — Eric Lippert, Jan 16 '18 at 18:50
@EricLippert I explained what I meant by 'indefinitely' in the footnote which I guess you didn't see. And there could be an infinite loop in the "method activation" (or its entire sub-graph of callees, which can be expansive by the way, since we're nit-picking). As for the clause "it is now possible," it clearly refers to C# in the English I speak, so my statement does not incorrectly exclude `IL`. Finally, I didn't imply anything about GC developers. Is curiosity a crime? *Not* "full of misunderstandings", and sorry, snark not appreciated. — Glenn Slayden, Jan 16 '18 at 19:04
My point is that both believing that this is in any way a fundamentally new feature, and believing that the storage for managed pointers can ever have *indefinite lifetime* are misunderstandings that will prevent you from correctly understanding the feature and its implications. If you already correctly understood the feature and its implications you wouldn't need to ask the question. There's nothing at all wrong with curiosity, and I encourage it. — Eric Lippert, Jan 16 '18 at 19:13
The idea that the lack of a header means there is insufficient information to efficiently determine what the top of an allocation is given a pointer to the middle of an allocation is also a fundamental misunderstanding; that's an easy problem to solve efficiently. — Eric Lippert, Jan 16 '18 at 19:14
In short: the purpose of managed memory is to *manage memory*. Managed pointers are *correctly managed* by the managed memory manager; *that's what makes it managed*. The situations where you cannot rely on the memory manager are... wait for it... *interoperability with unmanaged resources*. That's why they're called *unmanaged*. Again: managed pointers are called managed pointers because they are correctly managed by the managed pointer memory manager. You can trust it. — Eric Lippert, Jan 16 '18 at 19:17
@EricLippert Your remark mentioning "...middle of an allocation..." exquisitely summarized the precise nub of my curiosity. Of course there are solutions to the specific problem, some of them "easy," some insanely clever, and most, I would guess, of well-worn stock, such that, for the higher-end of this audience at least, a few sentences of synopsis on choice points would probably sufficiently convey the nature of the implemented design vs. known alternatives. It's a hopeful plea, since I always reject the "implementation detail" brush-off, and am way too busy to enjoy a dig into the source. — Glenn Slayden, Jan 16 '18 at 20:02
Well I don't know what the implementation details are, but let's think about what we might do if given the task of identifying the header of a managed object given an interior pointer. First, we know that *managed allocations are confined to compacted arenas*: one per generation, plus the Large Object Heap. So given any interior pointer, we can *immediately* determine what generation it is from and whether the allocation was large or small. Now that we have this information, we can decide what algorithm to use. — Eric Lippert, Jan 16 '18 at 20:15
For example, we might realize that if there's an interior pointer on the stack then there is probably a pointer to the object header somewhere nearby. So we could start by making a list of all the living roots in the current activation frame. That's probably less than a dozen objects, so we could either sort them and binary search, or do a quick linear search to see which of those headers is the closest pointer smaller than the interior pointer, and then check to see if the interior pointer is inside it. If it is, then we can ignore it since we're already going to make the header a root. — Eric Lippert, Jan 16 '18 at 20:19
Suppose then that it isn't; we could maintain a sorted list of valid allocation addresses in the compaction phase of the collection, and then binary-search that list to find the nearest smaller object header, and then keep that object alive for this collection. — Eric Lippert, Jan 16 '18 at 20:21
Or, we could put a series of "mile markers" in the managed heap, scan backwards until we find a candidate mile marker, and then hang some data structure off the mile marker that gives the locations of all the living objects within that region of the heap. This puts a nice bound on the amount of time spent searching without taking up too much extra memory. — Eric Lippert, Jan 16 '18 at 20:23
I don't know what techniques the GC team actually uses; my point is that there are a lot of possibilities and they are all cheap compared to the existing overhead of a compacting GC, which is already moving around possibly huge amounts of memory on each collection. A few extra bytes here and there for auxiliary data structures is nothing. — Eric Lippert, Jan 16 '18 at 20:24
@EricLippert Well [here](https://raw.githubusercontent.com/dotnet/coreclr/master/src/gc/gc.cpp) is the code. `gc_heap::find_object` calls `gc_heap::find_first_object` and on first glance it sure looks like a lot of scanning to me. [This comment](https://stackoverflow.com/questions/47419240#comment81793007_47419240) is also informative. — Glenn Slayden, Jan 16 '18 at 20:30
@EricLippert To be clear, I quite surely *don't* misunderstand the transience constraints on managed pointers, in fact so much so that my mischievous turn "indefinite lifetime"--while partly in jest--is pointedly so. Per obligation after deploying an (apparently effective) attention-getter like that, I was then responsible to (and did duly) note how it instructively highlights the situation from the perspective of the issuer, the belabored point here being that the origin *can't know any differently* since there's no mechanism for it to track the interior pointer's end-of-life. — Glenn Slayden, Jan 16 '18 at 21:17
@EricLippert But your lasting point, for which I thank you, is basically that... It works; it's robust, mainstream, and not an edge-case.... i.e. You can trust it. — Glenn Slayden, Jan 16 '18 at 21:21
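The "sorted list of allocation addresses" idea floated in the comments above can be sketched in a few lines. This is only an illustration of that thought experiment, not the CLR's algorithm (the linked `gc_heap::find_object` is the real code). The names `FindObjectSketch` and `FindContaining` are hypothetical, and the heap is modelled as two parallel arrays of made-up start addresses and sizes.

```csharp
using System;

static class FindObjectSketch
{
    // Returns the index of the allocation that contains 'address', or -1 if
    // the address falls in a gap. 'starts' must be sorted ascending, and
    // sizes[i] is the length in bytes of the allocation beginning at starts[i].
    public static int FindContaining(long[] starts, long[] sizes, long address)
    {
        int lo = 0, hi = starts.Length - 1, candidate = -1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (starts[mid] <= address)
            {
                candidate = mid;   // nearest start at or below the address so far
                lo = mid + 1;
            }
            else
            {
                hi = mid - 1;
            }
        }

        // The nearest lower start is only a hit if the address lies inside it.
        if (candidate >= 0 && address < starts[candidate] + sizes[candidate])
            return candidate;
        return -1;
    }

    static void Main()
    {
        long[] starts = { 1000, 1024, 1100 };   // made-up object start addresses
        long[] sizes  = {   24,   48,   16 };   // made-up object sizes in bytes

        Console.WriteLine(FindContaining(starts, sizes, 1040));   // prints 1
        Console.WriteLine(FindContaining(starts, sizes, 1090));   // prints -1
    }
}
```

The lookup is logarithmic in the number of live allocations, which supports the point made in the comments that recovering an object from a pointer into its middle need not be expensive relative to the rest of a compacting collection.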
