11

In a program I'm using the dynamic keyword to invoke the best matching method. However, I have found that the framework crashes with a StackOverflowException under some circumstances.

I have tried to simplify my code as much as possible while still being able to re-produce this problem.

class Program
{
    static void Main(string[] args)
    {
        var obj = new SetTree<int>();
        var dyn = (dynamic)obj;
        Program.Print(dyn); // throws StackOverflowException!!
        // Note: this works just fine for 'everything else' but my SetTree<T>
    }
    static void Print(object obj)
    {
        Console.WriteLine("object");
    }

    static void Print<TKey>(ISortedSet<TKey> obj)
    {
        Console.WriteLine("set");
    }
}

That program would normally print "set" if the newed up instance implements the ISortedSet<TKey> interface and print "object" for anything else. But, with the following declarations a StackOverflowException is thrown instead (as noted in a comment above).

interface ISortedSet<TKey> { }

sealed class SetTree<TKey> : BalancedTree<SetTreeNode<TKey>>, ISortedSet<TKey> {}

abstract class BalancedTree<TNode> 
    where TNode : TreeNode<TNode> { }

abstract class SetTreeNode<TKey> : KeyTreeNode<SetTreeNode<TKey>, TKey> { }

abstract class KeyTreeNode<TNode, TKey> : TreeNode<TNode>
    where TNode : KeyTreeNode<TNode, TKey> { }

abstract class TreeNode<TNode>
    where TNode : TreeNode<TNode> { }

Whether this is a bug or not it is very troubling that a StackOverflowException is thrown as we are unable to catch it and also pretty much unable to determine in advance whether an exception will be thrown (and thereby terminate the process!).

Can someone please explain what's going on? Is this a bug in the framework?

When debugging and switching to "Disassembly mode" I'm seeing this:

disassembly

Register dump at that location: register dump

EAX = 02B811B4 EBX = 0641EA5C ECX = 02C3B0EC EDX = 02C3A504 ESI = 02C2564C
EDI = 0641E9AC EIP = 011027B9 ESP = 0641E91C EBP = 0641E9B8 EFL = 00000202

That doesn't tell me much more than being an indicator that this indeed must be some kind of bug in the framework.

I've filed a bug report on Microsoft Connect but I'm interested in knowing what's going on here. Are my class declarations unsupported in some way?

Not knowing WHY this is happening causes me to worry about other places where we are using the dynamic keyword. Can I not trust that at all?

binki
  • 7,754
  • 5
  • 64
  • 110
Mårten Wikström
  • 11,074
  • 5
  • 47
  • 87
  • Whether or not C# supports calling an overload based on a dynamic, I don't know, but you could do it manually via RTTI. In the (object obj) overload, check the type, cast it, and pass it along to the appropriate overload. Sorry if that's not helpful. Can you separate the cast to dynamic and the call to Print into separate statements? It would help to establish if its the cast to dynamic that's causing the problem, or the attempt to find a suitable overload of Print. – Brent Mar 26 '14 at 21:05
  • 1
    Dynamic method invocation like this work just fine **normally**. It is just that it apparently sometimes causes a stack overflow. – Mårten Wikström Mar 26 '14 at 21:07
  • @Brent: Yes, I'll update the code. Same behavior. – Mårten Wikström Mar 26 '14 at 21:08
  • 1
    Assuming you're using Visual Studio, switch off the "Step into just my code" and try stepping into the call. If that doesn't let you observe the stack overflow occurring, try switching to disassembly. Even if you don't know assembler, you should be able to watch a large chain of recursion in your call stack before it crashes. If it's doing this all in not-your-code, then I'd suggest creating a bug report. – Brent Mar 26 '14 at 21:17
  • @Brent: I've added a screenshot from the disassembly view. Cannot step into anything recursive though... – Mårten Wikström Mar 26 '14 at 21:38
  • That seems very odd that you would not be able to step into it, unless the stack is already maxed out and it can't push the return address onto the stack - what's the esp register look like when you're at this line? – Brent Mar 26 '14 at 21:41
  • @Brent: I've added a register dump above. `ESP = 0641E91C` Does that tell you anything? – Mårten Wikström Mar 26 '14 at 21:47
  • That value should be fine. Try Shift+F11 according to http://stackoverflow.com/questions/9065735/why-visual-studios-debug-mode-step-into-f11-sometimes-doesnt-enter-inside-so – Brent Mar 26 '14 at 21:48
  • Shift+F11 didn't help :-( – Mårten Wikström Mar 26 '14 at 21:51
  • Hmm... Sorry I don't have VS on my work computer. If you can figure out some way to get stepping into external code to work though, you might stand a chance... When it crashes with a StackOverflowException are you still able to see the call stack? – Brent Mar 26 '14 at 21:52
  • Nope, no call stack. I don't think you ever get to see a call stack for that exception type. – Mårten Wikström Mar 26 '14 at 21:55
  • 2
    possible duplicate of [StackOverflowException when accessing member of nested class via a dynamic reference](http://stackoverflow.com/questions/19612325/stackoverflowexception-when-accessing-member-of-nested-class-via-a-dynamic-refer) – nawfal Jul 19 '14 at 21:17
  • @nawfal, I think that this question addresses the issue much better than the one you link, so lets mark *that* question as a dupe of this one ;-). – binki Apr 08 '15 at 14:55

3 Answers3

7

I created a shorter, more to-the-point SSCCE that illustrates the problem:

class Program
{
    static void Main()
    {
        dynamic obj = new Third<int>();
        Print(obj); // causes stack overflow
    }

    static void Print(object obj) { }
}

class First<T> where T : First<T> { }

class Second<T> : First<T> where T : First<T> { }

class Third<T> : Second<Third<T>> { }

Looking at the call stack, it seems to be bouncing between two pairs of symbols in the C# runtime binder:

Microsoft.CSharp.RuntimeBinder.SymbolTable.LoadSymbolsFromType(
    System.Type originalType
)

Microsoft.CSharp.RuntimeBinder.SymbolTable.GetConstructedType(
    System.Type type,
    Microsoft.CSharp.RuntimeBinder.Semantics.AggregateSymbol agg
)

and

Microsoft.CSharp.RuntimeBinder.Semantics.TypeManager.SubstTypeCore(
    Microsoft.CSharp.RuntimeBinder.Semantics.CType type, 
    Microsoft.CSharp.RuntimeBinder.Semantics.SubstContext pctx
)

Microsoft.CSharp.RuntimeBinder.Semantics.TypeManager.SubstTypeArray(
    Microsoft.CSharp.RuntimeBinder.Semantics.TypeArray taSrc,
    Microsoft.CSharp.RuntimeBinder.Semantics.SubstContext pctx
)

If I had to hazard a guess, some of the generic type constraint nesting you've got going on has managed to confuse the binder into recursively walking the types involved in the constraints along with the constraints themselves.

Go ahead and file a bug on Connect; if the compiler doesn't get caught by this, the runtime binder probably shouldn't either.


This code example runs correctly:

class Program
{
    static void Main()
    {
        dynamic obj = new Second<int>();
        Print(obj);
    }

    static void Print(object obj) { }
}

internal class First<T>
    where T : First<T> { }

internal class Second<T> : First<Second<T>> { }

This leads me to believe (without much knowledge of the internals of the runtime binder) that it's proactively checking for recursive constraints, but only one level deep. With an intermediary class in between, the binder ends up not detecting the recursion and tries to walk it instead. (But that's all just an educated guess. I'd add it to your Connect bug as additional information and see if it helps.)

Adam Maras
  • 26,269
  • 6
  • 65
  • 91
  • @MårtenWikström I just added a working example and hazarded a guess as to what's going on. Perhaps you can include that as well. – Adam Maras Mar 26 '14 at 22:14
  • Probably should have waited for Jon Skeet to wake up and answer your question before filing a bug report... :) – Dean Kuga Mar 26 '14 at 22:30
  • "This leads me to believe (without much knowledge of the internals of the runtime binder) that it's proactively checking for recursive constraints, but only one level deep." Almost the opposite. It's proactively setting the base type's details, but that will recurse on more than two generic types in a hierarchy and it's that recursive case that can't handle cyclical definitions. – Jon Hanna Nov 11 '17 at 18:15
  • Incidentally, the two pairs of methods were a bit of a red herring. When working correctly those two can both recurse for a few levels, using a relatively large amount of stack. That makes them the most likely to be at the end of the stack when the problem in `TypeManager.GetAggregate()` recursed more than it should, so they looked like the culprits though they were innocent. – Jon Hanna Nov 11 '17 at 21:24
3

The problem is that you are deriving a type from itself:

abstract class SetTreeNode<TKey> : KeyTreeNode<SetTreeNode<TKey>, TKey> { }

The type SetTreeNote<TKey> becomes KeyTreeNode<SetTreeNode<TKey>,TKey> which becomes KeyTreeNode<KeyTreeNode<SetTreeNode<TKey>,TKey>,TKey> and this goes on and on until the stack overflows.

I don't know what you are trying to accomplish by using this complex model, but that is your problem.

I managed to reduce it to this example which fails:

interface ISortedSet<TKey> { }

sealed class SetTree<TKey> : BalancedTree<SetTreeNode<TKey>>, ISortedSet<TKey> { }

abstract class BalancedTree<TNode> { }

abstract class SetTreeNode<TKey> : KeyTreeNode<SetTreeNode<TKey>, TKey> { }

abstract class KeyTreeNode<TNode, TKey> : TreeNode<TNode> { }

abstract class TreeNode<TNode> { }

And then I fixed it by doing this:

interface ISortedSet<TKey> { }

sealed class SetTree<TKey> : BalancedTree<SetTreeNode<TKey>>, ISortedSet<TKey> { }

abstract class BalancedTree<TNode> { }

abstract class SetTreeNode<TKey> : KeyTreeNode<TKey, TKey> { }

abstract class KeyTreeNode<TNode, TKey> : TreeNode<TNode> { }

abstract class TreeNode<TNode> { }

The only difference between the two is that I replaced KeyTreeNode<SetTreeNode<TKey>, TKey> with KeyTreeNode<TKey, TKey>

Ove
  • 6,227
  • 2
  • 39
  • 68
  • `abstract class SetTreeNode : KeyTreeNode, TKey> { }`: +1, that was what I thought. – alex.b Mar 26 '14 at 22:05
  • 2
    So maybe there's a bug in that the compiler actually compiles it instead of disallowing it? – Charlie Mar 26 '14 at 22:07
  • 2
    If that's the case, then why does the code not only compile, but allow the type to be instantiated? – Adam Maras Mar 26 '14 at 22:09
  • 1
    @Dan-o but... it wasn't the runtime that blew up, it was the C# binder. That's proved by the stack trace. Were it the runtime itself that had an issue instantiating an object of that type, 1) the stack trace would show methods from `mscorlib` or native code inside the CLR and 2) it would have most likely ended with a `TypeLoadException`. – Adam Maras Mar 26 '14 at 22:26
  • Thank you. I've withdrawn my comment. – Sam Axe Mar 27 '14 at 01:19
1

Can someone please explain what's going on? Is this a bug in the framework?

Yes.

The issue is in the way that generic types are resolved into their particular concrete uses.

Okay, let's start with some obvious stuff so as to build up to what the compiler is getting wrong. As you know, with something like List<int> the compilers (whether it's the dynamic compiler, or any of the static compilers since C#2 introduced generics) have to take the List<> type and the int type and combine information about both of those to produce the List<int> type.

Now, consider:

public class Base<T, U>
{

}

public class Derived<T> : Base<T, int>
{

}

Derived<long> l = new Derived<long>();

Here you can see that in the same work on the types Derived<T> and long the compiler has to fill three slots:

  1. The T defined on Derived<>, which gets filled with long.
  2. The T defined on Base<,> which gets filled with the T defined on Derived<>, which was filled with long.
  3. The U defined on Base<,> which gets filled with int.

When you consider nested classes, long inheritance chains, generic types derived from other generic types and adding further generic parameters, and so on, you can see that there's a good few different permutations to cover. If you're starting with Derived<long> and have to answer the question "what is the base type of the class?" (which obviously compilers need to consider a lot) then all of this has to be worked out.

The dynamic compiler was based on the pre-Roslyn static compiler, which was based on the compiler before that which was actually written in C++ rather than C# (there is still quite a lot of the dynamic compiler that while it's in C#, smells of C++). The can be considered as being more similar in end point (code that can be executed) than in start point; a bunch of text that has to be parsed for the static compiler to understand what types and operations are involved vs the dynamic compiler starting with already-existing types and operations represented by objects and flags.

One thing they both need to know that is that if a type is mentioned several times, that it's the same type (that's pretty much the most basic definition of what a type means, after all). If we compile new List<int>((int)x) that would obviously not work unless it knew int meant the same thing both times. They also need to avoid chewing up gigs of RAM.

Both problems are solved by a hash-consing or flyweight-like approach. When it goes to construct the object representing a particular type, it first sees if it has already constructed that type, and only constructs a new one if necessary. This also helps a lot of relationships within hierarchies be correctly constructed, though clearly not the particular case in your question.

For most types (all except for a few special cases like pointers, references, arrays, nullables [though there's an exception to that exception], type parameters… okay, there's actually quite a few exceptions) the state is mostly three things:

  1. A symbol representing the type without specific type-parameters (which is the totality of the representation for non-generic types) but which does include type parameters of the generic definition (for Dictionary<int, int> it has the TKey and TValue of Dictionary<TKey, TValue>).
  2. The set of the types that are parameters directly on the type (whether the T of List<T> for the open type, the int of List<int> for the constructed type, or a mix for e.g. Dictionary<T, int> relative to some generic type or method that defines T).
  3. The set of the type parameters that are either directly on the type (as above) or on an outer type it is nested in.

Okay, so far, so good. If it needs to do something with List<int>.Enumerator it first either finds the List<T> symbol in the store, or adds it if new, then finds List<T>.Enumerator symbol in the store, or adds it if new, then finds int in the store (int is preloaded as a very common type) and finally finds the type that combines List<T>.Enumerator with int in the store, or adds it if new. We now have the only List<int>.Enumerator type object.

The problem, which caused your bug, comes in at the end of this last step. Consider what we said above about having to assign types to base types when creating a concrete implementation of a type. The base type of a concrete generic type is a concrete type, potentially itself a concrete generic type, but the information we have here is of the generic type and some type arguments: We do not know what the concrete generic type is.

The method for finding the base type is lazy-loaded, but calls into the symbol that doesn't know the type parameters to use.

The solution used was to temporarily define that symbol's base type in terms of the concrete base type, call the lazy-loading base-type method, and then set it back again.

I don't know why something was lazy-loaded when it was called immediately after creation. At a guess, I'd say it was something that made more sense in terms of the static compilation, and so got ported over that way rather than rewrite the mechanism from scratch (which would be a more risky approach in most regards).

This works pretty well, even with quite complicated hierarchies. However, if there's a hierarchy that is both circular in terms of type parameters and has more than one step before a non-generic type (such as object) is reached (so the fix-up has to recurse on a base type too) then it isn't able to find the type it's in the process of making (remember the bit about storing the objects for types) because it's been temporarily changed to make the fix-up work, and has to make it again. And again, and again, until you hit a StackOverflowException.

From Adam Maras' answer:

This leads me to believe (without much knowledge of the internals of the runtime binder) that it's proactively checking for recursive constraints, but only one level deep.

It's almost the opposite, in that the problem is proactively setting the base class in such a way as to prevent it realising it already has a type it needs. I think I managed to fix it today though it remains to be seen if someone sees some problem with that fix I missed (a good thing about contributing to the framework is they have a high standard of code reviews, but that of course means I can't be sure a contribution will be accepted until it's in).

Jon Hanna
  • 110,372
  • 10
  • 146
  • 251
  • Thank you for this well written and educating answer! – Mårten Wikström Nov 11 '17 at 18:48
  • Thanks for the question. After figuring this out (which was suddenly smooth enough despite bugging me for ages) and then following links from related items in github to see if there was any variant cases I should be considering in tests, it was quite nice to have a question asking what I'd just spent time figuring out. Now the wait to see if my fix doesn't introduce some worse flaw I hadn't noticed. (If it's a good fix, whether it gets ported from corefx to netfx to close your Microsoft Connect issue is another thing; netfx has a higher level of caution about changes than corefx). – Jon Hanna Nov 11 '17 at 19:16
  • Hurray, the fix was accepted. The next version of .NET Core will have this fixed, though whether it makes it into another version of the desktop versions I can't say. – Jon Hanna Nov 21 '17 at 09:55