Let's say we have the following sample code in C#:

using System;

class BaseClass
{
  public virtual void HelloWorld()
  {
    Console.WriteLine("Hello Tarik");
  }
}

class DerivedClass : BaseClass
{
  // Overrides the virtual method but only forwards to the base implementation.
  public override void HelloWorld()
  {
    base.HelloWorld();
  }
}

class Program
{
  static void Main(string[] args)
  {
    DerivedClass derived = new DerivedClass();
    derived.HelloWorld();
  }
}

When I disassembled the compiled code with ildasm, I got the following IL for Main:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       15 (0xf)
  .maxstack  1
  .locals init ([0] class EnumReflection.DerivedClass derived)
  IL_0000:  nop
  IL_0001:  newobj     instance void EnumReflection.DerivedClass::.ctor()
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  callvirt   instance void EnumReflection.BaseClass::HelloWorld()
  IL_000d:  nop
  IL_000e:  ret
} // end of method Program::Main

csc.exe compiled derived.HelloWorld(); to callvirt instance void EnumReflection.BaseClass::HelloWorld(). Why is that? I didn't mention BaseClass anywhere in the Main method.

Also, if it is calling BaseClass::HelloWorld(), I would expect call instead of callvirt, since it looks like a direct call to the BaseClass::HelloWorld() method.

Tarik
  • Just throwing this idea out there, but I don't know -- this could be a compiler optimization as your method is simply calling the base implementation. – payo Apr 18 '12 at 22:59
  • @payo no; even if the override had some unique logic, the callvirt would be the same. – phoog Apr 18 '12 at 23:09
  • @phoog this is not the same for all languages -- I can see how this is the case for C#, though – payo Apr 18 '12 at 23:22
  • @payo of course it is not the same for all languages. The question was about C# so I am answering it in that context. – phoog Apr 18 '12 at 23:24
  • @phoog Just pointing out that this is not a general polymorphic rule, as some readers may later think. I just feel your answer should explicitly point out this is C# specific. For example, C++ objects each get their vtable initialized with a pointer (if MSFT) or jump (if gcc) for each virtual method - you would not call from the base at all. – payo Apr 18 '12 at 23:27
  • @payo I added "in C#" so the answer now reads "the way virtual dispatch works in C# ...." Thank you for pointing out the ambiguity. – phoog Apr 18 '12 at 23:31
  • @phoog well played :) I upvoted your answer for that. – payo Apr 18 '12 at 23:34

3 Answers

The call goes to BaseClass::HelloWorld because BaseClass is the class that defines the method. The way virtual dispatch works in C# is that the method is called on the base class, and the virtual dispatch system is responsible for ensuring that the most-derived override of the method gets called.
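As a minimal sketch of that behavior: the classes below mirror the ones in the question, except that HelloWorld returns a string (an assumption made here so the result can be inspected) and the DispatchDemo helper is hypothetical. The call in Describe is compiled against BaseClass.HelloWorld, yet the override runs when the instance is a DerivedClass.

```csharp
using System;

class BaseClass
{
    public virtual string HelloWorld() => "Hello from BaseClass";
}

class DerivedClass : BaseClass
{
    public override string HelloWorld() => "Hello from DerivedClass";
}

static class DispatchDemo
{
    // The call site names BaseClass.HelloWorld; the runtime type of
    // 'instance' decides which implementation actually executes.
    public static string Describe(BaseClass instance) => instance.HelloWorld();

    static void Main()
    {
        Console.WriteLine(Describe(new BaseClass()));    // Hello from BaseClass
        Console.WriteLine(Describe(new DerivedClass())); // Hello from DerivedClass
    }
}
```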

This answer of Eric Lippert's is very informative: https://stackoverflow.com/a/5308369/385844

As is his blog series on the topic: http://blogs.msdn.com/b/ericlippert/archive/tags/virtual+dispatch/

Do you have any idea why this is implemented this way? What would happen if it was calling the derived class's ToString method directly? This didn't make much sense to me at first glance...

It's implemented this way because the compiler does not track the runtime type of objects, just the compile-time type of their references. With the code you posted, it's easy to see that the call will go to the DerivedClass implementation of the method. But suppose the derived variable was initialized like this:

Derived derived = GetDerived();

It's possible that GetDerived() returns an instance of StillMoreDerived. If StillMoreDerived (or any class between Derived and StillMoreDerived in the inheritance chain) overrides the method, then it would be incorrect to call the Derived implementation of the method.
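A short sketch of that scenario follows. The names Derived, StillMoreDerived and GetDerived come from the text above; the Base class, the string return type and the FactoryDemo wrapper are assumptions added only to make the example self-contained.

```csharp
using System;

class Base
{
    public virtual string HelloWorld() => "Base";
}

class Derived : Base
{
    public override string HelloWorld() => "Derived";
}

class StillMoreDerived : Derived
{
    public override string HelloWorld() => "StillMoreDerived";
}

static class FactoryDemo
{
    // Declared to return Derived, but free to return any subclass of it.
    public static Derived GetDerived() => new StillMoreDerived();

    static void Main()
    {
        // The compiler only knows the compile-time type, Derived.
        Derived derived = GetDerived();
        Console.WriteLine(derived.HelloWorld()); // StillMoreDerived
    }
}
```

A direct, non-virtual call to Derived's implementation at this call site would skip the StillMoreDerived override, which is the incorrect behavior described above.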

To find all possible values a variable could hold through static analysis is to solve the halting problem. With a .NET assembly, the problem is even worse, because an assembly might not be a complete program. So, the number of cases where the compiler could reasonably prove that derived doesn't hold a reference to a more-derived object (or a null reference) would be small.

How much would it cost to add this logic so it can issue a call rather than callvirt instruction? No doubt, the cost would be far higher than the small benefit derived.

phoog
  • Does that mean, in other words, that what he's really seeing is polymorphism working? The IL is handled in this way so that at runtime the proper overridden method is called? – Brad Rem Apr 18 '12 at 23:16
  • @BradRem that's exactly what it means. – phoog Apr 18 '12 at 23:18
  • @BradRem that's exactly what it means in terms of how C# handles polymorphism (as I have now learned as well). – payo Apr 18 '12 at 23:28
  • Do you have any idea why this is implemented this way? What would happen if it was calling the derived class's ToString method directly? This didn't make much sense to me at first glance... – Tarik Apr 19 '12 at 01:48

The way to think about this is that virtual methods define a "slot" that you can put a method into at runtime. When we emit a callvirt instruction we are saying "at runtime, look to see what is in this slot and invoke it".

The slot is identified by the method information about the type that declared the virtual method, not the type that overrides it.

It would be perfectly legal to emit a callvirt to the derived method; the runtime would realize that the derived method is the same slot as the base method and the result would be exactly the same. But there is never any reason to do that. It is clearer if we identify the slot by identifying the type that declares that slot.
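One way to observe the slot idea from managed code is reflection. The sketch below assumes the class shapes from the question (with empty method bodies) and a hypothetical SlotDemo helper; MethodInfo.GetBaseDefinition returns the method on the type that first declared the virtual method, which is the type the callvirt instruction names.

```csharp
using System;
using System.Reflection;

class BaseClass
{
    public virtual void HelloWorld() { }
}

class DerivedClass : BaseClass
{
    public override void HelloWorld() { }
}

static class SlotDemo
{
    // Returns the type that first declared the virtual HelloWorld method.
    public static Type SlotOwner(Type type)
    {
        MethodInfo method = type.GetMethod("HelloWorld");
        return method.GetBaseDefinition().DeclaringType;
    }

    static void Main()
    {
        MethodInfo overriding = typeof(DerivedClass).GetMethod("HelloWorld");
        Console.WriteLine(overriding.DeclaringType.Name);          // DerivedClass
        Console.WriteLine(SlotOwner(typeof(DerivedClass)).Name);   // BaseClass
    }
}
```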

Eric Lippert

Note that this happens even if you declare DerivedClass as sealed.

C# generally uses the callvirt instruction to call instance methods (virtual or not) so that it automatically gets a null check on the object reference: a NullReferenceException is raised at the point where the method is called. (A base.HelloWorld() call is one exception; it compiles to call.) Otherwise, the NullReferenceException would only be raised at the first actual use of an instance member inside the method, which can be surprising. If no instance member is used, the method could complete successfully without ever raising the exception.
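A small sketch of the null-check effect, using a hypothetical Greeter class whose non-virtual method never touches an instance member. Assuming the compiler emits callvirt for the call, as described above, the exception is raised at the call site rather than inside the method.

```csharp
using System;

class Greeter
{
    // Non-virtual, and never uses 'this'.
    public string Greet() => "Hello";
}

static class NullCheckDemo
{
    // Returns true if calling Greet on the given reference throws.
    public static bool CallThrows(Greeter greeter)
    {
        try
        {
            greeter.Greet();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(CallThrows(new Greeter())); // False
        Console.WriteLine(CallThrows(null));          // True
    }
}
```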

You should also remember that IL is not executed directly. It is first compiled to native instructions by the JIT compiler, which performs a number of optimizations depending on whether you're debugging the process. I found that the x86 JIT for CLR 2.0 inlined a non-virtual method but called the virtual method; it also inlined Console.WriteLine!

Mike Dimmick