I understand what a virtual function is, but I don't understand how virtual functions work internally.

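using System;

class Animal
{
    // virtual members cannot be private, so Eat is public
    public virtual string Eat()
    {
        return @"Eat undefined";
    }
}

class Human : Animal
{
    // override replaces Animal's implementation of Eat
    public override string Eat()
    {
        return @"Eat like a Human";
    }
}

class Dog : Animal
{
    // new hides Animal.Eat instead of overriding it
    public new string Eat()
    {
        return @"Eat like a Dog";
    }
}

class Program
{
    static void Main()
    {
        Animal _animal = new Human();
        Console.WriteLine(_animal.Eat());
        _animal = new Dog();
        Console.WriteLine(_animal.Eat());
    }
}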

Output for the above gives:

Eat like a Human
Eat undefined

In the above code, _animal is of type Animal and references either a Human object or a Dog object. What does this mean? I understand that in memory _animal contains an address that points to the Human or Dog object. How does it decide which function to invoke? In the first case I use override, and hence the child's implementation is called, but in the second case I use new, and hence the parent's implementation is called. Can you please explain to me what happens under the hood?

Thanks in advance Nick

Nishant
  • Do you know Eric Lippert is writing a blog series on this subject? Please refer to http://blogs.msdn.com/b/ericlippert/archive/2011/03/17/implementing-the-virtual-method-pattern-in-c-part-one.aspx – Learner Mar 23 '11 at 08:56
  • Thanks Learner. I am following it :) – Nishant Apr 05 '11 at 03:03

3 Answers

17

It works like this. Imagine the compiler rewrote your classes into this:

class VTable
{
    public VTable(Func<Animal, string> eat)
    {
        this.AnimalEat = eat;
    }
    public readonly Func<Animal, string> AnimalEat;
}

class Animal
{
    // Shared by every plain Animal, and by Dog, which does not override Eat.
    // Protected rather than private so that Dog.CreateDog can reach it.
    protected static readonly VTable AnimalVTable = new VTable(Animal.AnimalEat);
    private static string AnimalEat(Animal _this)
    {
        return "undefined";
    }
    public VTable VTable;
    public static Animal CreateAnimal()
    {
        return new Animal()
            { VTable = AnimalVTable };
    }
}

class Human : Animal
{
    // Human overrides Eat, so its vtable slot points at HumanEat.
    private static readonly VTable HumanVTable = new VTable(Human.HumanEat);
    private static string HumanEat(Animal _this)
    {
        return "human";
    }
    public static Human CreateHuman()
    {
        return new Human()
            { VTable = HumanVTable };
    }
}

class Dog : Animal
{
    // Dog.Eat is declared with "new", so it gets no slot in the vtable.
    public static string DogEat(Dog _this) { return "dog"; }
    public static Dog CreateDog()
    {
        return new Dog()
            { VTable = AnimalVTable };
    }
}

Now consider these calls:

Animal animal;
Dog dog;
animal = new Human();
animal.Eat();
animal = new Animal();
animal.Eat();
dog = new Dog();
dog.Eat();
animal = dog;
animal.Eat();

The compiler reasons as follows: If the type of the receiver is Animal then the call to Eat must be to animal.VTable.AnimalEat. If the type of the receiver is Dog then the call must be to DogEat. So the compiler writes these as:

Animal animal;
Dog dog;
animal = Human.CreateHuman(); // sets the VTable field to HumanVTable
animal.VTable.AnimalEat(animal); // calls HumanVTable.AnimalEat
animal = Animal.CreateAnimal(); // sets the VTable field to AnimalVTable
animal.VTable.AnimalEat(animal); // calls AnimalVTable.AnimalEat
dog = Dog.CreateDog(); // sets the VTable field to AnimalVTable
Dog.DogEat(dog); // calls DogEat, obviously
animal = dog;
animal.VTable.AnimalEat(animal); // calls AnimalVTable.AnimalEat

That is exactly how it works. The compiler generates vtables for you behind the scenes, and decides at compile time whether to call through the vtable or not based on the rules of overload resolution.

The vtables are set up by the memory allocator when the object is created. (My sketch is a lie in this regard, since the vtable is set up before the ctor is called, not after.)

The "this" of a virtual method is actually secretly passed as an invisible formal parameter to the method.
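One way to see the hidden "this" parameter from ordinary C# is an open instance delegate, where the receiver becomes an explicit first argument. The following is only a sketch: the class name HiddenThisDemo and the field EatOpen are made up for illustration, and the Animal and Human classes are repeated from the question (with public members) so the block stands alone.

```csharp
using System;

public class Animal
{
    public virtual string Eat() { return "Eat undefined"; }
}

public class Human : Animal
{
    public override string Eat() { return "Eat like a Human"; }
}

// Hypothetical demo class, not part of the original answer.
public static class HiddenThisDemo
{
    // Func<Animal, string> has one more parameter than Eat() declares.
    // That extra first parameter is bound to the method's hidden "this".
    public static readonly Func<Animal, string> EatOpen =
        (Func<Animal, string>)Delegate.CreateDelegate(
            typeof(Func<Animal, string>),
            typeof(Animal).GetMethod("Eat"));

    public static void Main()
    {
        // The receiver is passed like any other argument.
        Console.WriteLine(EatOpen(new Animal())); // Eat undefined
        // Eat is virtual, so the call still dispatches on the runtime type.
        Console.WriteLine(EatOpen(new Human()));  // Eat like a Human
    }
}
```

The delegate plays roughly the role of the AnimalEat slot in the sketch above: the caller supplies the object, and the object's runtime type determines which body runs.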

Make sense?

Eric Lippert
  • @Eric: Hey Eric, I noticed you updated your info from being a senior developer to a principal developer. I wasn't on SO for 3 days, so I assume you updated on the weekend :O Just wanted to say congratulations on your new position/promotion, you deserve it if I may say so. Also does this mean anything and everything that's C# will go through you? I hope so, because you are one of the best developers over there IMO. – Joan Venge Mar 15 '11 at 17:08
  • @Joan: Thanks for the kind words. Of course I am far from the final word on C#. I am one of the more junior members of this team; I only have fifteen years designing and implementing programming languages. I work with Anders Hejlsberg, Neal Gafter and Peter Golde just to name a few. Those guys are far more senior than I am. – Eric Lippert Mar 15 '11 at 18:56
  • @Eric: It's funny if you still consider yourself a junior, I don't know what people below you would be :O From where I work, even though it's not a software company, principal is very high up. But I guess at a software giant like Microsoft, there are far more grades in between. Either way though it would be a very valuable experience to be able to converse with you guys. Looking forward to see you move up in ranks and happiness. – Joan Venge Mar 15 '11 at 19:33
  • For those more familiar with C++, it might be worth pointing out that Animal.VTable would be a VReference. C++ compilers generally use a VPointer, but a VReference works just fine too. In either case, it refers or points to a static VTable corresponding to the actual type of the object, rather than the type of the variable or expression you are calling the function through. – Kevin Cathcart Mar 18 '11 at 15:01
  • @Eric: Yes it makes absolute sense. It has helped me get a better picture now. Thank you very much. – Nishant Mar 19 '11 at 18:13
  • This is just to inform all that Eric Lippert is writing a blog series on this subject. Please enjoy it at http://blogs.msdn.com/b/ericlippert/archive/2011/03/17/implementing-the-virtual-method-pattern-in-c-part-one.aspx – Learner Mar 23 '11 at 08:57
0
"I understand in the memory _animal contains an address which will point to Human or Dog object. How does it decide which function to invoke."

Like data, code also has an address.

Therefore the typical approach to this problem is for Human or Dog objects to contain the address of the code of their methods. This is sometimes called using a vtable. In a language like C or C++ this concept is also directly exposed as what's called a function pointer.

Now, you've mentioned C#, which has a pretty high-level type system, in which the types of objects are also discernible at runtime. Therefore the implementation details may differ from the traditional approach in some way. But, as to your question, the function pointer/v-table concept is one way to do it, and it would surprise me if .NET had strayed too far from this.
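The point that an object's type is discernible at runtime can be checked directly with GetType(). This is only a small sketch: the class RuntimeTypeDemo and the method RuntimeTypeName are hypothetical names, and Animal and Human are repeated from the question (with public members) so the block compiles on its own.

```csharp
using System;

public class Animal
{
    public virtual string Eat() { return "Eat undefined"; }
}

public class Human : Animal
{
    public override string Eat() { return "Eat like a Human"; }
}

// Hypothetical demo class, not part of the original answer.
public static class RuntimeTypeDemo
{
    // The parameter's static type is Animal, but GetType() reports
    // the type the object was actually created as.
    public static string RuntimeTypeName(Animal animal)
    {
        return animal.GetType().Name;
    }

    public static void Main()
    {
        Animal animal = new Human();
        Console.WriteLine(RuntimeTypeName(animal));       // Human
        Console.WriteLine(RuntimeTypeName(new Animal())); // Animal
    }
}
```

The variable's declared type plays no part in the result, which is the same information a virtual call relies on when it picks an implementation.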

asveikau
  • Thanks. Can you please help me with the memory allocation. You said code also has an address. How is memory allocated in this case? In Dog class it merely hides the base class methods while in Human class it overrides. – Nishant Mar 15 '11 at 04:45
  • @nick - When your EXE or DLL is loaded, the OS manages the memory for its code. (In the case of C# there is also JIT involved, so the .NET runtime will manage the memory involved as well.) – asveikau Mar 15 '11 at 05:03
  • To address the point in your last sentence: you might be slightly surprised at how interfaces work. The jitter typically generates "classic" vtables for virtual method calls on class hierarchies. However, the code generated for virtual calls on interface methods is a bit more complicated. It's not the same as the typical indirect vtable that a C++ compiler would produce. – Eric Lippert Mar 15 '11 at 06:59
0

In C#, derived classes must provide the override modifier for any overridden method inherited from a base class.

Animal _animal = new Human();

It is not just a Human object that gets constructed. There are two sub-objects: one is the Animal sub-object and the other is the Human sub-object.

Console.WriteLine(_animal.Eat());

When the call to _animal.Eat() is made, the run time checks whether the base class method (i.e., Eat()) is overridden in the derived class. Since it is overridden, the corresponding derived class method is called. Hence the output -

Eat like a Human

But, in case of -

_animal = new Dog();
Console.WriteLine(_animal.Eat());

In Dog, there is no overriding Eat() method: Dog declares Eat() with new, which hides the base class method rather than overriding it. So the base class method itself is called. This check happens because Eat() is marked virtual in the base class, so which implementation to call is decided at run time. To sum up, the virtual calling mechanism is a run-time mechanism.
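The difference between override and new is easiest to see by calling the same Dog object through variables of two different static types. The sketch below reuses the question's class names, with public added so the calls compile; the HidingDemo class and its helper methods are made up for illustration.

```csharp
using System;

public class Animal
{
    public virtual string Eat() { return "Eat undefined"; }
}

public class Human : Animal
{
    // Replaces the implementation reached through Animal.Eat.
    public override string Eat() { return "Eat like a Human"; }
}

public class Dog : Animal
{
    // Declares a separate method that merely shares the name Eat.
    public new string Eat() { return "Eat like a Dog"; }
}

// Hypothetical demo class, not part of the original answer.
public static class HidingDemo
{
    // Static type Animal: the call is bound to Animal.Eat and dispatched virtually.
    public static string EatAsAnimal(Animal animal) { return animal.Eat(); }

    // Static type Dog: the call is bound to the hiding method Dog.Eat.
    public static string EatAsDog(Dog dog) { return dog.Eat(); }

    public static void Main()
    {
        Dog dog = new Dog();
        Console.WriteLine(EatAsDog(dog));            // Eat like a Dog
        Console.WriteLine(EatAsAnimal(dog));         // Eat undefined
        Console.WriteLine(EatAsAnimal(new Human())); // Eat like a Human
    }
}
```

The same Dog object gives two different answers because the compiler chooses which method is meant from the variable's declared type, and only then does the run time pick the most derived override of that particular method.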

Mahesh