Clarification Needed on C++ Virtual Call Implementation

Question

I have some doubts about virtual functions, or rather run-time polymorphism. My understanding of how it works is as follows:

1. A virtual table (V-Table) is created for every class that has at least one virtual member function. I believe this table is static, so it is created once per class and not once per object. Please correct me if I am wrong.
2. This V-Table holds the addresses of the virtual functions. If the class has 4 virtual functions, the table has 4 entries pointing to the corresponding 4 functions.
3. The compiler adds a virtual pointer (V-Ptr) as a hidden member of the class. This pointer points to the start of the virtual table.

Assume I have a program like this:

class Base
{
public:
    virtual void F1() {}
    virtual void F2() {}
    virtual void F3() {}
    virtual void F4() {}
};
class Der1 : public Base  // Overrides only the first 2 functions of Base
{
public:
    void F1() {} // Overrides Base::F1()
    void F2() {} // Overrides Base::F2()
};
class Der2 : public Base  // Overrides the remaining functions of Base
{
public:
    void F3() {} // Overrides Base::F3()
    void F4() {} // Overrides Base::F4()
};
int main()
{
    Base* p1 = new Der1; // I believe the vtable is populated at compile time
    Base* p2 = new Der2;
    p1->F1(); // how does it call Der1::F1()?
    p2->F3(); // how does it call Der2::F3()?
}

If the V-Table is populated at compile time, why do we call it run-time polymorphism? Please explain how many vtables and vptrs there are, and how dispatch works, using the above example. As I understand it, there will be 3 vtables, one each for Base, Der1 and Der2. The Der1 vtable holds the addresses of its own F1() and F2(), whereas its entries for F3() and F4() point to the Base class versions. Also, a vptr will be added as a hidden member in each of Base, Der1 and Der2. If everything is decided at compile time, what exactly happens at run time? Please correct me if I have the concept wrong.

possible duplicate of [your C++ book](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list) – Lightness Races in Orbit, Feb 08 '13 at 14:54
A single v-table for each class and a pointer to it in each object. Each virtual function occupies the same slot (the same offset from the start of the table) in the v-table of every class in the hierarchy. At runtime you don't know which vtable the object is pointing to, but because the slot offsets are the same you still call the right function for the specific object. – Alex, Feb 08 '13 at 15:01
If you're asking about how virtual calls are usually implemented in compilers (by usually, I quite literally mean: *almost always*) then you should clarify that. It seems that a few people are mistaking this for a question about C++ virtual call semantics, where the implementation is not specified (even though everybody does it using vtables). – Mysticial, Feb 08 '13 at 15:02

score 5 · Answer 1 · edited Feb 08 '13 at 15:26

The standard doesn't specify this, so it depends on the implementation, but most implementations are fairly similar, more or less along the lines you describe.

1. This is correct.
2. vtables contain more than just pointers to functions. There's usually an entry pointing to the RTTI information, and often some information concerning how to fix up the this pointer when calling the function (although this can also be done using trampolines). In the case of virtual bases, there could also be an offset to the virtual base.
3. This is also correct. Note that during construction and destruction, the compiler will change the vptr as the dynamic type of the object changes, and that in the case of multiple inheritance (with or without virtual bases), there will be more than one vptr. (The vptr is at a fixed offset with respect to the base address of the class, and in the case of multiple inheritance, not all classes can have the same base address.)
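The remark about the vptr changing during construction can be observed without looking at any vtable. The sketch below uses hypothetical classes (Shape and Circle are not from the question) and relies only on behavior the language guarantees: a virtual call made inside a base class constructor resolves to the base class version, because the object's dynamic type is still the base class at that point. Whether the compiler implements this by rewriting a vptr is an implementation detail.

```cpp
#include <string>

// Hypothetical classes for illustration only.
struct Shape {
    std::string seen_in_ctor;            // which override ran during construction
    Shape() { seen_in_ctor = name(); }   // virtual call inside the base constructor
    virtual ~Shape() = default;
    virtual std::string name() const { return "Shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "Circle"; }
};

// Returns what the Shape constructor saw, then what a later virtual call sees.
std::string construction_then_after() {
    Circle c;
    const Shape& s = c;                  // access through the base type
    return c.seen_in_ctor + "," + s.name();
}
```

Calling construction_then_after() from main yields "Shape,Circle": the first half was recorded while the object was still a Shape, the second after construction finished.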

As to your final remarks: the vtables are populated at compile time, and are static. But the vptrs are set at runtime, according to the dynamic type, and the function call uses the vptr to find the vtable and dispatch the call.

In your (very simple) example, there are three vtables, one for each class. Because only single inheritance is involved, there is only one vptr per instance, shared between Base and the derived class. The vtable for Base will contain four slots, pointing to Base::F1, Base::F2, Base::F3 and Base::F4. The vtable for Der1 will also contain four slots, pointing to Der1::F1, Der1::F2, Base::F3 and Base::F4. The vtable for Der2 will point to Base::F1, Base::F2, Der2::F3 and Der2::F4. The constructor for Base will set the vptr to the vtable of Base; the constructor for each derived class will first call the constructor for the base class, then set the vptr to the vtable corresponding to its own type. (In practice, in such simple cases, the compiler can probably determine that the vptr is never used in the constructor of Base, and so skip setting it. In more complicated cases, where the compiler cannot see all of the behavior of the base class constructor, it cannot.)
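The "one vptr per instance" point can be checked indirectly by comparing object sizes. This is only a sketch under a loud assumption: the standard says nothing about object layout here, so the results below hold on mainstream implementations (GCC, Clang, MSVC) but are not guaranteed. PlainInt and the helper functions are hypothetical names, and Base and Der1 are cut down to a single virtual function.

```cpp
#include <cstddef>

struct PlainInt { int x; };              // no virtual functions, so no hidden pointer

struct Base {
    virtual void F1() {}
    int x;
};

struct Der1 : Base {
    void F1() override {}                // overrides, adds no data members
};

// Extra bytes Base carries compared with the same data and no virtuals.
// Implementation-specific: typically one pointer plus padding.
constexpr std::size_t hidden_overhead() {
    return sizeof(Base) - sizeof(PlainInt);
}

// True when the derived object reuses the base's hidden pointer
// instead of adding a second one (the usual single-inheritance layout).
constexpr bool derived_shares_vptr() {
    return sizeof(Der1) == sizeof(Base);
}
```

On a typical 64-bit platform hidden_overhead() is at least the size of a pointer and derived_shares_vptr() is true, which matches the description above.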

As to why it is called runtime polymorphism, consider a function:

void f(Base* p)
{
    p->F1();
}

The function actually called will be different, depending on whether p points to a Der1 or a Der2. In other words, it will be determined at runtime.
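A minimal sketch of that point, assuming cut-down versions of the question's classes that return a string so the result can be inspected. The make helper is hypothetical; it stands in for any decision made at run time, such as user input.

```cpp
#include <memory>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string F1() const { return "Base::F1"; }
};

struct Der1 : Base {
    std::string F1() const override { return "Der1::F1"; }
};

struct Der2 : Base {};                   // does not override F1, inherits Base::F1

// Compiled once; it cannot know which F1 it will end up calling.
std::string f(const Base* p) {
    return p->F1();
}

// Hypothetical factory: the concrete type depends on a run-time value.
std::unique_ptr<Base> make(int choice) {
    if (choice == 1) return std::unique_ptr<Base>(new Der1);
    return std::unique_ptr<Base>(new Der2);
}
```

Here f(make(1).get()) returns "Der1::F1" and f(make(2).get()) returns "Base::F1", even though f itself is the same compiled code in both cases.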

Sorry again for my delayed response. Your post is very helpful to me. "there is only one vptr per instance" this is the thing I actually expected in the answers. — Prabu, Feb 09 '13 at 07:34

score 4 · Answer 2 (accepted) · Alex · 2013-02-08T15:40:12.813

The C++ standard doesn't specify how virtual function calls have to be implemented, but here's a simplified example of the approach that virtually all compilers use.

From a high-level perspective, the v-tables would look like this:

Base:

Index |  Function Address
------|------------------
    0 |  Base::F1
    1 |  Base::F2
    2 |  Base::F3
    3 |  Base::F4

Der1:

Index |  Function Address
------|------------------
    0 |  Der1::F1
    1 |  Der1::F2
    2 |  Base::F3
    3 |  Base::F4

Der2:

Index |  Function Address
------|------------------
    0 |  Base::F1
    1 |  Base::F2
    2 |  Der2::F3
    3 |  Der2::F4

When you create p1 and p2, they get a pointer that points to Der1's vtable and Der2's vtable, respectively.

The call to p1->F1 basically means "call function 0 on p1's virtual table". vptr[0] is Der1::F1, so it gets called.

It's called run-time polymorphism because the function that will be called for a specific object is determined at run-time (by making a look-up in the object's vtable).
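The tables above can be emulated by hand to make the lookup concrete. This is a hypothetical sketch of the simplified model only, not what any particular compiler emits: Object, Slot, call_slot and the free functions are invented names, and real vtables usually hold extra entries (RTTI, offsets) that are left out here.

```cpp
#include <string>

struct Object;
using Slot = std::string (*)(const Object*);    // one table entry: a function pointer

struct Object {
    const Slot* vptr;                           // stands in for the hidden vptr
};

// Stand-ins for the member functions; each receives the object as "this".
std::string base_F1(const Object*) { return "Base::F1"; }
std::string base_F2(const Object*) { return "Base::F2"; }
std::string base_F3(const Object*) { return "Base::F3"; }
std::string base_F4(const Object*) { return "Base::F4"; }
std::string der1_F1(const Object*) { return "Der1::F1"; }
std::string der1_F2(const Object*) { return "Der1::F2"; }
std::string der2_F3(const Object*) { return "Der2::F3"; }
std::string der2_F4(const Object*) { return "Der2::F4"; }

// One static table per class, filled in before the program runs.
const Slot base_vtable[4] = { base_F1, base_F2, base_F3, base_F4 };
const Slot der1_vtable[4] = { der1_F1, der1_F2, base_F3, base_F4 };
const Slot der2_vtable[4] = { base_F1, base_F2, der2_F3, der2_F4 };

// What a call such as p->F1() boils down to in this model:
// follow the object's vptr, pick the slot by index, call through it.
std::string call_slot(const Object* p, int index) {
    return p->vptr[index](p);
}
```

With Object p1{der1_vtable}, call_slot(&p1, 0) reaches Der1::F1 and call_slot(&p1, 2) falls through to Base::F3. The tables never change at run time; only the vptr stored in each object differs.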

edited Feb 08 '13 at 15:40

answered Feb 08 '13 at 15:16

Alex


Would look, but aren't guaranteed to look in any specific way. – Bartek Banachewicz Feb 08 '13 at 15:28
@BartekBanachewicz Of course, the model I've described is a simplification, a possible way to implement late binding. It doesn't account for multiple inheritance either. Its purpose is to illustrate the idea behind the virtual tables, not to describe any actual implementation ;) – Alex Feb 08 '13 at 15:34
@Alex, Sorry for the delayed response. Your detailed answer is very informative to me. – Prabu Feb 09 '13 at 07:30

score 2 · Answer 3 · answered Feb 08 '13 at 14:50

It's an implementation detail that the standard does not specify. When programming in C++, the only thing that should concern you is that if you declare a method virtual, the run-time type of the object behind the pointer or reference decides which code is called.

Perhaps you should read about that topic first. Here is the C++ specific stuff.

Bartek Banachewicz


This is the answer. There is no need to form fetishes over _virtual tables_ when they are simply a possible way to implement runtime polymorphism and should have no bearing on your programming whatsoever. – Lightness Races in Orbit Feb 08 '13 at 14:56
@Lightness Shouldn't you care about such implementation detail if performance is an issue? – JBentley Feb 08 '13 at 15:02
@JonBentley If you care about performance you *shouldn't* be using late binding at all. – Bartek Banachewicz Feb 08 '13 at 15:03
@Bartek Perhaps, but often there is a tradeoff between things such as code readability / programmer convenience, and performance. Using a higher level language such as C++ in the first place is such a tradeoff. So if we accept the premise that such tradeoffs can be worth making, then I would say understanding the performance cost of virtual methods is useful. I disagree that having an understanding of how your code will be compiled "should have no bearing on your programming whatsoever". Such an understanding enables you to make informed choices. – JBentley Feb 08 '13 at 15:07
@JonBentley If you need runtime dispatch, you need runtime dispatch. Generally speaking, polymorphic calls are one of the more efficient means of achieving this. – James Kanze Feb 08 '13 at 15:19
@JonBentley: No, benchmarking enables you to make informed choices. Obsessing over implementation details encourages you to make choices for the wrong reasons, and to get locked into them. – Lightness Races in Orbit Feb 08 '13 at 15:26
@JonBentley Knowing how it is implemented won't necessarily tell you much with regards to performance issues. (On the other hand, I can understand wanting to understand it, just out of curiosity.) – James Kanze Feb 08 '13 at 15:44
@JonBentley: If the cost of dynamic dispatch is important in your benchmarks, then you are at the point where the answer to this generic question won't help you either: you need to look at the generated code in your particular implementation. There are multiple ways of implementing virtual dispatch, and multiple ways of handling vtables, but the *intent* is that the overhead is minimal. At the same time, the points where the impact is greater are not even considered in this question (multiple inheritance, virtual inheritance, covariant return types...) – David Rodríguez - dribeas Feb 08 '13 at 15:58

score 0 · Answer 4 · answered Feb 08 '13 at 15:20

I'm not going to go through four virtual functions and three derived types. Suffice it to say: for the ultimate base class, the vtable has pointers to the base class's versions of all the virtual functions. For a derived class, the vtable has one pointer for each of that class's virtual functions: when the derived class overrides a base class function, the pointer for that function points to the derived class's version; when the derived class inherits a virtual function without overriding it, the pointer points to the inherited function.
