How does the C++ compiler know which implementation of a virtual function to call?

Question

Here is an example of polymorphism from http://www.cplusplus.com/doc/tutorial/polymorphism.html (edited for readability):

// abstract base class
#include <iostream>
using namespace std;

class Polygon {
    protected:
        int width;
        int height;
    public:
        void set_values(int a, int b) { width = a; height = b; }
        virtual int area(void) =0;
};

class Rectangle: public Polygon {
    public:
        int area(void) { return width * height; }
};

class Triangle: public Polygon {
    public:
        int area(void) { return width * height / 2; }
};

int main () {
    Rectangle rect;
    Triangle trgl;
    Polygon * ppoly1 = &rect;
    Polygon * ppoly2 = &trgl;
    ppoly1->set_values (4,5);
    ppoly2->set_values (4,5);
    cout << ppoly1->area() << endl; // outputs 20
    cout << ppoly2->area() << endl; // outputs 10
    return 0;
}

My question is how does the compiler know that ppoly1 is a Rectangle and that ppoly2 is a Triangle, so that it can call the correct area() function? It could find that out by looking at the "Polygon * ppoly1 = &rect;" line and knowing that rect is a Rectangle, but that wouldn't work in all cases, would it? What if you did something like this?

cout << ((Polygon *)0x12345678)->area() << endl;

Assuming that you're allowed to access that random area of memory.

I would test this out but I can't on the computer I'm on at the moment.

(I hope I'm not missing something obvious...)

Offtopic: Why not vote up the other people who spent time writing helpful answers for you? — , Oct 14 '08 at 23:10

score 27 · Accepted Answer · answered Oct 14 '08 at 22:46

Each object (that belongs to a class with at least one virtual function) has a pointer, called a vptr. It points to the vtbl of its actual class (which each class with virtual functions has at least one of; possibly more than one for some multiple-inheritance scenarios).

The vtbl contains a bunch of pointers, one for each virtual function. So at runtime, the code just uses the object's vptr to locate the vtbl, and from there the address of the actual overridden function.

In your specific case, Polygon, Rectangle, and Triangle each has a vtbl, each with one entry pointing to its relevant area method. Your ppoly1 will have a vptr pointing to Rectangle's vtbl, and ppoly2 similarly with Triangle's vtbl. Hope this helps!
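
To make the mechanism concrete, here is a minimal sketch that imitates by hand what a typical compiler might generate for the Polygon example. It is an illustration only: the names VTable, PolygonObj, make_rectangle, make_triangle, and call_area are hypothetical, and real compilers are free to lay things out differently (the standard does not mandate vtables at all).

```cpp
#include <cassert>
#include <iostream>

struct PolygonObj;  // forward declaration so the table can refer to it

// One table per class, with one slot per virtual function (hypothetical layout).
struct VTable {
    int (*area)(const PolygonObj* self);
};

// Every object carries a hidden pointer to the table of its actual class.
struct PolygonObj {
    const VTable* vptr;
    int width;
    int height;
};

// The per-class implementations of area().
int rectangle_area(const PolygonObj* self) { return self->width * self->height; }
int triangle_area(const PolygonObj* self) { return self->width * self->height / 2; }

// The per-class tables, each with its single area slot filled in.
const VTable rectangle_vtbl = { rectangle_area };
const VTable triangle_vtbl = { triangle_area };

// "Constructors" set the vptr according to the concrete class.
PolygonObj make_rectangle(int w, int h) { return PolygonObj{ &rectangle_vtbl, w, h }; }
PolygonObj make_triangle(int w, int h) { return PolygonObj{ &triangle_vtbl, w, h }; }

// Roughly what ppoly->area() turns into: load the vptr, load the slot, call it.
int call_area(const PolygonObj* p) { return p->vptr->area(p); }

// Small driver; call it from main() to see both results printed.
void vtable_demo() {
    PolygonObj rect = make_rectangle(4, 5);
    PolygonObj trgl = make_triangle(4, 5);
    assert(call_area(&rect) == 20);
    assert(call_area(&trgl) == 10);
    std::cout << call_area(&rect) << '\n' << call_area(&trgl) << '\n';
}
```

Note that call_area never needs to know which kind of polygon it was given; the object itself carries that information in its vptr.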

answered Oct 14 '08 at 22:46

C. K. Young

vptr/vtbl. Really, I don't remember those in the standard :-) "Pointer to a vtable", where a vtable is a compiler-defined structure, is more descriptive. – Martin York Oct 14 '08 at 22:51
@Martin: vptr/vtbl are the terms used in Bjarne Stroustrup's book, The C++ Programming Language. :-) – C. K. Young Oct 14 '08 at 22:54
I guess a vtable is not required by the standard, it just so happens that most compilers implement polymorphism using one so it has become more or less standard behaviour – 1800 INFORMATION Oct 14 '08 at 22:56
vtable is not in the standard either. This is just one implementation of possibly many. It happens to be the one that is pretty much universally used, but it's not mandated by the standard. (out of curiosity - is there an accessible compiler out there that does something else?) – Michael Burr Oct 14 '08 at 22:56
I agree it's an implementation detail, but it's one I've found actually helps people understand how virtual functions are meant to work. :-) – C. K. Young Oct 14 '08 at 22:58
Sorry, I don't disagree with your explanation. What I was trying to say was: the lexeme 'vtbl' is not very descriptive unless you have already heard the term vtable. When I discuss the subject with colleagues we use the term vtable (nobody speaks of a vtbl). – Martin York Oct 14 '08 at 23:04
Pretty much all C++ and C++ like languages use vtables, I think. See here: http://en.wikipedia.org/wiki/Vtable - the section talks about languages with multiple dispatch needing something more advanced – 1800 INFORMATION Oct 15 '08 at 00:06
The other commonly used method is a hashtable of method names – 1800 INFORMATION Oct 15 '08 at 00:08
Ick! Not hashtable of method names! That sounds like IDispatch late binding (or whatever it's called, I'm referring to the one that looks up function pointers by method names, rather than DispIDs). Apologies if you've never played with OLE Automation and don't know what I'm talking about. :-P – C. K. Young Oct 15 '08 at 00:32

score 6 · Answer 2 · edited May 23 '17 at 10:29

Chris Jester-Young gives the basic answer to this question.

Wikipedia has a more in-depth treatment.

If you want to know the full details of how this type of thing works (for all types of inheritance, including multiple and virtual inheritance), one of the best resources is Stan Lippman's "Inside the C++ Object Model".

edited May 23 '17 at 10:29

Community

answered Oct 14 '08 at 22:52

Michael Burr

score 3 · Answer 3 · answered Oct 14 '08 at 22:46

Disregarding aspects of binding, it's not actually the compiler that determines this.

It is the C++ runtime that evaluates, via vtables and vpointers, what the derived object actually is at runtime.

I highly recommend Scott Meyers' book Effective C++ for good descriptions of how this is done.

It even covers how default parameters are bound: when a virtual method is called through a base-class pointer or reference, the default parameters declared in the derived class are ignored and those declared in the base class are still used! That's static binding.
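
The default-parameter behaviour mentioned above can be seen in a small sketch. The classes Base and Derived and the function scaled are hypothetical names made up for this illustration; the point is only that the function body is chosen at run time while the default argument is chosen at compile time from the static type.

```cpp
#include <cassert>
#include <iostream>

struct Base {
    virtual ~Base() = default;
    // Default argument 2 belongs to the static type Base.
    virtual int scaled(int factor = 2) const { return 10 * factor; }
};

struct Derived : Base {
    // Redefining an inherited default argument is legal but usually discouraged.
    int scaled(int factor = 3) const override { return 100 * factor; }
};

// Static type is Base: the default argument comes from Base (2),
// but the body that runs is chosen by the dynamic type.
int call_through_base(const Base& b) { return b.scaled(); }

// Static type is Derived: the default argument comes from Derived (3).
int call_through_derived(const Derived& d) { return d.scaled(); }

// Small driver; call it from main() to see the three results printed.
void default_args_demo() {
    Base b;
    Derived d;
    assert(call_through_base(b) == 20);      // Base body, Base default
    assert(call_through_base(d) == 200);     // Derived body, Base default
    assert(call_through_derived(d) == 300);  // Derived body, Derived default
    std::cout << call_through_base(b) << '\n'
              << call_through_base(d) << '\n'
              << call_through_derived(d) << '\n';
}
```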

score 1 · Answer 4 · answered Oct 14 '08 at 22:48

To answer the second part of your question: that address probably won't have a v-table in the right place, and madness will ensue. Also, it's undefined according to the standard.

answered Oct 14 '08 at 22:48

Zach Snow

score 1 · Answer 5 · answered Oct 14 '08 at 23:04

cout << ((Polygon *)0x12345678)->area() << endl;

This code is a disaster waiting to happen. The compiler will compile it all right but when it comes to run time, you will not be pointing to a valid v-table and if you are lucky the program will just crash.

In C++, you shouldn't use old C-style casts like this, you should use dynamic_cast like so:

Polygon *obj = dynamic_cast<Polygon *>(0x12345678);
ASSERT(obj != NULL);

cout << obj->area() << endl;

dynamic_cast will return NULL if the given pointer is not a valid Polygon object so it will be trapped by the ASSERT.

answered Oct 14 '08 at 23:04

Adam Pierce

You can't dynamic_cast from an integer! (In fact you can't dynamic_cast from void* either, you have to start from a pointer/reference of a type that has some relation to the type you're casting to.) – C. K. Young Oct 15 '08 at 00:34
So my point is that with a random address like reinterpret_cast<Polygon *>(0x12345678) you're in undefined behaviour zone no matter what. :-P – C. K. Young Oct 15 '08 at 00:35
I've actually done this kind of thing to store object pointers in a Windows list box. Admittedly I have to write it like this: MYTYPE *obj = dynamic_cast<MYTYPE *>((MYTYPE *)listbox.GetItemData(item)); The dynamic_cast is a bit safer than a straight cast. – Adam Pierce Oct 15 '08 at 03:48
If GetItemData returns a void*, you can avoid C-style casts by using a static_cast instead. :-) Some people I know are quite dogmatic about avoiding C-style casts, because they can be quite a blunt tool (can cause reinterpret_cast in unintended cases, for example). – C. K. Young Oct 15 '08 at 09:22
On the other hand, if GetItemData returns an int, then reinterpret_casting it to a pointer is not 64-bit safe, in which case I'd probably use a map. :-) – C. K. Young Oct 15 '08 at 09:23
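
The casting advice in these comments can be sketched roughly as follows. Everything here is hypothetical and stands in for the list-box scenario: ItemSlot, store, load, and as_circle are made-up names, not part of any real API. The sketch assumes the opaque value is a void* that originally came from a valid Shape*; dynamic_cast can then check the dynamic type, but it cannot validate an arbitrary address.

```cpp
#include <cassert>
#include <iostream>

struct Shape { virtual ~Shape() = default; };
struct Circle : Shape { int radius = 1; };
struct Square : Shape { int side = 2; };

// Stand-in for an API that keeps one opaque pointer per item (hypothetical).
struct ItemSlot { void* data = nullptr; };

// Shape* converts to void* implicitly; we always store a Shape*, never a Circle*.
void store(ItemSlot& slot, Shape* s) { slot.data = s; }

// static_cast back to exactly the pointer type that was stored.
Shape* load(const ItemSlot& slot) { return static_cast<Shape*>(slot.data); }

// dynamic_cast starts from a related polymorphic type and yields
// nullptr when the object is not actually a Circle.
Circle* as_circle(Shape* s) { return dynamic_cast<Circle*>(s); }

// Small driver; call it from main() to see which slot holds a Circle.
void cast_demo() {
    Circle c;
    Square q;
    ItemSlot a, b;
    store(a, &c);
    store(b, &q);
    assert(as_circle(load(a)) == &c);
    assert(as_circle(load(b)) == nullptr);
    std::cout << std::boolalpha
              << (as_circle(load(a)) != nullptr) << '\n'
              << (as_circle(load(b)) != nullptr) << '\n';
}
```

The round trip is only well defined because the void* is cast back to the same type it was created from; feeding load a made-up address would still be undefined behaviour.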

score 1 · Answer 6 · answered Oct 14 '08 at 23:06

Virtual function tables. To wit, each of your Polygon-derived classes has a virtual function table that contains function pointers to the implementations of its virtual functions. In Triangle's table, the entry for area() points to Triangle::area(); in Rectangle's table, it points to Rectangle::area(). Because a pointer to the right table is stored along with the data for each object in memory, every time you reference that object as a Polygon, the appropriate area() for that object will be used.
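
One way to observe that something extra is stored with each object is to compare object sizes. This is only a sketch under the assumption of a mainstream compiler that uses a hidden vptr; PlainPolygon, VirtualPolygon, and vptr_overhead are hypothetical names, and the exact sizes are implementation-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>

// Same data members, no virtual functions: no hidden pointer is needed.
struct PlainPolygon {
    int width;
    int height;
    int area() const { return width * height; }
};

// Same data members plus virtual functions: typical compilers add a vptr.
struct VirtualPolygon {
    int width = 0;
    int height = 0;
    virtual ~VirtualPolygon() = default;
    virtual int area() const { return width * height; }
};

// Extra bytes per object attributable to the virtual machinery
// (usually one pointer plus any padding; not guaranteed by the standard).
std::size_t vptr_overhead() {
    return sizeof(VirtualPolygon) - sizeof(PlainPolygon);
}

// Small driver; call it from main() to print the sizes on your platform.
void size_demo() {
    assert(sizeof(VirtualPolygon) >= sizeof(PlainPolygon));
    std::cout << "plain: " << sizeof(PlainPolygon)
              << ", virtual: " << sizeof(VirtualPolygon)
              << ", overhead: " << vptr_overhead() << '\n';
}
```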
