Every major C++ implementation uses a vtable. This is a pointer to an array (or struct) of method pointers. There is one vtable for a given type (excluding some corner cases involving dynamic loading).
If you write this:
class Iswitch {
public:
virtual void turn_on() = 0;
virtual void turn_off() = 0;
virtual ~Iswitch() = default;
};
the compiler creates something quite similar to this:
struct switch_vtable {
void(*turn_on)(void*) = 0;
void(*turn_off)(void*) = 0;
void(*dtor)(void*) = 0;
};
at the assembly level produced by C++ compilers, an array of packed function pointers and a structure of packed function pointers is identical. Here I'll use a struct so I don't have to mess with casting.
struct Iswitch {
switch_vtable const* vtable = 0;
Iswitch() {};
void dynamic_destroy() { vtable->dtor(this); }
void turn_on() { vtable->turn_on(this); }
void turn_off() { vtable->turn_off(this); }
};
When you inherit from Iswitch
:
class MySwitch:public Iswitch {
std::string message;
public:
void turn_on() final { std::cout << message << " turns on\n"; }
void turn_off() final { std::cout << message << " turns off\n"; }
};
the compiler generates an output similar to:
struct MySwitch:public Iswitch {
static MySwitch* from_pvoid(void* self) {
return static_cast<MySwitch*>(static_cast<Iswitch*>(self));
}
static switch_vtable make_MySwitch_vtable() {
return {
[](void* self){ from_pvoid(self)->turn_on_impl(); },
[](void* self){ from_pvoid(self)->turn_off_impl(); },
[](void* self){ from_pvoid(self)->~MySwitch(); }
};
}
static switch_vtable const* get_MySwitch_vtable() {
static const auto vtable = make_MySwitch_vtable();
return &vtable;
}
std::string message;
void turn_on_impl() { std::cout << message << " turns on\n"; }
void turn_off_impl() { std::cout << message << " turns off\n"; }
MySwitch() { vtable = get_MySwitch_vtable(); }
};
When you create the derived object, it sets the vtable pointer in the base object to point to its vtable as part of its constructor.
Then, when you access a method via virtual dispatch (usually just calling it), code looks up the method pointer in the vtable and invokes it.
If you have
void off_and_on( Iswitch* pswitch ) {
pswitch->turn_off();
pswitch->turn_on();
}
the code that is run (in this function) is identical regardless of which derived class of Iswitch
the pswitch
points at; at least until it gets to the body of turn_on
or turn_off
.
MySwitch bob;
on_and_off(&bob);
this does fundamentally the same thing in the class
"compiler does it for me" version of Iswitch
and MySwitch
as it does in the manual struct
version.
dynamic_destroy
is a special helper I added. When you delete a C++ object via pointer with a virtual destructor, it looks up the destructor in the vtable and uses it to clean up the object.
Iswitch* pswitch = new MySwitch;
delete pswitch;
in the manual struct
case, the above becomes:
// Iswitch* pswitch = new MySwitch; translates to:
Iswitch* pswitch = ::new( malloc( sizeof(MySwitch) ) ) MySwitch{};
// (with an extra try-catch to clean up the malloc memory if the ctor throws)
// delete pswitch; translates to:
pswitch->dynamic_destroy();
free(pswitch);
// (usually dtors are nothrow, so no try-catch needed here)
Here you can see the two versions compiled side by side.
In the real generated code there are a few differences.
- I created helper functions, while the real generated code inlines them.
- The calling convention for vtable functions is typically different than normal functions.
- gcc generates 2 destructor helpers in the vtable; one destroys, the other destroys and frees memory. I just destroyed.
- There is RTTI (run time type information) in the vtable geneated by gcc. This is needed by stuff like dynamic cast.
In addition, if you inherit using the virtual
things get more interesting, because now your vtable has pointers to vtables in it instead of just methods.
If we extend the Iswitch
interface, we just get a larger array/vtable struct, where the first part of the extended vtable matches the base class.
class Iswitch_extended:public Iswitch {
virtual bool is_on() const = 0;
};
corresponds to:
struct Iswitch_extended_vtable:Iswitch_vtable {
bool(*is_on)(void const*)=0;
};
struct Iswitch_extended:public Iswitch {
Iswitch_extended_vtable const* get_vtable() const { return static_cast<Iswitch_extended_vtable const*>(vtable); }
bool is_on() const { return get_vtable()->is_on(this); }
};
and the classes implementing it have to point to a full Iswitch_extended
table with the vtable
.
All of these techniques existed in C code bases prior to C++ being specified. When Bjarn was designing C++, he had this stuff in mind; the idea is that writing this boilerplate was annoying, and having the compiler write it for you was nice.
The downside is that there is more than one way to do all of this. While every major compiler used the above technique, MFC used a switch-based dispatch mechanism for its polymorphism.
The problem with the above vtable approach is that every concrete class needs a O(number of virtual methods) table. In MFC's case, the number of methods that needs to be possibly overridden is large (almost every windows message!). So instead of a huge function table, the message ID is passed to dispatch function, which is daisy chained. If a subclass wants to intercept that message on this object, it stops the chain and returns a function pointer (or executes the message directly on the function pointer).
But, with a fixed generator of dynamic dispatch code built into the language, the alternative approaches are far more expensive, and get neglected, even if they would be better for a specific use case.
I, for one, am looking forward to reflection. With reflection we'll be able to generate the dynamic dispatch code as efficiently as the compiler does, but target a different choice of OO model.