
Consider this simple hierarchy:

class Base { public: virtual ~Base() { } };
class Derived : public Base { };

Downcasting a Base* p to Derived* is possible using dynamic_cast<Derived*>(p). I used to think dynamic_cast works by comparing the vtable pointer in p to the one in a Derived object.

But what if we derive another class from Derived? We now have:

class Derived2 : public Derived { };

In this case:

Base* base = new Derived2;
Derived* derived = dynamic_cast<Derived*>(base);

We still get a successful downcast, even though the vtable pointer in Derived2 has nothing to do with a vtable pointer in Derived.
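To make the observation concrete, here is a minimal sketch of the three cases. The class Unrelated and the helper names is_a_derived and check_casts are made up for illustration and are not part of the hierarchy above.

```cpp
#include <cassert>

class Base { public: virtual ~Base() { } };
class Derived : public Base { };
class Derived2 : public Derived { };
class Unrelated : public Base { };  // hypothetical sibling, for contrast only

// Returns true when the dynamic type of *p is Derived or derives from it.
bool is_a_derived(Base* p) {
    return dynamic_cast<Derived*>(p) != nullptr;
}

// Exercises the exact type, a further-derived type, and an unrelated type.
void check_casts() {
    Base* exact = new Derived;
    Base* further = new Derived2;
    Base* other = new Unrelated;

    assert(is_a_derived(exact));    // same dynamic type as the target
    assert(is_a_derived(further));  // different vtable, cast still succeeds
    assert(!is_a_derived(other));   // not related to Derived, cast yields null

    delete exact;
    delete further;
    delete other;
}
```

Calling check_casts() from main should run without tripping an assertion, which is exactly the behaviour a plain vtable-pointer comparison could not explain.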

How does it actually work? How can the dynamic_cast know whether Derived2 was derived from Derived (what if Derived was declared in a different library)?

I am looking for specific details about how this actually works (preferably in GCC, but others are fine too). This question is not a duplicate of this question (which doesn't specify how it actually works).

Avidan Borisov
  • It might be implemented differently in different compilers, to be sure you might want to read the source of them… – PlasmaHH Aug 21 '13 at 14:14

3 Answers


How can the dynamic_cast know whether Derived2 was derived from Derived (what if Derived was declared in a different library)?

The answer to that is surprisingly simple: dynamic_cast can know this by keeping this knowledge around.

When the compiler generates code, it emits data about the class hierarchies into some sort of table that dynamic_cast can look up at run time. That table can be attached to the vtable pointer for easy lookup by the dynamic_cast implementation. The data needed for typeid for those classes can also be stored alongside it.

If libraries are involved, this sort of thing usually requires these type information structures to be exposed in the libraries, just like with functions. It is possible, for example, to get a linker error that looks like "Undefined reference to 'vtable for XXX'" (and boy, are those annoying!), again, just like with functions.
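As a rough illustration of the two consumers of that stored data, the sketch below contrasts typeid (exact dynamic type) with dynamic_cast (hierarchy-aware). The helper names is_exactly, is_derived_or_below and check_typeid_vs_cast are invented for this example. It only shows observable behaviour and says nothing about how a particular compiler lays the data out.

```cpp
#include <cassert>
#include <typeinfo>

class Base { public: virtual ~Base() { } };
class Derived : public Base { };
class Derived2 : public Derived { };

// typeid on a dereferenced pointer to a polymorphic type reports the dynamic type.
// p must not be null, otherwise typeid(*p) throws std::bad_typeid.
bool is_exactly(const Base* p, const std::type_info& wanted) {
    return typeid(*p) == wanted;
}

// dynamic_cast additionally accepts any type derived from the target.
bool is_derived_or_below(const Base* p) {
    return dynamic_cast<const Derived*>(p) != nullptr;
}

void check_typeid_vs_cast() {
    Derived2 object;
    const Base* p = &object;

    assert(is_exactly(p, typeid(Derived2)));  // the dynamic type is Derived2
    assert(!is_exactly(p, typeid(Derived)));  // typeid matches exactly, not by ancestry
    assert(is_derived_or_below(p));           // the cast consults the hierarchy
}
```

Both operations need RTTI to be enabled. With GCC or Clang, for instance, compiling with -fno-rtti makes this code ill-formed.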

R. Martinho Fernandes
  • There are some complications when multiple-inheritance is involved. See for example [this question](http://stackoverflow.com/questions/5712808). – BlueRaja - Danny Pflughoeft Aug 21 '13 at 18:29
  • Although not standardized, in the case of the Itanium ABI the knowledge is integrated in RTTI structures that are pointed to by vtables. https://itanium-cxx-abi.github.io/cxx-abi/abi.html#rtti – jmcarter9t Jun 15 '22 at 23:56

Magic.

Just kidding. If you really want to research this in detail, the code that implements it for GCC is in libsupc++, a part of libstdc++.

https://github.com/mirrors/gcc/tree/master/libstdc%2B%2B-v3/libsupc%2B%2B

Specifically, look for all files with tinfo or type_info in their name.

Or read the description here, which is probably a lot more accessible:

https://itanium-cxx-abi.github.io/cxx-abi/abi.html#rtti

This details the format of the type information the compiler generates and should give you clues about how the runtime support then finds the right casting path.
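If you want to poke at those structures yourself, the following sketch walks the single-inheritance chain recorded in the type information. It assumes GCC with libstdc++, whose <cxxabi.h> exposes abi::__si_class_type_info and its __base_type member. Other standard libraries may not expose these classes at all. The function base_distance and the check function are made up for this example, and the sketch ignores multiple and virtual inheritance entirely.

```cpp
#include <cassert>
#include <cxxabi.h>
#include <typeinfo>

class Base { public: virtual ~Base() { } };
class Derived : public Base { };
class Derived2 : public Derived { };

// Counts how many single-inheritance links separate `from` from `target`,
// or returns -1 if `target` is not reachable along that chain.
// Assumption: libstdc++'s <cxxabi.h>, where __si_class_type_info is a complete type.
int base_distance(const std::type_info& from, const std::type_info& target) {
    const std::type_info* current = &from;
    int steps = 0;
    while (*current != target) {
        // Classes with one public, non-virtual base at offset zero are
        // described by __si_class_type_info in the Itanium ABI.
        const auto* si = dynamic_cast<const abi::__si_class_type_info*>(current);
        if (si == nullptr) return -1;  // root class or a more complex layout
        current = si->__base_type;
        ++steps;
    }
    return steps;
}

void check_base_distance() {
    assert(base_distance(typeid(Derived2), typeid(Derived)) == 1);
    assert(base_distance(typeid(Derived2), typeid(Base)) == 2);
    assert(base_distance(typeid(Base), typeid(Derived)) == -1);
}
```

This is only a way of looking at the data. The real runtime in libsupc++ handles many more cases than this loop does.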

Sebastian Redl
  • Thanks! Of interest in particular is the section *2.9.5* which contains the description of the RTTI structure (aka v-table) and how the hierarchy is encoded (and of course *2.9.7* which describes the algorithm itself). – Matthieu M. Aug 21 '13 at 14:52

How can the dynamic_cast know whether Derived2 was derived from Derived (what if Derived was declared in a different library)?

The dynamic_cast itself does not know anything; it's the compiler that knows those facts. A vtable does not necessarily contain only pointers to virtual functions.

Here's how I would (naively) do it: my vtable will contain pointer(s) to some type information (RTTI) used by dynamic_cast. The RTTI for a type will contain pointers to base classes, so I can go up the class hierarchy. Pseudocode for the cast would look like this:

Base* base = new Derived2; // base->vptr[RTTI_index] points to RTTI_of(Derived2)

// dynamic_cast<Derived*>(base):
RTTI* pRTTI = base->vptr[RTTI_index];
// Walk up the hierarchy until we reach Derived's RTTI or run out of parents.
while (pRTTI && pRTTI != RTTI_of(Derived))
{
  pRTTI = pRTTI->ParentRTTI;
}
if (pRTTI) return static_cast<Derived*>(base);
return nullptr;
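The same naive idea can be turned into compilable C++ by hand-rolling the type information. In this sketch the TypeInfo struct, the static info members, the virtual type() function (which stands in for vptr[RTTI_index]) and naive_cast are all hypothetical. A real compiler generates the equivalent data itself, and, like the pseudocode, this version only handles single, non-virtual inheritance.

```cpp
#include <cassert>

// Hypothetical stand-in for compiler-generated type information.
struct TypeInfo {
    const char* name;
    const TypeInfo* parent;  // nullptr for a root class
};

struct Base {
    static const TypeInfo info;
    // Plays the role of vptr[RTTI_index]: reports the dynamic type's info.
    virtual const TypeInfo* type() const { return &info; }
    virtual ~Base() { }
};
struct Derived : Base {
    static const TypeInfo info;
    const TypeInfo* type() const override { return &info; }
};
struct Derived2 : Derived {
    static const TypeInfo info;
    const TypeInfo* type() const override { return &info; }
};

const TypeInfo Base::info = {"Base", nullptr};
const TypeInfo Derived::info = {"Derived", &Base::info};
const TypeInfo Derived2::info = {"Derived2", &Derived::info};

// Walks the parent chain from the dynamic type towards the root and
// succeeds if it meets the target's TypeInfo on the way.
template <typename Target>
Target* naive_cast(Base* p) {
    if (p == nullptr) return nullptr;
    for (const TypeInfo* t = p->type(); t != nullptr; t = t->parent) {
        if (t == &Target::info) return static_cast<Target*>(p);
    }
    return nullptr;
}

void check_naive_cast() {
    Derived2 object;
    Base plain;
    assert(naive_cast<Derived>(&object) == static_cast<Derived*>(&object));
    assert(naive_cast<Derived>(&plain) == nullptr);
}
```

The static_cast is only valid here because there is no virtual or multiple inheritance. With those, the runtime also has to adjust the pointer by an offset stored in the type information.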
Matthieu M.
Arne Mertz
  • Quite naive, indeed, as it fails to account for multiple base classes, virtual base classes (those are not fun) and offset adaptation. The basic idea is good though: the object topology is encoded in its V-table. – Matthieu M. Aug 21 '13 at 14:48
  • @MatthieuM. That's what I wanted: to depict the basic idea. I had the other stuff in mind (except virtual base classes, screw them!) but didn't bother writing tons of useless pseudocode ;-) – Arne Mertz Aug 21 '13 at 14:52
  • For a reference implementation, I found a surprisingly quite readable one in [libcxxrt](https://github.com/pathscale/libcxxrt/blob/master/src/dynamic_cast.cc), especially clear in combination with the link from [Sebastian Redl](http://mentorembedded.github.io/cxx-abi/abi.html#rtti) aka the layout of the RTTI structure in section 2.9.5 of the Itanium ABI. – Matthieu M. Aug 21 '13 at 14:58
  • @MatthieuM. Quite clever as well. The idea is in fact similar to the one presented here, but implemented using polymorphic recursion to support different types of classes derived from __class_type_info. Thanks for the link! – Avidan Borisov Aug 22 '13 at 08:52