I have the following class hierarchy:

class IControl
{
    virtual void SomeMethod() = 0; // Just to make IControl polymorphic.
};

class ControlBase
{
public:
    virtual int GetType() = 0;
};

class ControlImpl : public ControlBase, public IControl
{
public:
    virtual void SomeMethod() { }

    virtual int GetType()
    {
        return 1;
    }
};

I have an IControl abstract class, and a ControlBase class. The ControlBase class does not inherit from IControl, but I know that every IControl-implementation will derive from ControlBase.

I have the following test code in which I cast an IControl-reference to ControlBase (because I know it derives from it) with dynamic_cast, and also with C-style cast:

#include <iostream>

int main()
{
    ControlImpl stb;
    IControl& control = stb;

    ControlBase& testCB1 = dynamic_cast<ControlBase&>(control);
    ControlBase& testCB2 = (ControlBase&)control;
    ControlBase* testCB3 = (ControlBase*)&control;

    std::cout << &testCB1 << std::endl;
    std::cout << &testCB2 << std::endl;
    std::cout << testCB3 << std::endl;
    std::cout << std::endl;
    std::cout << testCB1.GetType() << std::endl; // This properly prints "1".
    std::cout << testCB2.GetType() << std::endl; // This prints some random number.
    std::cout << testCB3->GetType() << std::endl; // This prints some random number.
}

Only the dynamic_cast works properly; the other two casts give back slightly different memory addresses, and the GetType() function gives back incorrect values.

What is the exact reason for this? Does the C-style cast end up using a reinterpret_cast? Is it related to how polymorphic objects are aligned in memory?

Mark Vincze
  • possible duplicate of [When should static\_cast, dynamic\_cast and reinterpret\_cast be used?](http://stackoverflow.com/questions/332030/when-should-static-cast-dynamic-cast-and-reinterpret-cast-be-used) – Captain Obvlious Aug 03 '13 at 18:54
  • I've seen that question and read the answers (and other answers about this topic). But for me it is still not clear why this happens in this concrete scenario, that's why I posted a separate question. – Mark Vincze Aug 03 '13 at 19:01
  • Note there's a difference between `ControlBase* testCB3 = static_cast(&control);` and `ControlBase* testCB4 = static_cast(&control);`. The latter doesn't invoke UB (and correctly produces a `1` for the `GetType()` test). – dyp Aug 03 '13 at 19:26
  • Just because you know that ControlBase and IControl appear together as base classes does not mean the compiler will deduce that fact. The standard does not say compilers need to figure this out. I think the dynamic_cast is working because the compiler knows "control" is really a ControlImpl, and if you passed the reference through a function argument I do not think it would work. – brian beuning Aug 03 '13 at 19:30
  • I cannot static_cast manually; this line: `ControlBase* testCB4 = static_cast<ControlBase*>(&control);` gives compiler error `error C2440: 'static_cast' : cannot convert from 'IControl *' to 'ControlBase *'`. That's why I thought that the C-style cast did a reinterpret_cast. – Mark Vincze Aug 03 '13 at 19:38
  • I understand that it has been explained that static_cast does not work when downcasting across virtual inheritance, but I still don't understand what goes on behind the scenes, and why do I end up with those slightly different memory addresses. – Mark Vincze Aug 03 '13 at 19:40
  • All the static pointer cast does, in the non-scalar case, is take a pointer and give it a different compile-time type, to make it assignable to a different variable type or to make it dereferenceable as a different type. No transformation or indirection of the actual runtime value is performed. (A cast of a scalar, from char to int, say, is an entirely different operation, unrelated to pointer casting.) – Hot Licks Aug 03 '13 at 19:58

2 Answers

I think the class names in your example are a bit confusing. Let's call them Interface, Base and Impl. Note that Interface and Base are unrelated.

The C++ Standard defines the C-style cast, called "explicit type conversion (cast notation)" in [expr.cast]. You can (and maybe should) read that whole paragraph to know exactly how the C-style cast is defined. For the example in the OP, the following is sufficient:

A C-style cast performs one of the following conversions ([expr.cast]/4):

  • const_cast
  • static_cast
  • static_cast followed by const_cast
  • reinterpret_cast
  • reinterpret_cast followed by const_cast

The order of this list is important, because:

If a conversion can be interpreted in more than one of the ways listed above, the interpretation that appears first in the list is used, even if a cast resulting from that interpretation is ill-formed.

Let's examine your example:

Impl impl;
Interface* pIntfc = &impl;
Base* pBase = (Base*)pIntfc;

A const_cast cannot be used, so the next element in the list, static_cast, is tried. But the classes Interface and Base are unrelated, so there is no static_cast that can convert from Interface* to Base*. Therefore, a reinterpret_cast is used.

Additional note: the actual answer to your question is: as there is no dynamic_cast in the list above, a C-style cast never behaves like a dynamic_cast.
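
A minimal sketch of that fallback, using the renamed classes from this answer. The helper names c_style_cross_cast and c_style_upcast are made up for illustration. Only pointer values are meant to be compared; the result of the cross cast must never be dereferenced.

```cpp
struct Interface { virtual void SomeMethod() = 0; };
struct Base      { virtual int GetType() = 0; };

struct Impl : Base, Interface {
    void SomeMethod() override {}
    int GetType() override { return 1; }
};

// Interface and Base are unrelated, so no static_cast applies and the
// C-style cast falls back to reinterpret_cast: same address, new type.
inline Base* c_style_cross_cast(Interface* p) { return (Base*)p; }

// Impl derives from Base, so here the C-style cast is a static_cast
// and yields the address of the Base subobject.
inline Base* c_style_upcast(Impl* p) { return (Base*)p; }
```

On typical implementations, c_style_cross_cast(pIntfc) compares equal to pIntfc itself when both are converted to void*, while c_style_upcast(&impl) gives the address of the real Base subobject.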


How the actual address changes is not part of the definition of the C++ language, but we can make an example of how it could be implemented:

Each object of a class with at least one virtual function (inherited or own) contains (read: could contain, in this example) a pointer to a vtable. If it inherits virtual functions from multiple classes, it contains multiple pointers to vtables. Because none of the classes here has data members, an instance of Impl could consist of nothing but those pointers and look like this:

+=Impl=======================================+
|                                            |
|  +-Base---------+   +-Interface---------+  |
|  | vtable_Base* |   | vtable_Interface* |  |
|  +--------------+   +-------------------+  |
|                                            |
+============================================+

Now, the example:

     Impl  impl;

     Impl* pImpl  = &impl;
Interface* pIntfc = pImpl;
     Base* pBase  = pImpl;

+=Impl=======================================+
|                                            |
|  +-Base---------+   +-Interface---------+  |
|  | vtable_Base* |   | vtable_Interface* |  |
|  +--------------+   +-------------------+  |
|  ^                  ^                      |
+==|==================|======================+
^  |                  |
|  +-- pBase          +-- pIntfc
|
+-- pImpl

If you instead do a reinterpret_cast, the result is implementation-defined, but it could result in something like this:

     Impl  impl;

     Impl* pImpl  = &impl;
Interface* pIntfc = pImpl;
     Base* pBase  = reinterpret_cast<Base*>(pIntfc);

+=Impl=======================================+
|                                            |
|  +-Base---------+   +-Interface---------+  |
|  | vtable_Base* |   | vtable_Interface* |  |
|  +--------------+   +-------------------+  |
|                     ^                      |
+=====================|======================+
^                     |
|                     +-- pIntfc
|                     |
+-- pImpl             +-- pBase

I.e. the address is unchanged, pBase points to the Interface subobject of the Impl object.

Note that dereferencing the pointer pBase already takes us to UB-land: the Standard doesn't specify what should happen. In this example implementation, if you call pBase->GetType(), the vtable_Interface* is used, which contains the SomeMethod entry, and that function is called. This function doesn't return anything, so in this example, nasal demons are summoned and take over the world. Or some value is taken from the stack as a return value.
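
For contrast, here is a sketch of two conversions that do have defined behaviour in this situation. It assumes, as in the question, that the object behind the Interface pointer really is an Impl; via_dynamic_cast and via_impl are illustrative names, not anything from the original code.

```cpp
struct Interface { virtual void SomeMethod() = 0; };
struct Base      { virtual int GetType() = 0; };

struct Impl : Base, Interface {
    void SomeMethod() override {}
    int GetType() override { return 1; }
};

// Cross-cast checked at run time; yields nullptr if the object
// has no Base subobject.
inline Base* via_dynamic_cast(Interface* p) { return dynamic_cast<Base*>(p); }

// Downcast to the most derived type, then let the implicit upcast
// adjust the address. Only valid if *p really belongs to an Impl object.
inline Base* via_impl(Interface* p) { return static_cast<Impl*>(p); }
```

Both helpers return the address of the Base subobject, so calling GetType() through the result dispatches to Impl::GetType().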

dyp
  • Thanks for the thorough answer, this makes a lot of sense! And in these lines: `Interface* pIntfc = pImpl;` or `Base* pBase = pImpl;`, we don't even have to use dynamic_cast, because we cast upwards the inheritance tree, which can be done implicitly, right? – Mark Vincze Aug 03 '13 at 21:51
  • Right: [conv.ptr]/3 "A prvalue of type “pointer to *cv* `D`”, where `D` is a class type, can be converted to a prvalue of type “pointer to *cv* `B`”, where `B` is a base class of `D`. If `B` is an inaccessible or ambiguous base class of `D`, a program that necessitates this conversion is ill-formed." – dyp Aug 03 '13 at 22:01

What is the exact reason for this?

The exact reason is that dynamic_cast is guaranteed to work in this situation by the standard, while the other kinds invoke undefined behaviour.

Does the C-style cast end up using a reinterpret_cast?

Yes, in this case it does. (A side note: never ever use a C-style cast).

Is it related to how polymorphic objects are aligned in memory?

I would say it is related to the way polymorphic objects that use multiple inheritance are laid out in memory. In a language with single inheritance, dynamic_cast would not be necessary, as the base subobject address would coincide with the derived object address. In the multiple-inheritance case this is not so, as there is more than one base subobject, and different base subobjects must have different addresses.

Sometimes the compiler can calculate the offset between each subobject's address and the derived object's address. If the offset is non-zero, the cast operation then becomes a pointer addition or subtraction instead of a no-op. (In the case of a virtual inheritance upcast, it's somewhat more complicated, but the compiler can still do that.)
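
The adjustment can be observed directly. The sketch below uses the question's hierarchy under shortened names (Interface, Base, Impl) and hypothetical helpers; it measures, in bytes, how far each base subobject sits from the start of the derived object. The actual numbers depend on the compiler and ABI, so none are assumed here.

```cpp
#include <cstddef>

struct Interface { virtual void SomeMethod() = 0; };
struct Base      { virtual int GetType() = 0; };

struct Impl : Base, Interface {
    void SomeMethod() override {}
    int GetType() override { return 1; }
};

// Byte distance from the start of the Impl object to one of its base subobjects.
template <class B>
std::ptrdiff_t offset_in_impl(Impl& impl) {
    B* sub = &impl;  // implicit upcast: the compiler adds the static offset
    return reinterpret_cast<char*>(sub) - reinterpret_cast<char*>(&impl);
}

// A static_cast downcast applies the same offset in the other direction.
inline Impl* back_to_impl(Interface* p) { return static_cast<Impl*>(p); }
```

The two offsets, offset_in_impl<Base>(impl) and offset_in_impl<Interface>(impl), differ, which is why a cast that leaves the address untouched cannot be right for both bases.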

There are at least two cases where the compiler cannot do that:

  1. Cross-cast (that is, between two classes neither of which is a base class of the other).
  2. Downcast from a virtual base.

In these cases dynamic_cast is the only way to cast.
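
A short sketch of the cross-cast case, including what happens when it fails. InterfaceOnly is a hypothetical class added for illustration: it implements Interface without deriving from Base. The helper names are likewise assumptions.

```cpp
#include <typeinfo>  // std::bad_cast

struct Interface { virtual void SomeMethod() = 0; };
struct Base      { virtual int GetType() = 0; };

struct Impl : Base, Interface {
    void SomeMethod() override {}
    int GetType() override { return 1; }
};

struct InterfaceOnly : Interface {
    void SomeMethod() override {}
};

// Pointer form: dynamic_cast returns nullptr when the object has no Base subobject.
inline int type_or_default(Interface* p, int fallback) {
    if (Base* b = dynamic_cast<Base*>(p)) return b->GetType();
    return fallback;
}

// Reference form: dynamic_cast throws std::bad_cast on failure.
inline bool has_base(Interface& r) {
    try {
        (void)dynamic_cast<Base&>(r);
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}
```

Because the check happens at run time, code that cannot guarantee every Interface also derives from Base can still handle the mismatch instead of running into undefined behaviour.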

n. m. could be an AI