
I noticed that if I use a C-style cast (or reinterpret_cast) in the code below, I get a segmentation fault, but if I use dynamic_cast, it works. Why is this? I'm sure that the pointer a also points to a B, because the Add method already makes sure the input is of type B.

Do I have to use dynamic_cast here even though I already guarantee that pointer a is of type B through my implementation?

Edit:

I do realize it's generally bad practice to use C-style casts (or reinterpret_cast). But in THIS particular case, why do they not work?

This comes up in practice when class B is an interface and class D is forced, for some reason, to store an A pointer. It seems dynamic_cast must be used here even though the implementation already guarantees that the stored object implements the interface.

#include <iostream>
#include <string>

using namespace std;

class A
{
    public:
    virtual ~A() = default;
};

class B
{
    public:
    virtual string F() = 0;
};

class C : public A, public B
{
    public:
    virtual ~C() = default;
    virtual string F() { return "C";}
};

class D
{
    public:

    D() : a(nullptr) {}

    void Add(B* b)
    {
        A* obj = dynamic_cast<A*>(b);
        if(obj != nullptr)
            a = obj;
    }

    B* Get()
    {
        return (B*)(a); // IF I USE DYNAMIC CAST HERE, IT'D BE OK
    }

    private:
    A* a;
};

int main()
{
    D d;
    d.Add(new C());

    B* b = d.Get();
    if(b != nullptr)
        cout << b->F();
}
curiousguy
jahithber
  • Because a C-style cast just pretends it's the `A*` object, whereas `dynamic_cast` does a check, which is returning `nullptr`, which you catch ... – ChrisMM Feb 07 '20 at 19:38
  • "if i use C style casting" - That's almost always a bad idea. It is difficult to know what transformation the cast will actually perform, and in the *worst* case it will degenerate into a `reinterpret_cast(const_cast` which is almost *never* what you want and pretty much guarantees undefined behaviour except for rare corner cases. Don't use C-style casts, ever! – Jesper Juhl Feb 07 '20 at 19:38
  • @JesperJuhl then could you explain why reinterpret_cast is a bad idea here? – jahithber Feb 07 '20 at 19:40
  • @ChrisMM It's not returning null in the posted code. It will work as intended when using dynamic_cast. The object is, in fact, a `B` derivation off `C`. The OP's question is why would that work, returning a non-null proper `B *`, but direct casting pukes. – WhozCraig Feb 07 '20 at 19:40
  • @ChrisMM please explain what you mean by "pretends". Since the object is an A type object, it doesn't need to "pretend", it just is. – jahithber Feb 07 '20 at 19:41
  • You may find this Q/A thread useful: [When should static_cast, dynamic_cast, const_cast and reinterpret_cast be used?](https://stackoverflow.com/q/332030/10871073). – Adrian Mole Feb 07 '20 at 19:46
  • May be read FAQ https://isocpp.org/wiki/faq/coding-standards#pointer-casts Actually whole FAQ is quite illuminating, I suppose. – Öö Tiib Feb 07 '20 at 19:49
  • Those answers "reinterpret_cast ...... turns one type directly into another — such as casting the value from one pointer to another," does NOT explain why this doesn't work. This is exactly what the code is doing, turning one pointer to another, but then why the converted pointer doesn't work? – jahithber Feb 07 '20 at 19:56
  • Because it is potentially a pointer to the wrong place. Looking for a good answer on how inheritance from multiple bases is laid out in memory. The basics are that `C` is an `A` and has all the `A` stuff in there as well as all the `B` stuff. `reinterpret_cast` takes a `C*` and uses it as a `B*`, but the `B` stuff may be in the `C` after the `A` stuff. You wind up trying to use the `A` stuff as if it's `B` stuff. – user4581301 Feb 07 '20 at 20:02
  • Someone made a great answer complete with pictures earlier this week and I can't find it. Here is a Dr. Dobb's article on it that should be helpful: https://www.drdobbs.com/cpp/multiple-inheritance-considered-useful/184402074 – user4581301 Feb 07 '20 at 20:07
  • @user4581301 that's an interesting read. – jahithber Feb 07 '20 at 20:15
  • @jahithber It doesn't work because there is no reason it would work, period. – curiousguy Feb 08 '20 at 21:30

4 Answers


tl;dr: C-style casts are sneaky and can easily introduce bugs.

So what's happening in this expression?

class A
{
    public:
    virtual ~A() = default;
};

class B
{
    public:
    virtual string F() = 0;
};

B* Get()
{
    return (B*)(a);
}

Notice that A and B are unrelated.

What if you used a proper static_cast instead?

B* Get()
{
    return static_cast<B*>(a);
}

You'll then see a proper diagnostic:

error: invalid 'static_cast' from type 'A*' to type 'B*'
            return static_cast<B*>(a);
                   ^~~~~~~~~~~~~~~~~~

Oh no.

Indeed, a C-style cast falls back to reinterpret_cast when a static_cast can't be done. So your code is equivalent to:

B* Get()
{
    return reinterpret_cast<B*>(a);
}

Which is not what you want. This is not the cast you're looking for.

The A subobject has a different address than the B subobject, mainly because each one needs room for its own vtable pointer.

What exactly is reinterpret_cast doing here?

Not a lot, really. It just tells the compiler to interpret the memory address passed to it as a pointer to another type. That only works if an object of the type you ask for actually lives at that address. In your case it does not: there's an A object at that address, and the B part of your object is elsewhere in memory.

A static cast will adjust the pointer to make sure it points to the right offset in memory for that type, and fail to compile if it can't compute the offset.

C* c = new C();
cout << c;
cout << "\n";

A* a = dynamic_cast<A*>(c);
cout << a;
cout << "\n";

B* b = dynamic_cast<B*>(c);
cout << b;
cout << "\n";

This will print something similar to:

0xbe3c20
0xbe3c20
0xbe3c28

What can you do then?

If you want to use static casts, you'll have to go through C since it's the only place the compiler can see the relationship between A and B:

B* Get()
{
    return static_cast<B*>(static_cast<C*>(a));
}

Or if you don't know if C is the runtime type of the object a is pointing to, then you must use dynamic_cast.
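Both options can be sketched side by side. This is only an illustration reusing the question's A, B, and C; the helper names get_static and get_dynamic are hypothetical, and the assert inside get_static is an added debug-only check of the "it really is a C" assumption, not something the original code had.

```cpp
#include <cassert>
#include <string>

class A { public: virtual ~A() = default; };
class B { public: virtual std::string F() = 0; };
class C : public A, public B {
public:
    std::string F() override { return "C"; }
};

// Hypothetical helper: unchecked in release builds. Only valid when `a`
// really points at the A subobject of a C; the compiler then applies the
// fixed offset between the A and B subobjects of C.
B* get_static(A* a) {
    // Debug-only sanity check of the assumption stated above.
    assert(a == nullptr || dynamic_cast<C*>(a) != nullptr);
    return static_cast<B*>(static_cast<C*>(a));
}

// Hypothetical helper: checked at run time; returns nullptr when the object
// that `a` points into has no B base.
B* get_dynamic(A* a) {
    return dynamic_cast<B*>(a);
}
```

For a real C object both helpers should return the same pointer. For a plain A object, get_dynamic returns nullptr, while calling get_static would be undefined behavior.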

Guillaume Racicot

Please allow me to start by quoting a few lines of the question's code to establish context.

A* a;
return (B*)(a);

Why does a C-style cast fail?

When casting pointers (to objects), a C-style cast has the same functionality as a reinterpret_cast plus the ability to cast away const and volatile. See below for why reinterpret_cast fails.

Why does a reinterpret_cast fail?

A reinterpret_cast tells the compiler to treat the expression as if it had the new type. The same bit pattern is used, just interpreted differently. This is a problem when dealing with your compound object.

The object in question is of type C, which is derived from both A and B. Neither A nor B has objects of zero size, which is a key factor. (The classes may look empty, but since they have virtual functions, each object of those classes contains a pointer to a virtual function table.) Here is one possible layout, where we assume the size of a pointer is 8:

----------------------------------
| C : | A : pointer to A's table |  <-- Offset 0
|     | B : pointer to B's table |  <-- Offset 8
----------------------------------

Your code starts with a pointer to C, which eventually gets stored as a pointer to A. With the above picture, these addresses happen to be numerically equal. So far, so good. Then you take this address and tell the compiler to believe it is a pointer to B, even though the B sub-object is offset by 8 bytes. So when you go to call b->F(), the program looks up the address of F in the virtual function table of A! Even if that happens to yield a valid function pointer, you are looking at a segmentation fault if the signature of that function does not match that of B::F. (In other words, expect a crash.)

On a more pedantic note, since A and B are unrelated types, using the pointer produced by your cast results in undefined behavior. The above merely explains what typically happens in this case, but technically the standard would allow the outcome "my computer exploded".

Why does a dynamic_cast work?

In short, dynamic_cast will add 8 to the pointer at the key time. What you are attempting is known as a "sidecast", which is one of the things dynamic_cast is designed to do. (It's 5b in cppreference's explanation of dynamic_cast.) The dynamic_cast will recognize that what a points to is really of type C (the most-derived type) and that C has an unambiguous base of type B. So the cast calculates the difference between the offsets of A and B within objects of C, and adjusts the pointer. The offset of B is 8, while the offset of A is 0, so the pointer is adjusted by 8-0, resulting in a valid pointer to B.

Once the pointer to B actually points to an object of type B, calling a virtual function of B works.

Using static_cast to go from C* to B* works similarly, but of course, you don't have a C* to work with in this case.
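The adjustment described above can be observed directly. The following is a rough sketch under the assumption of a typical layout; the helper names are hypothetical, and the actual offset is implementation-specific (8 is only what the layout pictured earlier would give).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class A { public: virtual ~A() = default; };
class B { public: virtual std::string F() = 0; };
class C : public A, public B {
public:
    std::string F() override { return "C"; }
};

// Hypothetical helper: byte distance from the A subobject to the B subobject
// inside the same C object. The value depends on the compiler's layout.
std::ptrdiff_t b_offset_from_a(C& c) {
    A* a = &c;  // implicit upcast
    B* b = &c;  // implicit upcast; the compiler adjusts the address
    return reinterpret_cast<const char*>(b) - reinterpret_cast<const char*>(a);
}

// Hypothetical helper: the sidecast from A* lands on the same address as a
// direct upcast from C* to B*.
bool sidecast_matches_upcast(C& c) {
    A* a = &c;
    B* viaDynamic = dynamic_cast<B*>(a);
    assert(viaDynamic != nullptr);  // C has an unambiguous public B base
    return viaDynamic == static_cast<B*>(&c);
}

// Hypothetical helper: reinterpret_cast performs no adjustment, so the
// resulting B* still holds the address of the A subobject. The pointer is
// only compared here, never dereferenced.
bool reinterpret_keeps_address(C& c) {
    A* a = &c;
    B* wrong = reinterpret_cast<B*>(a);
    return static_cast<void*>(wrong) == static_cast<void*>(a);
}
```

On mainstream compilers b_offset_from_a should return a nonzero value, which is exactly the amount that reinterpret_cast fails to add.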

JaMiT

C-style casts are very dangerous, which is why C++ has static_cast and reinterpret_cast (as well as dynamic_cast, which has no C counterpart).

reinterpret_cast is just as dangerous as a C-style cast: it will simply take the address in your B* and hand you the same address as an A*, which is NOT what you want.

static_cast requires that the source and destination types are related. You can't simply static_cast a B* to an A* because they are unrelated. However, it doesn't do any other checking; it just applies a simple fixed mathematical rule to the address.

You could static_cast to C* and then to A*. That is legal and safe so long as you are certain that your object is a C; otherwise it will go horribly wrong. Even if the other object also has an A and a B base, it may have other members or bases as well, so the two may sit at different offsets and the fixed math will give the wrong answer.

dynamic_cast effectively asks the object itself to help. The object's actual (most-derived) type is C, and its run-time type information knows about both the A and B bases. If the object had a different implementation type, that type's information would resolve the appropriate answer.
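To illustrate the "different implementation object" point, here is a small sketch. Class E and the functions to_b and demo are hypothetical additions, not part of the question; E simply lists the bases in the opposite order, so the offsets inside it are likely to differ from those inside C.

```cpp
#include <cassert>
#include <string>

class A { public: virtual ~A() = default; };
class B { public: virtual std::string F() = 0; };

class C : public A, public B {
public:
    std::string F() override { return "C"; }
};

// Hypothetical second implementation with the bases in the opposite order.
class E : public B, public A {
public:
    std::string F() override { return "E"; }
};

// The same call works for either implementation, because the offset is
// looked up from the object's run-time type rather than fixed at compile time.
B* to_b(A* a) {
    return dynamic_cast<B*>(a);
}

// Usage sketch: both objects resolve to their own B subobject.
void demo() {
    C c;
    E e;
    assert(to_b(&c)->F() == "C");
    assert(to_b(&e)->F() == "E");
}
```

By contrast, static_cast<B*>(static_cast<C*>(a)) applied to an A* that points into an E would be undefined behavior, since the fixed C offsets do not describe an E.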

Gem Taylor

I converted your code to a more minimalistic example by removing class D. This makes it easier to experiment with variations that will help to clarify what works and what doesn't.

In terms of exposition, I'll just say that reinterpret_cast is a low-level operation with very specific, and limited, utility. It exists for cases where you want to tell the compiler that you know something it doesn't know -- that a bit-wise value can be meaningfully reinterpreted as a different type. This can't be the case for your sideways-cast of class C since the multiple-inherited base types can't both exist at the same memory address.

#include <iostream>
#include <string>

using namespace std;

class A
{
    public:
    virtual ~A() {}
};

class B
{
    public:
    virtual string F() = 0;
};

class C : public A, public B
{
    public:
    virtual string F() { return "C";}
};

int main()
{

    C c;
    auto display = [&c](B* b) -> string
       {
       return ( dynamic_cast<C*>(b) == &c ) ? b->F() : "BROKEN";
       };

    B* b1 = static_cast<B*>(&c); // simple up conversion (cast optional)
    cout << "b1: " << display(b1) <<"\n";

    A* a2 = static_cast<A*>(&c); // simple up conversion (cast optional)
    B* b2 = dynamic_cast<B*>(a2); // dynamic sideways conversion
    cout << "b2: " << display(b2) <<"\n";

    A* a3 = reinterpret_cast<A*>(&c); // low-level conversion (happens to work)
    B* b3 = dynamic_cast<B*>(a3); // dynamic sideways conversion
    cout << "b3: " << display(b3) <<"\n";

    B* b4 = reinterpret_cast<B*>(&c); // low-level conversion (doesn't work)
    cout << "b4: " << display(b4) <<"\n";

    A* a5 = static_cast<A*>(&c);
    B* b5 = reinterpret_cast<B*>(&a5); // low-level conversion (doesn't work)
    cout << "b5: " << display(b5) << "\n";

}

Output

$ ./why-do-i-have-to-use-a-dynamic-cast-here.cpp 
b1: C
b2: C
b3: C
b4: BROKEN
Segmentation fault (core dumped)

The most common use-case for reinterpret_cast is a scenario where you want to manipulate or store values in terms of their internal representation -- even though this may be outside the scope of the formal C++ specification (and therefore not portable).

A familiar example would be reading data values from a binary file. std::istream::read requires a buffer of type char* so a reinterpret_cast is required in order to read into any other data type (as shown in the documented example).
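A minimal sketch of that binary I/O pattern follows. The helper names write_u32, read_u32, and round_trip are hypothetical, and an in-memory std::stringstream stands in for a real file. The bytes are written in the machine's native byte order, so the format is not portable across architectures.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: writes the raw bytes of `value` to the stream.
void write_u32(std::ostream& out, std::uint32_t value) {
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
}

// Hypothetical helper: reads sizeof(std::uint32_t) raw bytes into a value.
// Viewing an object's storage through char* is one of the uses that
// reinterpret_cast is meant for.
std::uint32_t read_u32(std::istream& in) {
    std::uint32_t value = 0;
    in.read(reinterpret_cast<char*>(&value), sizeof value);
    return value;
}

// Usage sketch: write a value to an in-memory buffer and read it back.
std::uint32_t round_trip(std::uint32_t value) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    write_u32(buffer, value);
    assert(buffer.good());  // the write is expected to succeed
    return read_u32(buffer);
}
```

Unlike the sideways cast in the question, this use never pretends that an object of a different class type lives at the address; it only exposes the bytes of an existing object.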


Related: Cannot dynamic_cast sideways

Brent Bradburn