10

In the general case, it is (very deservedly) undefined behavior to downcast from a (dynamic) Base object to one of its deriving classes, Derived.

The obvious UB

class Base
{
public:
    virtual void foo()
    { /* does something */ }

    int a;
};

class Derived : public Base
{
public:
    virtual void foo()
    { /* does something different */ }

    double b;
};

Base obj;
Derived derObj = *static_cast<Derived *>(&obj);  // <- here come the demons

With the way compilers currently implement this, there would obviously be at least two problems: inconsistent vtable contents, and b containing garbage values. So it makes sense that the standard does not define the behavior of a downcast under those conditions.

The not so obvious naive case

Yet I was curious to know whether there are some concessions to this rule in specific cases. For example:

class Base
{
public:
    void foo()
    { /* does something */ }

    int a = 1;
    double b = 2.;
};

class DerivedForInt : public Base
{
public:
    int getVal()
    { return a; }
};

Base obj;
DerivedForInt derObj = *static_cast<DerivedForInt *>(&obj);  // <- still UB?

Here we can easily imagine the compiler doing the right thing. But from the standard's perspective, is it still undefined?

Edit: static_cast is an arbitrary choice for illustration purposes; the question is just as interesting with other casts!

Ad N
  • It's still undefined behaviour. `obj` isn't a `DerivedForInt`. – Simple Nov 28 '13 at 10:47
  • @Simple It would indeed make a lot of sense for the standard not to make any exception. If you post it as an answer (possibly with some references or a standard extract confirming it), it would make for a very nice accepted answer ;) – Ad N Nov 28 '13 at 10:57
  • Ok, despite there being 2 answers stating it has to be undefined behaviour, I have a slightly different opinion. As long as both of your classes have standard layout and your derived class does not add new fields, this should actually work. At least you could reinterpret_cast according to these sources: http://stackoverflow.com/questions/4178175/what-are-aggregates-and-pods-and-how-why-are-they-special/7189821#7189821 http://stackoverflow.com/questions/8864311/pods-and-inheritance-in-c11-does-the-address-of-the-struct-address-of-the or am I missing a point? – user1781290 Nov 28 '13 at 11:07
  • @user1781290 That is very interesting! It deserves to stand as a full-blown answer if you ask me ;) – Ad N Nov 28 '13 at 11:08

3 Answers

8

Ok, I'll probably get torn to pieces for this answer...

Obviously, as the other answers state, this is undefined behaviour according to the standard. But if your Base class has standard layout and your DerivedForInt class does not add new data members, it will have the same (standard) layout.

Under these conditions your cast should cause no trouble, even though it is UB. According to one of the sources, it is at least safe to do a

DerivedForInt *derived = reinterpret_cast<DerivedForInt*>(&base.a);

Sources:

What are Aggregates and PODs and how/why are they special?

PODs and inheritance in C++11. Does the address of the struct == address of the first member?

From the second link:

Here's the definition, from the standard section 9 [class]:

A standard-layout class is a class that:

  • has no non-static data members of type non-standard-layout class (or array of such types) or reference,
  • has no virtual functions (10.3) and no virtual base classes (10.1),
  • has the same access control (Clause 11) for all non-static data members,
  • has no non-standard-layout base classes,
  • either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
  • has no base classes of the same type as the first non-static data member.

And the property you want is then guaranteed (section 9.2 [class.mem]):

A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.

This is actually better than the old requirement, because the ability to reinterpret_cast isn't lost by adding non-trivial constructors and/or destructor.
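To make the quoted guarantee concrete, here is a minimal sketch that checks the standard-layout property at compile time and exercises only the conversion the quote actually covers (object to initial member and back). The helper names `firstMember` and `fromFirstMember` are hypothetical, introduced for illustration; the classes mirror the ones in the question, written as `struct` so the members are public.

```cpp
#include <cassert>
#include <type_traits>

// Same shape as the classes in the question.
struct Base {
    void foo() {}
    int a = 1;
    double b = 2.;
};

struct DerivedForInt : Base {
    int getVal() { return a; }
};

// Both checks pass: DerivedForInt adds no non-static data members.
static_assert(std::is_standard_layout<Base>::value,
              "Base must be standard-layout");
static_assert(std::is_standard_layout<DerivedForInt>::value,
              "DerivedForInt must be standard-layout");

// Hypothetical helper: pointer to the object -> pointer to its initial member.
int* firstMember(Base& obj) {
    return reinterpret_cast<int*>(&obj);
}

// Hypothetical helper: the "vice versa" direction from the quote.
Base* fromFirstMember(int* p) {
    return reinterpret_cast<Base*>(p);
}
```

Note that this sketch stays within what the quoted paragraph covers. It does not show that treating a `Base` object as a `DerivedForInt` is defined; that part remains the argument of this answer, not something the static_asserts prove.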

user1781290
  • Thank you for taking the risk ; ) I actually believe your answer could lead to a constructive discussion ! To help us, could you please extract a citation from your sources that allows you to make the statement regarding safeness of the `reinterpret_cast` ? – Ad N Nov 28 '13 at 11:33
  • Brave answer, and definitely one which adds to the discussion. Now, while I see what you mean, my issue with this cast is that I don't see a situation where it is relevant. The only reason to prefer that derived class over the base would be to use the member functions defined in the derived class. If this is what you want, using the base class and pretending you don't seems to hint at a design flaw which most likely could be rectified. – Agentlien Nov 28 '13 at 11:34
  • @Agentlien It is definitely not a common case. It could be interesting to derive from a library class and add some accessors, I assume – user1781290 Nov 28 '13 at 11:36
  • @user1781290 Yes, I considered that. However, in that case I think I'd prefer either using composition or defining the functions as free functions, rather than member functions. – Agentlien Nov 28 '13 at 11:37
  • Citation is edited in now. @Agentlien That is probably a question of taste in most cases? – user1781290 Nov 28 '13 at 11:51
  • From the citation you provided, the only thing that I can see that would be defined is: `Base base; int *a = reinterpret_cast<int *>(&base)` and then (the vice versa) `Base *b = reinterpret_cast<Base *>(a)`. I do not see anything regarding class inheritance. – Ad N Nov 28 '13 at 12:05
  • That is correct. But if your `Base` is standard layout-conformant, your `Derived` will also be if you don't add new data members (as it does not violate any of the points for standard layout). Therefore both classes should have the same memory layout, making the access possible – user1781290 Nov 28 '13 at 12:10
  • @Agentlien Composition or free functions do not give you access to protected members, which inheritance does (I think that is what is meant by 'adding accessors'). But this is definitely not our use case: we are trying to do things in this direction to emulate polymorphism with value semantics. Your proposal for a conversion ctor in `Derived` taking a `Base` could be what we actually need. – Ad N Nov 28 '13 at 12:14
  • @user1781290 Indeed, I was missing the central part of the reasoning, regarding the standard-layout's implications ! (minor note : I think you forgot to take the address of the `a` member in your code example, and SO would not let me make an edit with a single char difference). – Ad N Nov 28 '13 at 12:20
  • Note that this relies on the use of `reinterpret_cast` instead of `static_cast` which the question assumed. – MSalters Nov 28 '13 at 17:14
  • @MSalters Because of the standard-layout of both classes, `static_cast` should work as well – user1781290 Nov 28 '13 at 18:25
  • @user1781290 why there should be UB if both `Base` and `DerivedForInt` are standard-layout? – Konstantin Oznobihin Dec 11 '13 at 12:12
  • @KonstantinOznobihin According to the other answers, this is UB by the C++ standard. But I cannot see a case where it would not work – user1781290 Dec 11 '13 at 12:46
  • @user1781290 oh, I see, these are quotations from C++2003 standard (which doesn't have a notion of standard-layout type, BTW), but there should be no UB for C++11. – Konstantin Oznobihin Dec 11 '13 at 13:04
3

n3376 5.2.9/11

A prvalue of type “pointer to cv1 B,” where B is a class type, can be converted to a prvalue of type “pointer to cv2 D,” where D is a class derived (Clause 10) from B if a valid standard conversion from “pointer to D” to “pointer to B” exists (4.10), cv2 is the same cv-qualification as, or greater cv-qualification than, cv1, and B is neither a virtual base class of D nor a base class of a virtual base class of D. The null pointer value (4.10) is converted to the null pointer value of the destination type.

If the prvalue of type “pointer to cv1 B” points to a B that is actually a subobject of an object of type D, the resulting pointer points to the enclosing object of type D. Otherwise, the result of the cast is undefined.

Since &obj does not point to a Base that is a subobject of a DerivedForInt, it's UB.
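As an illustration of where the line falls, here is a small sketch, assuming a polymorphic base (a virtual destructor is added so that `dynamic_cast` can be used). The names `isDerived` and `readB` are hypothetical. `static_cast` is only well defined when the object really is a `Derived`; `dynamic_cast` lets you check that at run time.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;  // makes Base polymorphic, required for dynamic_cast
    int a = 1;
};

struct Derived : Base {
    double b = 2.0;
};

// Returns true only when p points to the Base subobject of a Derived.
bool isDerived(Base* p) {
    return dynamic_cast<Derived*>(p) != nullptr;
}

// Precondition: r refers to the Base subobject of a Derived object.
// Under that precondition the static_cast is well defined (5.2.9/11);
// without it, the behaviour is undefined.
double readB(Base& r) {
    return static_cast<Derived&>(r).b;
}
```

With a plain `Base` object, `isDerived` reports false and `readB` must not be called; with a `Derived` object, both are fine.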
ForEveR
2

This is still undefined behaviour and I believe it should be.

Why it is undefined

As provided by @ForEveR in his answer:

n3376 5.2.9/11

A prvalue of type “pointer to cv1 B,” where B is a class type, can be converted to a prvalue of type “pointer to cv2 D,” where D is a class derived (Clause 10) from B

...

If the prvalue of type “pointer to cv1 B” points to a B that is actually a subobject of an object of type D, the resulting pointer points to the enclosing object of type D. Otherwise, the result of the cast is undefined.

Why it should be undefined

It would only work for POD types, since adding a virtual function to your base is enough for this to hurt you in all compilers I know of. Also, the difference between types may be conceptual, not just in their data layout. Type safety is just as much about providing strong abstractions as it is about preventing issues with data representation.

If you would like something like this, it seems better to provide it as an ordinary function or to add a constructor in the derived class which takes an instance of the base class.
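A minimal sketch of both alternatives, reusing the class names from the question (written as `struct` so the members are public). The exact constructor signature and the free function `getVal` are assumptions made for illustration, not the asker's code.

```cpp
#include <cassert>

struct Base {
    void foo() {}
    int a = 1;
    double b = 2.;
};

struct DerivedForInt : Base {
    DerivedForInt() = default;

    // Converting constructor: copies the Base state into a real DerivedForInt,
    // so no downcast of a plain Base object is ever needed.
    explicit DerivedForInt(const Base& base) : Base(base) {}

    int getVal() const { return a; }
};

// Free-function alternative: works on any Base (or class derived from it)
// without introducing a derived type at all.
int getVal(const Base& base) {
    return base.a;
}
```

The constructor version costs a copy of the base object; the free function avoids the copy but only sees public members.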

Agentlien