This question has probably been asked many times, but I still cannot find a well-reasoned answer. Consider the following code:

struct A {virtual int vfunc() = 0;};
struct B {virtual ~B() {}};
struct C {void *cdata;};
//...
struct Z{};

struct Parent:
  public A,
  virtual B,
  private C,
  //...
  protected Z
{
  int data;
  virtual ~Parent(){}
  virtual int vfunc() {return 0;} // implements A::vfunc interface
  virtual void pvfunc() {};
  double func() {return 0.0;}
  //...etc
};

struct Child:
  public Parent
{
  virtual ~Child(){}
  int more_data;
  virtual int vfunc() {return 0;} // reimplements A::vfunc interface
  virtual void pvfunc() {};// implements Parent::pvfunc interface
};

template<class T>
struct Wrapper: public T 
{
 // do nothing, just empty
};

int main()
{
  Child ch;
  Wrapper<Child> &wr = reinterpret_cast<Wrapper<Child>&/**/>(ch);
  wr.data = 100;
  wr.more_data = 200;
  wr.vfunc();
  //some more usage of wr...
  Parent pr = wr;

  pr.data == wr.data; // true?
  //...

  return 0;
}

Basically, this shows a cast to a reference to the dummy child class Wrapper, followed by use of the members of its ancestor classes.

The question is: is this code valid according to the standard? If not, what exactly does it violate?

PS: Do not provide answers like "this is wrong on so many levels omg" and similar please. I need exact quotes from the standard proving the point.

Alexander G.
  • Is the beginning of your example, with multiple inheritance (public, virtual, private, protected), very specific to your question, or would your question have been the same with a simpler Parent/Child/Wrapper combination? – prog-fh Apr 01 '20 at 12:31
  • @prog-fh the example itself shows that there can be any kind of inheritance and virtual members in the base of ```class Wrapper```, so the question is about the general idea of casting to an empty child-class reference and using this reference as if nothing unusual happened. – Alexander G. Apr 01 '20 at 12:43
  • Too lazy to check for sure, but I am inclined to believe it is wrong. [reinterpret_cast](https://en.cppreference.com/w/cpp/language/reinterpret_cast) does not perform any CPU instructions, it just "interprets" the memory in a different way. You try to upcast a pointer that is of the wrong type, i.e. `dynamic_cast` will fail in this example. Even though my compiler didn't have any issues with your code, the standard [doesn't mention](https://stackoverflow.com/questions/21838292/empty-derived-optimization) anything about empty derived classes, which is why I think it's wrong in general. – pptaszni Apr 01 '20 at 13:10
  • @pptaszni so what? If the upcasting is wrong, or usage of upcasted reference is wrong, or downcasting is wrong then what exactly does it violate (with the quotes from standard please)? – Alexander G. Apr 01 '20 at 13:44
  • It’s really simple: the standard doesn’t define what your program would do when you do this. People implementing compilers can do whatever they feel like when they implement code generator that actually emits code for that C++ you wrote. If the pointer value stays within the function, optimizers may delete code that dereferences that cast pointer. Complex, practical template code may well depend on this optimization. I use a vendor clang version that does just that. Your code would compile to nothing. Just an example. – Kuba hasn't forgotten Monica Apr 01 '20 at 23:28

2 Answers

I surely hope this is something you are doing as an academic exercise. Please do not ever write any real code that resembles any of this in any way. I can't possibly point out all the issues with this snippet of code as there are issues with just about everything in here.

However, to answer the real question - this is complete undefined behavior. In C++17, it is section 8.2.10 [expr.reinterpret.cast]. Use the phrase in the brackets to get the relevant section for previous standards.


EDIT I thought a succinct answer would suffice, but more details have been requested. I will not mention the other code issues, because they will just muddy the water.

There are several key issues here. Let's focus on the reinterpret_cast.

Child ch;
Wrapper<Child> &wr = reinterpret_cast<Wrapper<Child>&/**/>(ch);

Most of the wording in the spec uses pointers, so based on 8.2.10/11, we will change the example code slightly to this.

Child ch;
Wrapper<Child> *wr = reinterpret_cast<Wrapper<Child>*>(&ch);

Here is the quoted part of the standard for this justification.

A glvalue expression of type T1 can be cast to the type “reference to T2” if an expression of type “pointer to T1” can be explicitly converted to the type “pointer to T2” using a reinterpret_cast. The result refers to the same object as the source glvalue, but with the specified type. [ Note: That is, for lvalues, a reference cast reinterpret_cast<T&>(x) has the same effect as the conversion *reinterpret_cast<T*>(&x) with the built-in & and * operators (and similarly for reinterpret_cast<T&&>(x)). — end note ] No temporary is created, no copy is made, and constructors (15.1) or conversion functions (15.3) are not called.
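
As a rough illustration of that note, here is a minimal sketch that compares the reference form and the pointer form of the cast. The types Base and Empty are hypothetical stand-ins for the question's Child and Wrapper<Child>, and the function name is made up. The sketch only compares addresses; it never reads or writes through the cast result.

```cpp
#include <cassert>

// Hypothetical stand-ins for the question's Child and Wrapper<Child>.
struct Base { int data = 0; };
struct Empty : Base {};

// Returns true when the reference form and the pointer form of the cast
// designate the same address. Nothing is accessed through the result.
bool reference_cast_matches_pointer_cast() {
    Base b;
    Empty* via_pointer = reinterpret_cast<Empty*>(&b);
    // Per the note, reinterpret_cast<Empty&>(b) behaves like *via_pointer.
    return &reinterpret_cast<Empty&>(b) == via_pointer;
}
```

This is why the rest of the discussion can talk about pointers only: whatever holds for the pointer cast holds for the reference cast as well.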

One subtle little part of the standard is 6.9.2/4 which allows for certain special cases for treating a pointer to one object as if it were pointing to an object of a different type.

Two objects a and b are pointer-interconvertible if:

(4.1) — they are the same object, or

(4.2) - one is a standard-layout union object and the other is a non-static data member of that object (12.3), or

(4.3) — one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, the first base class subobject of that object (12.2), or

(4.4) — there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.

If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast (8.2.10). [ Note: An array object and its first element are not pointer-interconvertible, even though they have the same address. — end note ]

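However, your case does not meet these criteria: no Wrapper<Child> object exists at all, and Child is not a standard-layout class because it has virtual functions and a virtual base. So we can't use this exception to treat a pointer to Child as if it were a pointer to Wrapper<Child>.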
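
For contrast, here is a small sketch of a case that rule (4.3) does cover. Plain, Polymorphic and read_first_member are hypothetical names chosen for illustration; Polymorphic is only shaped like the question's Parent.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical standard-layout type: no virtual functions, no virtual bases.
struct Plain {
    int first;
    double second;
};

// Hypothetical polymorphic type, shaped like the question's Parent.
struct Polymorphic {
    virtual ~Polymorphic() {}
    int data;
};

static_assert(std::is_standard_layout<Plain>::value,
              "Plain qualifies for rule (4.3)");
static_assert(!std::is_standard_layout<Polymorphic>::value,
              "a class with virtual functions is not standard-layout");

// Plain and its first member are pointer-interconvertible, so the cast
// yields a usable pointer to p.first.
int read_first_member(Plain& p) {
    int* first = reinterpret_cast<int*>(&p);
    return *first;
}
```

The static_asserts show why the question's hierarchy cannot rely on this rule: once virtual functions are involved, the class is no longer standard-layout.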

We will ignore the stuff about reinterpret_cast that does not deal with casting between two pointer types, since this case just deals with pointer types.

Note the last sentence of 8.2.10/1

Conversions that can be performed explicitly using reinterpret_cast are listed below. No other conversion can be performed explicitly using reinterpret_cast.

There are 10 paragraphs that follow.

Paragraph 2 says reinterpret_cast can't cast away constness. Not our concern.

Paragraph 3 says that the result may or may not produce a different representation.

Paragraphs 4 and 5 are about casting between pointers and integral types.

Paragraph 6 is about casting function pointers.

Paragraph 8 is about converting between function pointers and object pointers.

Paragraph 9 is about converting null pointer values.

Paragraph 10 is about converting between member pointers.

Paragraph 11 is quoted above and basically says that casting references is akin to casting pointers.

That leaves paragraph 7, which states.

An object pointer can be explicitly converted to an object pointer of a different type. When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)). [ Note: Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. — end note ]

This means that we can cast back and forth between those two pointer types all day long, but that is all we can safely do. You are doing more than that: you access members through the converted reference. There are a few exceptions that allow some other uses.
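
A minimal sketch of what paragraph 7 does guarantee, using two hypothetical types with identical alignment requirements (Child here is a simplified stand-in, and Unrelated and both function names are made up):

```cpp
#include <cassert>

// Hypothetical, simplified types with the same alignment requirement.
struct Child { int more_data = 0; };
struct Unrelated { int x; };

// Casting to another object pointer type and back yields the original value.
bool round_trip_restores_pointer() {
    Child ch;
    Child* original = &ch;
    Unrelated* carried = reinterpret_cast<Unrelated*>(original);
    Child* restored = reinterpret_cast<Child*>(carried);
    return restored == original;
}

// reinterpret_cast between object pointers is defined as two static_casts
// going through void*.
bool matches_double_static_cast() {
    Child ch;
    Unrelated* a = reinterpret_cast<Unrelated*>(&ch);
    Unrelated* b = static_cast<Unrelated*>(static_cast<void*>(&ch));
    return a == b;
}
```

Note that neither function ever dereferences the Unrelated pointer. The pointer is only carried around and converted back.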

Here is 6.10/8

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

(8.1) — the dynamic type of the object,

(8.2) — a cv-qualified version of the dynamic type of the object,

(8.3) — a type similar (as defined in 7.5) to the dynamic type of the object,

(8.4) — a type that is the signed or unsigned type corresponding to the dynamic type of the object,

(8.5) — a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,

(8.6) — an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),

(8.7) — a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,

(8.8) — a char, unsigned char, or std::byte type.

Your case does not satisfy any of those.
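
To make the list a bit more concrete, here is a sketch of two accesses it does permit: through a base class type (8.7) and through unsigned char (8.8). Parent and Child are simplified hypothetical versions of the question's classes, and the function names are made up.

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical versions of the question's classes.
struct Parent { int data = 0; };
struct Child : Parent { int more_data = 0; };

// (8.7): accessing a Child object through its base class type is allowed.
int read_through_base(Child& ch) {
    Parent& as_parent = ch;  // implicit derived-to-base conversion, no cast
    return as_parent.data;
}

// (8.8): inspecting the bytes of an object through unsigned char is allowed.
std::size_t count_nonzero_bytes(const int& value) {
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&value);
    std::size_t nonzero = 0;
    for (std::size_t i = 0; i < sizeof(value); ++i) {
        if (bytes[i] != 0) {
            ++nonzero;
        }
    }
    return nonzero;
}
```

Going from Child to Wrapper<Child> is the opposite direction of (8.7): Wrapper<Child> is a derived class of the dynamic type, not a base of it.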

In your case, you are taking a pointer to one type and forcing the compiler to pretend that it points to a different type. It does not matter how much alike the two look to your eyes. Did you know that a completely standard-conforming compiler does not have to put the data for a derived class after the data for a base class? Those details are NOT part of the C++ standard, but part of the ABI your compiler implements.

In fact, there are very few uses of reinterpret_cast, other than carrying a pointer around and then casting it back to its original type, that do not elicit undefined behavior.

Jody Hagins
  • It is obvious that the code snippet is tricky but the real question is what exactly is wrong with it? Could you provide the exact quotes from the standard to prove your point please? – Alexander G. Apr 01 '20 at 13:38
  • The code snippet is not tricky at all. Read the section I referenced. It is very obvious. – Jody Hagins Apr 01 '20 at 18:53
  • @AlexanderG. Updated with some more information, which is hopefully enough. – Jody Hagins Apr 01 '20 at 21:04
  • That's a great answer now. Thank you! "did you know that a completely standard conforming compiler does not have to put data for a derived class after the data for a base class?" - exactly, but does any compiler really have any additional internal data for a derived class with no actual data members (except for the v-table pointer, which would be identical in that case)? Anyway, thanks for the detailed answer. It seems that by the standard (assuming C++17) this case gets shady as soon as any member of the parent class is accessed with '.' – Alexander G. Apr 05 '20 at 18:29

As stated in another answer, this discussion relates to section 8.2.10 [expr.reinterpret.cast] of the C++17 standard.

Paragraph 11 of this section explains that references to objects can be reasoned about in the same way as pointers to objects.

Wrapper<Child> &wr = reinterpret_cast<Wrapper<Child>&/**/>(ch);
or
Wrapper<Child> *wr = reinterpret_cast<Wrapper<Child>*/**/>(&ch);

Paragraph 7 of this section explains that, for pointers to objects, reinterpret_cast can be seen as two static_casts in sequence (through void *).

In the specific case of this question, the type Wrapper<Child> actually inherits from Child, so a single static_cast should be sufficient (no need for two static_cast, nor reinterpret_cast).

So if reinterpret_cast can be seen here as the combination of a useless static_cast through void * and a correct static_cast, it should be considered equivalent to this correct static_cast.


hum...

On second thought, I think I'm totally wrong!
(the static_cast is incorrect, I have read it the wrong way)

If we had

Wrapper<Child> wc=...
Child *pc=&wc;
Wrapper<Child> *pwc=static_cast<Wrapper<Child>*>(pc);

the static_cast (then the reinterpret_cast) would be correct because it goes back to the original type.
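
Here is a small sketch of that valid direction, with a simplified hypothetical Child (only a virtual destructor and one data member) and made-up function names. It also shows a checked downcast with dynamic_cast, which reports whether the object really is a Wrapper<Child>.

```cpp
#include <cassert>

// Simplified, hypothetical version of the question's Child.
struct Child {
    virtual ~Child() {}
    int more_data = 0;
};

template <class T>
struct Wrapper : public T {};

// The object really is a Wrapper<Child>, so going back to the original
// type with static_cast is valid and members can be used.
int downcast_real_wrapper() {
    Wrapper<Child> wc;
    Child* pc = &wc;  // implicit upcast
    Wrapper<Child>* pwc = static_cast<Wrapper<Child>*>(pc);
    pwc->more_data = 200;
    return wc.more_data;
}

// A checked downcast: yields a null pointer when the object behind p is
// not actually a Wrapper<Child>.
bool is_wrapper(Child* p) {
    return dynamic_cast<Wrapper<Child>*>(p) != nullptr;
}
```

Calling is_wrapper on the address of a plain Child returns false, which matches the situation in the question: the object is a Child and was never a Wrapper<Child>.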

But in your example the original type was not Wrapper<Child> but Child.
Even if it is very unlikely, nothing forbids the compiler from adding some hidden data members to Wrapper<Child>.
Wrapper<Child> is not an empty structure: it participates in a hierarchy with dynamic polymorphism, and the compiler may use any solution under the hood.
So, after the reinterpret_cast, the behavior is undefined because the address stored in the pointer (or reference) points to bytes with the layout of Child, but the following code uses these bytes with the layout of Wrapper<Child>, which may be different.

prog-fh
  • Yeah...so what about the issues there? Why such upcasting is wrong, or usage of upcasting reference is wrong, or downcasting after upcasting is wrong? – Alexander G. Apr 01 '20 at 14:53
  • @AlexanderG. I tried to explain better, but I feel more and more confused, sorry... – prog-fh Apr 01 '20 at 15:15