I saw a CppCon presentation by Piotr Padlewski saying that the following is undefined behaviour:

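#include <new>  // placement new

struct Base { virtual int foo(); };
struct Derived : Base { int foo() override { return 2; } };

int test(Base* a){
  int sum = 0;
  sum += a->foo();  // first call: replaces *a with a Derived (see Base::foo)
  sum += a->foo();  // second call through the same pointer
  return sum;
}

int Base::foo(){
  new (this) Derived;  // reuses the storage of *this for a new Derived object
  return 1;
}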

Note: Assume sizeof(Base) == sizeof(Derived) and foo is virtual.

Obviously this is bad, but I'm interested in WHY it is UB. I do understand why accessing a pointer after `realloc` is UB, but he says that this is the same.

Related questions: "Is `new (this) MyClass();` undefined behaviour after directly calling the destructor?", where it says "ok if no exceptions", and "Is it valid to directly call a (virtual) destructor?", where it says `new (this) MyClass();` results in UB (contrary to the first question).

In "C++ Is constructing object twice using placement new undefined behaviour?" it says:

A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.

which again sounds like it is ok.
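
A minimal sketch of what the quoted paragraph permits, assuming a hypothetical `Tracker` type and global counters that are not part of the original question: the storage is reused without calling the first object's destructor, so that destructor simply never runs. The program stays well-defined only as long as nothing relies on the skipped destructor's side effects.

```cpp
#include <cassert>
#include <new>

// Hypothetical counters, used only to observe which special members run.
int constructed = 0;
int destroyed = 0;

struct Tracker {
    Tracker() { ++constructed; }
    ~Tracker() { ++destroyed; }  // non-trivial destructor
};

// Returns how many Tracker objects were created but never destroyed.
int count_skipped_destructors() {
    constructed = 0;
    destroyed = 0;

    alignas(Tracker) unsigned char buf[sizeof(Tracker)];

    Tracker* first = new (buf) Tracker;   // first object lives in buf
    (void)first;                          // not used after this point

    // Reusing the storage ends the first object's lifetime.
    // Its destructor is NOT called implicitly.
    Tracker* second = new (buf) Tracker;

    second->~Tracker();                   // only the second object is destroyed
    return constructed - destroyed;
}
```

Calling `count_skipped_destructors()` shows two constructions but only one destruction. With a destructor that releases a resource, the same pattern would leak it.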

I found another description of placement new in "Placement new and assignment of class with const member":

If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:

  • the storage for the new object exactly overlays the storage location which the original object occupied, and

  • the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and

  • the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and

  • the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).

This seems to explain the UB. But is it really true?

Doesn't this mean that I could not have a std::vector<Base>? Because I assume that, due to its pre-allocation, std::vector must rely on placement new and explicit constructor calls. And point 4 requires it to be the most derived type, which Base clearly isn't.
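
For contrast, here is a small sketch of a case where all four quoted conditions appear to hold, assuming a hypothetical `Counter` type with no const or reference members: the new object has the same type, exactly overlays the old one, and both are complete (most derived) objects. Under those assumptions the existing pointer can keep being used.

```cpp
#include <cassert>
#include <new>

// Hypothetical type for illustration: non-const, no const or reference
// members, trivially destructible.
struct Counter {
    int value;
    explicit Counter(int v) : value(v) {}
};

// Replaces *c with a new Counter in the same storage and reads it back
// through the original pointer.
int replace_and_read(Counter* c, int new_value) {
    new (c) Counter(new_value);  // same type, same storage, most derived object
    return c->value;             // the old pointer refers to the new object
}

int demo_same_type() {
    Counter c(1);                     // complete object of type Counter
    return replace_and_read(&c, 42);  // reads the replacement's value
}
```

The code in the question differs in exactly one of these conditions: the new object is a `Derived`, not a `Base`, so the old `Base*` does not automatically refer to it.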

Flamefire
  • In `std::vector` `Base` is definitely the most derived. There cannot be `Base`s that are actually `Derived`s. In your `new (this) Derived;` does `this` actually have enough memory to hold a `Derived`? – nwp Feb 09 '18 at 13:58
  • What happens if the `Derived` has extra members (its `sizeof` is larger)? That should give you an idea of what might happen when you do this placement new – YePhIcK Feb 09 '18 at 14:09
  • Also consider `foo` being a virtual method. The second call to `foo()` would actually call `Derived::foo` . – Markus Kull Feb 09 '18 at 15:01
  • Assume same sizes and virtual foo. I added that now. I expect `Derived::foo` to be called but it is UB – Flamefire Feb 09 '18 at 16:49

2 Answers

I believe Elizabeth Barrett Browning said it best. Let me count the ways.

  1. If Base isn't trivially destructible, we're failing to clean up resources.
  2. If sizeof(Derived) is larger than the size of the dynamic type of this, we're going to clobber other memory.
  3. If Base isn't the first subobject of Derived, then the storage for the new object won't exactly overlay the original storage, and you'd also end up clobbering other memory.
  4. If Derived is just a different type from the initial dynamic type, even if it's the same size, then the object that we're calling foo() on cannot be used to refer to the new object. The same is true if any of the members of Base or Derived are const qualified or are references. You'd need to std::launder any external pointers/references.

However, if sizeof(Base) == sizeof(Derived), and Derived is trivially destructible, Base is the first subobject of Derived, and you only actually have Derived objects... this is fine.
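
As a hedged sketch of point 4, assuming C++17 (for `std::launder`) and hypothetical `Base`/`Derived` definitions that stand in for the ones in the question: the old `Base*` is not reused after the storage is taken over by a `Derived`. Instead, a fresh pointer to the new object is obtained with `std::launder` (the pointer returned by the placement new expression would work just as well).

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-ins for the question's classes.
struct Base {
    virtual int foo() { return 1; }
    virtual ~Base() = default;
};
struct Derived : Base {
    int foo() override { return 2; }
};

int demo_launder() {
    // Raw storage large and aligned enough for either type
    // (Derived contains a Base subobject, so it is at least as demanding).
    alignas(Derived) unsigned char storage[sizeof(Derived)];

    Base* old_ptr = new (storage) Base;
    int sum = old_ptr->foo();   // calls Base::foo
    old_ptr->~Base();           // end the Base object's lifetime explicitly

    new (storage) Derived;      // a different most derived type, same storage

    // old_ptr does not automatically refer to the new object.
    // Obtain a valid pointer to the Derived that now lives in the storage.
    Derived* d = std::launder(reinterpret_cast<Derived*>(storage));
    Base* fresh = d;            // ordinary derived-to-base conversion
    sum += fresh->foo();        // calls Derived::foo

    d->~Derived();
    return sum;
}
```

Laundering the old `Base*` directly would additionally rely on the `Base` subobject sitting at the very start of `Derived` (point 3), which is why this sketch launders a `Derived*` instead.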

Barry
  • `std::launder` is still needed. – xskxzr Feb 09 '18 at 16:22
  • 1 & 2: assume that these do not apply, as that is obviously wrong. 3: same sizes -> why does the storage layout matter? 4: I don't really get that. The old object is destroyed. All that is left is some memory blob in an undefined state. I add a new object to this blob, properly initializing it via a ctor call. Why is this UB? Ideas: a) CPU execution order (out of order) "caches" const values, but you could `cast` them away anyway, right? b) Compiler optimization caches the old `foo` pointer/data and reuses it. So what does `std::launder` do exactly? And what is a "different type from the initial dynamic type"? – Flamefire Feb 09 '18 at 16:58
  • @Flamefire a) You can cast away const, but you cannot use the resulting lvalue to modify the object's value. b) This is exactly that: std::launder tells the compiler not to use this cached value. See http://eel.is/c++draft/intro.object#6 for the most derived object/class. – Oliv Feb 09 '18 at 23:20
  • so `struct A{const int a_; A(int a):a_(a){} }; A a(1); const_cast<int&>(a.a_) = 2;` is UB? – Flamefire Feb 11 '18 at 08:48
  • @Flamefire Yes. – Barry Feb 11 '18 at 14:38

Regarding your question

...Because I assume due to its pre-allocation std::vector must rely on placement-news and explicit ctors. And point 4 requires it to be the most-derived type which Base clearly isn't.

I think the misunderstanding comes from the term "most derived object" or "most derived type":

The "most derived type" of an object of class type is the class with which the object was instantiated, regardless of whether this class has further subclasses or not. Consider the following program:

#include <iostream>

struct A {
    virtual void foo() { std::cout << "A" << std::endl; }
};

struct B : public A {
    void foo() override { std::cout << "B" << std::endl; }
};

struct C : public B {
    void foo() override { std::cout << "C" << std::endl; }
};

int main() {

    B b;  // b is-a B, and it also contains an A (its base class subobject).
          // The most derived class of b is, however, B, and not A and not C.
}

When you now create a vector<B>, the elements of this vector will be instances of class B, so the most derived type of the elements will always be B, and not C (or, in your case, Derived).

Hope this sheds some light on it.
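
A small sketch to make this observable, assuming a `name()` member in place of the printing `foo()` above so that the result can be checked: even when a `C` is pushed into a `std::vector<B>`, only its `B` part is copied (slicing), so the stored element's most derived type is `B`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same hierarchy shape as above; name() is a hypothetical replacement
// for foo() that returns the class name instead of printing it.
struct A {
    virtual std::string name() const { return "A"; }
    virtual ~A() = default;
};
struct B : A {
    std::string name() const override { return "B"; }
};
struct C : B {
    std::string name() const override { return "C"; }
};

std::string name_of_first_element() {
    std::vector<B> v;
    v.push_back(C{});              // constructs a B from the B part of the C

    const A& as_base = v.front();  // view the element through a base reference
    return as_base.name();         // virtual dispatch picks the most derived type
}
```

The virtual call resolves to `B::name`, not `C::name`, because the object that lives in the vector's storage was constructed as a complete `B`.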

Stephan Lechner