This question originated in the comment section of this thread, where it was also answered. However, I think it is too important to be left buried in the comments, so I made this Q&A for it.

Placement new can be used to initialize objects in already-allocated storage, e.g.,

#include <new>     // operator new, placement new
#include <vector>

using vec_t = std::vector<int>;
auto p = (vec_t*)operator new(sizeof(vec_t)); // raw storage only, no vec_t yet
new(p) vec_t{1, 2, 3}; // initialize a vec_t at p

According to cppref,

Placement new

If placement_params are provided, they are passed to the allocation function as additional arguments. Such allocation functions are known as "placement new", after the standard allocation function void* operator new(std::size_t, void*), which simply returns its second argument unchanged. This is used to construct objects in allocated storage [...]

That means new(p) vec_t{1, 2, 3} simply returns p, and p = new(p) vec_t{1, 2, 3} looks redundant. Is it really OK to ignore the return value?

Lingxi

1 Answer

Ignoring the return value is not OK, either pedantically or practically.

From a pedantic point of view

For p = new(p) T{...}, p qualifies as a pointer to an object created by a new-expression. That does not hold for a bare new(p) T{...}, even though the pointer value is the same. In the latter case, p only qualifies as a pointer to allocated storage.

The non-allocating global allocation function returns its argument with no side effect implied, but a new-expression (placement or not) always returns a pointer to the object it creates, even if it happens to use that allocation function.
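A minimal sketch of that distinction, assuming a hypothetical helper named same_address_after_placement_new that is not part of the original answer. It only shows that the two pointers hold the same address. It cannot show the difference in what they formally point to, which is the whole point of the argument above.

```cpp
#include <cassert>
#include <new>
#include <vector>

using vec_t = std::vector<int>;

// Hypothetical helper: returns true when the pointer yielded by the
// new-expression holds the same address as the storage it was given.
bool same_address_after_placement_new() {
    void* storage = operator new(sizeof(vec_t));  // raw storage, no vec_t yet
    vec_t* obj = new (storage) vec_t{1, 2, 3};    // obj points to the new vec_t
    bool same = (static_cast<void*>(obj) == storage);
    assert(obj->size() == 3);   // access goes through the returned pointer
    obj->~vec_t();              // end the object's lifetime
    operator delete(storage);   // release the raw storage
    return same;
}
```

Every access to the vector here goes through obj, the result of the new-expression, so the code never relies on storage being a usable pointer to a vec_t.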

Per cppref's description about the delete-expression (emphasis mine):

For the first (non-array) form, expression must be a pointer to an object type or a class type contextually implicitly convertible to such pointer, and its value must be either null or pointer to a non-array object created by a new-expression, or a pointer to a base subobject of a non-array object created by a new-expression. If expression is anything else, including if it is a pointer obtained by the array form of new-expression, the behavior is undefined.

Failing to write p = new(p) T{...} therefore makes delete p undefined behavior.

From a practical point of view

Technically, without p = new(p) T{...}, p does not point to the newly-initialized T, despite the fact that the value (memory address) is the same. The compiler may therefore assume that p still refers to the T that was there before the placement new. Consider the code

p = new(p) T{...}; // (1)
...
new(p) T{...}; // (2)

Even after (2), the compiler may assume that p still refers to the old object initialized at (1) and optimize on that basis. For example, if T had a const member, the compiler might cache its value at (1) and keep using it after (2).

p = new(p) T{...} effectively prohibits this assumption. Another way is to use std::launder(), but it is easier and cleaner to just assign the return value of placement new back to p.
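A minimal sketch of the std::launder() alternative, assuming C++17 and two hypothetical names that are not part of the original answer: a Widget type with a const member, and a read_id_after_reuse function. The second placement new deliberately discards its result, and the new object is then reached through a laundered pointer to the buffer.

```cpp
#include <cassert>
#include <new>

// Hypothetical type with a const member, the case where a stale pointer
// is most likely to be optimized on.
struct Widget {
    const int id;
};

// Hypothetical function: reuses the same buffer for a second Widget and
// reads it back through std::launder instead of through the old pointer.
int read_id_after_reuse() {
    alignas(Widget) unsigned char buf[sizeof(Widget)];

    Widget* p = new (buf) Widget{1};  // (1) pointer taken from the new-expression
    assert(p->id == 1);

    new (buf) Widget{2};              // (2) result ignored, p is not refreshed

    // Obtain a pointer to the object that now lives in buf.
    Widget* q = std::launder(reinterpret_cast<Widget*>(buf));
    return q->id;
}
```

This form is mainly useful when storing the returned pointer is impractical. When a pointer variable is kept anyway, assigning the result of placement new back to it stays the simpler option.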

Something you may do to avoid the pitfall

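#include <new>      // placement new
#include <utility>  // std::forward

// Direct-initialize a T at p and rebind p to the new object.
template <typename T, typename... Us>
void init(T*& p, Us&&... us) {
  p = new(p) T(std::forward<Us>(us)...);
}

// List-initialize a T at p and rebind p to the new object.
template <typename T, typename... Us>
void list_init(T*& p, Us&&... us) {
  p = new(p) T{std::forward<Us>(us)...};
}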

These function templates always set the pointer internally. With std::is_aggregate available since C++17, the solution can be improved by automatically choosing between () and {} syntax based on whether T is an aggregate type.
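One way that improvement might look, as a sketch assuming C++17. The names init_auto, Point, demo_point_sum, and demo_vector_size are hypothetical and do not come from the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper: uses {} for aggregates and () for everything else,
// and always rebinds p to the newly created object.
template <typename T, typename... Us>
void init_auto(T*& p, Us&&... us) {
    if constexpr (std::is_aggregate_v<T>) {
        p = new (p) T{std::forward<Us>(us)...};
    } else {
        p = new (p) T(std::forward<Us>(us)...);
    }
}

struct Point { int x; int y; };  // an aggregate, so the {} branch is taken

int demo_point_sum() {
    auto p = static_cast<Point*>(operator new(sizeof(Point)));
    init_auto(p, 3, 4);          // Point{3, 4}
    int sum = p->x + p->y;
    p->~Point();
    operator delete(p);
    return sum;
}

std::size_t demo_vector_size() {
    using vec_t = std::vector<int>;
    auto p = static_cast<vec_t*>(operator new(sizeof(vec_t)));
    init_auto(p, 3, 7);          // vec_t(3, 7): three elements, each 7
    assert((*p)[0] == 7);
    std::size_t n = p->size();
    p->~vec_t();
    operator delete(p);
    return n;
}
```

Choosing () for non-aggregates avoids the initializer-list constructor being picked by surprise. With {}, the vector call above would create two elements, 3 and 7, rather than three copies of 7.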

Lingxi
  • I'm by no means an expert at all this pointer alchemy, but nor am I convinced by the limited reasoning and lack of references to other discussions here. It doesn't stand to reason to me that what you quote, _'a pointer to a thing created by `new`'_ is necessarily the same thing as what you conclude, _'a pointer returned by `new`'_; I think the _"created by"_ grammatically binds to the object, not the pointer. If you didn't take the return value, `p` would still point to the object, since `new` by definition had to create the object at `p`. – underscore_d Mar 30 '18 at 10:17
  • @underscore_d You've got a point. We need a language lawyer here. I will update the answer to mention this concern. Hopefully, some expert could notice and explain. – Lingxi Mar 30 '18 at 11:22
  • My thinking also applies to your practical PoV; if all `new` does is return the same `p`, I can't immediately see a wording reason that assigning that back to itself changes semantics at all; it's not clear to me that this is needed in non-`launder`ing cases, nor a sufficient substitute for `launder` in cases where the latter would be needed. – underscore_d Mar 30 '18 at 15:46
  • @underscore_d As I understand it, the result of the placement `operator new` is unchanged, but the result of the `new` expression is a pointer to the newly-constructed object. See [[expr.new]/1](https://timsong-cpp.github.io/cppwp/expr.new#1.sentence-7). That should at least make it sufficient for the `launder`-needed case. – Daniel H Mar 30 '18 at 22:30
  • Arg, I messed up the link syntax and didn’t double-check in the time limit. At least it’s still understandable. – Daniel H Mar 30 '18 at 22:40
  • The non-allocating global allocation function returns its argument, but a *new-expression* always returns a pointer to the object it creates, even if it happens to use that allocation function. A pointer in the C++ abstract machine has [certain possible values](https://timsong-cpp.github.io/cppwp/basic.compound#3). Its value doesn't magically change [outside of certain limited cases not applicable here](https://timsong-cpp.github.io/cppwp/basic.life#8). The original `p` pointed to some allocated storage, not a `T` object, and cannot be used to access one. – T.C. Mar 31 '18 at 01:01
  • Practically speaking, if the allocation function is opaque and `T` has no const/reference members, the implementation may not be able to prove that `p` didn't actually point to a `T` object when returned from the `operator new` call, and since you could use the original pointer if it did point to such an object, you can probably get away with it in such a case. – T.C. Mar 31 '18 at 01:07
  • @T.C. How nice if the placement new could take a reference to pointer, and set the pointer internally before returning (something like [this](https://github.com/Lingxi-Li/lock_free/blob/025d10c5c36d4177d02a5b9b3222a0cdc9e81eb7/lf/memory.hpp#L27)). And then the pitfall is avoided. – Lingxi Mar 31 '18 at 02:30
  • @T.C. I have updated the answer with new information provided in the comment section. I'm not a language lawyer. As such, the words I wrote may be inaccurate or mis-leading. Feel free to edit whatever you see fit. – Lingxi Mar 31 '18 at 02:46
  • I'd like to point out that sometimes storing the pointer is impractical, so I think `launder` is probably a better solution there. For example, if the buffer is a class member, you can get a pointer to it by taking the address of that member. Storing the return value of placement `new` in another class member would be a waste of space. – nog642 Apr 15 '22 at 15:56