Why introduce `std::launder` rather than have the compiler take care of it?

Question

I've just read

and frankly, I am left scratching my head.

Let's start with the second example in @NicolBolas' accepted answer:

aligned_storage<sizeof(int), alignof(int)>::type data; 
new(&data) int; 
int *p = std::launder(reinterpret_cast<int*>(&data)); 
[basic.life]/8 tells us that, if you allocate a new object in the storage of the old one, you cannot access the new object through pointers to the old. std::launder allows us to side-step that.
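To make the quoted snippet concrete, here is a minimal self-contained sketch of the same pattern. The helper name `read_through_launder`, the stored value 42, and the use of an `alignas` byte array in place of `aligned_storage` are assumptions for illustration, not part of the linked answer.

```cpp
#include <new>  // placement new and std::launder (C++17)

// Hypothetical helper: creates an int inside raw storage and reads it back.
int read_through_launder() {
    // Raw, suitably aligned storage; stands in for the aligned_storage buffer.
    alignas(int) unsigned char data[sizeof(int)];

    // Begin the lifetime of an int inside the buffer.
    new (&data) int(42);

    // Under the object model discussed here, the cast alone does not make the
    // pointer refer to the new int; std::launder yields a pointer to the int
    // that now lives at that address.
    int* p = std::launder(reinterpret_cast<int*>(&data));
    return *p;
}
```

The pointer returned by the placement-new expression itself could also be kept and used directly; `std::launder` matters when only the address of the storage is at hand.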

So, why not just change the language standard so that accessing data through a reinterpret_cast<int*>(&data) is valid? In real life, money laundering is a way to hide reality from the law. But we don't have anything to hide - we're doing something perfectly legitimate here. So why can't the compiler just change its behavior to its std::launder() behavior when it notices we're accessing data this way?

On to the first example:

X *p = new (&u.x) X {2};
Because X is trivial, we need not destroy the old object before creating a new one in its place, so this is perfectly legal code. The new object will have its n member be 2.

So tell me... what will u.x.n return?

The obvious answer will be 2. But that's wrong, because the compiler is allowed to assume that a truly const variable (not merely a const&, but an object variable declared const) will never change. But we just changed it.
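The quoted line depends on declarations the excerpt leaves out. The sketch below assumes `X` is a struct with a single `const int n` member and `U` is a union holding an `X`; those definitions and both function names are reconstructions for illustration. It shows the two access paths that remain well-defined after the placement new: the pointer returned by `new`, and a laundered pointer to the member.

```cpp
#include <new>  // placement new and std::launder (C++17)

// Assumed definitions: the question only quotes the placement-new line.
struct X { const int n; };
union U { X x; float f; };

// Reads the new object through the pointer returned by placement new.
int via_new_pointer() {
    U u = {{1}};
    X* p = new (&u.x) X{2};
    return p->n;
}

// Reads the new object through a laundered pointer to the const member.
int via_launder() {
    U u = {{1}};
    new (&u.x) X{2};
    return *std::launder(&u.x.n);
}
```

Reading `u.x.n` directly is deliberately left out of the sketch, since that is the access whose validity is in question.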

So why not make the compiler not be allowed to make the assumption when we write this kind of code, accessing the constant field through the pointer?

Why is it reasonable to have this pseudo-function for punching a hole in formal language semantics, rather than setting the semantics to what they need to be depending on whether or not the code does something like in these examples?

I would guess that `launder` disables optimizations, in a very small scope. — dyp, Feb 12 '21 at 18:08
"*So why not make the compiler not be allowed to make the assumption when we write this kind of code?*" You mean, besides the obvious fact that the compiler would have to generate sub-optimal code 99% of the time? Most people *don't* go around recreating objects in storage, so assuming that `u.x.n` doesn't change is entirely reasonable. — Nicol Bolas, Feb 12 '21 at 18:17
What's the issue with the `reinterpret_cast` on `aligned_storage` though, @NicolBolas ? — dyp, Feb 12 '21 at 18:19
@dyp: Because pointers are not just addresses; they point to a specific object, and casting types doesn't change what *object* they point to (usually). When you have a derived class and you cast a pointer to it to one of its base class types, you still have a pointer to the derived class. That's the object model foundation that virtual functions are built on (among other concepts). — Nicol Bolas, Feb 12 '21 at 18:24
@NicolBolas " the obvious fact that the compiler would have to generate sub-optimal code 99% of the time?" No it wouldn't - it would only identify those cases similar to the examples you gave, and in the 99% of the cases where they don't occur, it will do what it usually does. Edited the question accordingly. — einpoklum, Feb 12 '21 at 18:37
@NicolBolas After the placement-new, the pointer to `data` doesn't point to the `int`, it points to nothing? Hence we need to launder it (after type conversion to satisfy launder's requirements for some reason) to make it point to an object again? — dyp, Feb 12 '21 at 19:15
@einpoklum *"it would only identify those cases similar to the examples you gave"* I believe that's quite hard. You can do `struct foo{ int const x; }; foo f; some_func(&f); cout << f.x;` and the compiler now must assume that `some_func` has modified `f.x`. — dyp, Feb 12 '21 at 19:16
@dyp: Guess what? It has to assume that [already](https://godbolt.org/z/eEc7fq). — einpoklum, Feb 12 '21 at 19:25
`std::launder` doesn't punch a hole in language semantics. It creates language semantics that otherwise C++ never had. Which yes, leaves the question of why implementations aren't just always required to act as if pointers representing the same address always point at the same object. — aschepler, Feb 12 '21 at 19:34
@einpoklum: "*It has to assume that already.*" You're mistaking compiled code for what the standard *requires* to happen. There is nothing in the standard that explicitly requires compilers to assume anything of the sort. — Nicol Bolas, Feb 12 '21 at 19:38
@einpoklum Well now I'm quite bamboozled :D But at least, it disables devirtualization: https://godbolt.org/z/nzhjTT — dyp, Feb 12 '21 at 19:38
@NicolBolas: Are you saying that example is just a missed optimization? I was under the impression that a `const` qualifier on a value passed to a function is not a guarantee to the compiler, but a safety precaution to let the compiler prevent you, the author, from changing it. — einpoklum, Feb 12 '21 at 19:55
@einpoklum That depends on whether the `const` is present at the object definition. A function that gets a `const type&` reference certainly has no guarantees, but a `const int n = 3;` really is guaranteed not to change. Less sure about this example with a const member in a non-const struct, but I would think an optimization would be valid. — aschepler, Feb 12 '21 at 21:01
@aschepler: The meaning of a `const` qualifier in a top-level object is clear, but the meaning of such a qualifier in struct fields is less clear. It may be useful for a compiler to assume that a nested object won't be changed *except by replacing the parent object*, but const-qualified members would be of little use if they made it impossible to do anything with the parent object. — supercat, Feb 12 '21 at 22:44
@supercat The const-qualified member makes it UB to do certain unusual things with the parent object. There are still plenty of valid ways to use it. I'm not sure what your point is. — aschepler, Feb 12 '21 at 22:50

score 20 · Accepted Answer · edited Feb 12 '21 at 20:28

> depending on whether or not the code does something like in these examples

Because the compiler cannot always know when data is being accessed "this way".

As things currently stand, the compiler is allowed to assume that, for the following code:

struct foo{ int const x; };

void some_func(foo*);

int bar() {
    foo f { 123 };
    some_func(&f);
    return f.x;
}

bar will always return 123. The compiler may generate code that actually accesses the object. But the object model does not require this. f.x is a const object (not a reference/pointer to const), and therefore it cannot be changed. And f is required to always name the same object (indeed, these are the parts of the standard you would have to change). Therefore, the value of f.x cannot be changed by any non-UB means.
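For contrast, here is a minimal sketch of the case where the object really is replaced and the caller opts in to seeing the replacement. It is an illustration under assumptions, not a claim about what any compiler emits: `replace_foo` is a hypothetical stand-in for a `some_func` that placement-news over its argument, and the object lives in dynamic storage so that only the [basic.life]/8 naming rule is in play.

```cpp
#include <new>  // placement new and std::launder (C++17)

struct foo { int const x; };

// Hypothetical stand-in for some_func: reuses the storage of *p for a new foo.
void replace_foo(foo* p) { new (p) foo{456}; }

int read_replaced() {
    foo* f = new foo{123};
    replace_foo(f);

    // f still designates the original object, whose lifetime has ended.
    // std::launder produces a pointer to the foo that now occupies the storage.
    foo* g = std::launder(f);
    int value = g->x;
    delete g;
    return value;
}
```

The caller has to say explicitly that the storage may hold a different object; without that, the compiler is entitled to treat `f->x` as the value it was initialized with.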

> Why is it reasonable to have this pseudo-function for punching a hole in formal language semantics

This was actually discussed, in the paper P0532. That paper brings up how long these issues have existed (i.e. since C++03) and how often optimizations made possible by this object model have been employed.

The proposal was rejected on the grounds that it would not actually fix the problem. From this trip report:

> However, during discussion it came to light that the proposed alternative would not handle all affected scenarios (particularly scenarios where vtable pointers are in play), and it did not gain consensus.

The report doesn't go into any particular detail on the matter, and the discussions in question are not publicly available. But the proposal itself does point out that it wouldn't allow devirtualizing a second virtual function call, as the first call may have built a new object. So even P0532 would not make launder unnecessary, merely less necessary.
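The vtable scenario can be sketched as follows. This is an illustrative example in the spirit of the ones used in the `launder` papers, not code taken from P0532; the types `A` and `B`, the function `call_twice`, and the assumption that both types have the same size are all introduced here.

```cpp
#include <new>  // placement new and std::launder (C++17)

struct A { virtual int f(); };
struct B : A { int f() override; };

// Each call replaces *this with an object of the other dynamic type.
int A::f() { new (this) B; return 1; }
int B::f() { new (this) A; return 2; }

static_assert(sizeof(A) == sizeof(B), "sketch assumes both types fit the same storage");

int call_twice() {
    alignas(A) unsigned char buf[sizeof(A)];
    A* a = new (buf) A;

    // The dynamic type is known to be A here, so this call can be devirtualized.
    int first = a->f();

    // The storage now holds a B. Going through std::launder tells the compiler
    // not to reuse what it knew about the old object, including its vtable pointer.
    int second = std::launder(a)->f();

    return first * 10 + second;
}
```

If a compiler had to assume that any virtual call might have rebuilt the object, it could not devirtualize the second call even in ordinary code that never does this, which is the cost the proposal acknowledges.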

edited Feb 12 '21 at 20:28

einpoklum

answered Feb 12 '21 at 20:01

Nicol Bolas

But what if, inside `some_func()`, I use `const_cast` on the `x` field? – einpoklum Feb 12 '21 at 20:27
... and [both major compilers seem to disagree with your answer](https://godbolt.org/z/xEdnv4). That is, they _don't_ assume `f.x` is 123 after the call. I quite doubt that is due to a missed optimization opportunity in both (but of course I could theoretically be wrong). – einpoklum Feb 12 '21 at 20:40
re. your second comment, compilers missing an optimization is hardly grounds for suspicion of the standard. Other reasons for missed optimizations include: nobody ever put in work to add this case to the optimizer; and the userbase expecting non-standard behaviour. [Similar case](https://godbolt.org/z/8Ezs9f) without the struct gets optimized; structs with `const` members are uncommon – M.M Feb 12 '21 at 21:19
@einpoklum Modifying `f.x` during the lifetime of `x` (via a `const_cast`) is UB per [\[dcl.type.cv\]/4](https://timsong-cpp.github.io/cppwp/n4659/dcl.type.cv#4). Ending the lifetime of `x` and/or `f` within `some_func` would be allowed on its own, but then that would cause the sequenced-after access in `bar` to be UB by [\[basic.life\]/(8.3)](https://timsong-cpp.github.io/cppwp/n4659/basic.life#8). – aschepler Feb 12 '21 at 21:20
@M.M: Re my first comment - I meant, a `const_cast` followed by a modification of `f.x`. – einpoklum Feb 12 '21 at 21:25
@einpoklum: `f.x` is `const`. There is no way to modify it during its lifetime. It is absolutely immutable from the time it becomes fully alive (for example, when any `f::f()` constructor returns) until it begins to die (for example, when `f::~f()` destructor begins execution). – Ben Voigt Feb 12 '21 at 21:46
@einpoklum: "*... and both major compilers seem to disagree with your answer.*" My answer is based on the standard, not the behavior of compilers. And the standard says that it would be UB, so the compilers' behavior is valid. – Nicol Bolas Feb 12 '21 at 21:51
@einpoklum: "*I meant, a const_cast followed by a modification of f.x.*" [dcl.type.cv]/4 doesn't have exceptions for the use of `const_cast`. Whether an object *is const* is not the same question as to whether a reference to an object is const. An object declared const is a `const` object, and per [dcl.type.cv]/4, *anything* that you do which would cause its value to be modified is automatically UB. – Nicol Bolas Feb 12 '21 at 21:52
@NicolBolas: If `some_func` were to overwrite all of the bytes of `f` with bytes copied from another object of type `struct foo`, would that not end the lifetime of `f` and create a new `struct foo` in its place? – supercat Feb 12 '21 at 22:14
@supercat: That would not end `f`'s lifetime. Indeed, since `foo` is trivially copyable, it would have the effect of assigning one object to the other. Which violates [dcl.type.cv]/4. – Nicol Bolas Feb 12 '21 at 22:29
Fair enough. But given this situation, I would say `std::launder` is the opposite of aptly-named. `std::soil` would be much more appropriate. – einpoklum Feb 12 '21 at 22:50
@einpoklum Back in the 80's there was a tool called "C beautifier" that, if used, made your code look ... "bjoootiful". I wrote "C horrifier" as a response. No unnecessary whitespaces. Variables had "v0-100000" id:s etc. Perhaps ... `std::desecrate` ? – Ted Lyngmo Feb 12 '21 at 23:30
@TedLyngmo: Desecrating still sounds a bit honorable, albeit in an iconoclastic sort of a way. – einpoklum Feb 12 '21 at 23:34
@einpoklum :-) I think I'll just go ahead with my proposal then. – Ted Lyngmo Feb 12 '21 at 23:37
I think the example of "const subobject" is no longer relevant: https://github.com/cplusplus/draft/commit/fd8ff6441f93024bd0ee6e03a03c08be8e1b5ce0 – dyp Feb 15 '21 at 08:46
