
Regarding this code:

#include <string>

int main()
{
    union u {
        u() { i = 0; }
        ~u() {}

        int i;
        std::string s1;
        std::string s2;
    } u;

    new (&u) std::string{};
}

[intro.object]/2 says that

Objects can contain other objects, called subobjects. A subobject can be a member subobject ([class.mem]), a base class subobject ([class.derived]), or an array element. An object that is not a subobject of any other object is called a complete object. If an object is created in storage associated with a member subobject or array element e (which may or may not be within its lifetime), the created object is a subobject of e's containing object if:
— the lifetime of e's containing object has begun and not ended, and
— the storage for the new object exactly overlays the storage location associated with e, and
— the new object is of the same type as e (ignoring cv-qualification).

There is no requirement how an object is created in the storage associated with a member subobject. The code doesn't have to nominate the subobject in the argument of the address-of operator if the subobject is a member of a standard-layout union or the first member of a non-union class object. It is enough to get the address of the containing object to designate the storage of the member subobject in such cases.

«There is no requirement how an object is created», among other things, means that the pointer given to placement new does not have to point to the subobject, mainly because there may be no object to point to (note that [intro.object]/2 does not require the subobject to be alive). On the std-discussion mailing list it was asked whether, given an object x of type struct A { unsigned char buf[1]; };, there is a difference between new (&x) A{} and new (x.buf) A{}. The answer was no: in both cases, x.buf would provide storage for A{}, because

The wording in [intro.object] and [basic.life] concern themselves with the storage address represented by a pointer, not the object to which it points.


[class.union]/1 swears that «At most one of the non-static data members of an object of union type can be active at any time».

Which one became active in the code above, s1 or s2?

Language Lawyer
  • What is the reason you use placement new here, instead of just plain assignment to a member? Is it just plain curiosity, or is there some underlying problem? Or perhaps some existing code that uses this? – Some programmer dude Jan 17 '19 at 13:34
  • @Someprogrammerdude assignment does not start the lifetime of union members of `std::string` types. – Language Lawyer Jan 17 '19 at 13:36
  • Does the note in [class.union/1], cited above, apply? i.e. `s1` and `s2` have a common initial sequence which covers their entire sequence of data members, therefore they are indistinguishable? – Peter Hull Jan 17 '19 at 13:42
  • @PeterHull this note requires one of such standard-layout struct members to be active. But I don't know which one became active. Anyway, I can replace `std::string` with `double` and this note won't apply. – Language Lawyer Jan 17 '19 at 13:46
  • To see how to change the active member of a union see this: https://stackoverflow.com/questions/46349720/could-someone-explain-this-c-union-example – NathanOliver Jan 17 '19 at 13:51
  • Why do you think either `s1` or `s2` becomes active? It says "**at most** one ..." – xskxzr Jan 17 '19 at 14:08
  • @xskxzr good question. So, even though I've created an object in a storage associated with member subobject, its lifetime did not start? – Language Lawyer Jan 17 '19 at 14:11
  • No. Sometimes [a glvalue referring to old object may automatically refer to the new object](https://timsong-cpp.github.io/cppwp/n4659/basic.life#8), but that's not your case (the lifetime of `u.s1` or `u.s2` has not even begun), and even in that case, they are definitely two objects. – xskxzr Jan 17 '19 at 14:18
  • My first guess is "neither". – Bartek Banachewicz Jan 17 '19 at 14:26
  • @xskxzr so you claim that non-trivial types should not be members of unions, because you can't start their lifetime? And [this note](https://timsong-cpp.github.io/cppwp/n4659/class.union#6) is lying? – Language Lawyer Jan 17 '19 at 14:27
  • See [issue 1404](http://www.open-std.org/JTC1/SC22/WG21/docs/cwg_active.html#1404). And I still think it is impossible to use a non-trivial data member if its lifetime has not ever begun. – xskxzr Jan 17 '19 at 14:37
  • Assuming you properly activate an std::string in the union don't both s1 and s2 become active because std::string and std::string, being the same type exactly, share a common prefix and accessing the common prefix parts of union members is well defined? – Goswin von Brederlow Jan 17 '19 at 14:37
  • @xskxzr 1404 is about recreating objects with const/reference subobjects. You said that since the union member's name never referred to an existing object, it won't refer to one when you start the lifetime of this member. – Language Lawyer Jan 17 '19 at 14:53
  • Irrespective on how the dialogue developed here, this is still an extremely good question. I'm starting to think that the standard doesn't describe this situation adequately. Hopefully an expert wades in. @LanguageLawyer: do ping me in a couple of days if still no adequate answer: I'll put a bounty on the question. – Bathsheba Jan 17 '19 at 15:41
  • There was a similarly interesting question about placement new more generally last week (with respect to reusing storage). I think that whole feature is just really underspecified in a few places. – Lightness Races in Orbit Jan 17 '19 at 16:37
  • @LightnessRacesinOrbit The placement new feature is as old as C++ standardisation, yet the wording was changed significantly. (The non trivial type in a union is a more recent feature of course.) The wording re: unions member lifetime is also recent. It shows that **the C++ spec is severely lacking** on the basic stuff. – curiousguy Jan 17 '19 at 16:43
  • @Bathsheba you may put a bounty, but the Standard just does not have an answer to this question. – Language Lawyer Jan 20 '19 at 18:20

1 Answer


A pointer is an address, but to the object model, it is more than an address. It points to a specific object at that address. Multiple objects can exist at a certain address, but that doesn't mean that pointers to any of those objects are simultaneously pointers to other objects at that address. Consider what [expr.unary.op]/1 says of pointer indirection:

the result is an lvalue referring to the object or function to which the expression points.

Not to "an object at that address"; it is an lvalue referring to the object being pointed to. So clearly, in the C++ object model, multiple objects can exist at the same address, but a specific pointer into that address does not point to all of those objects. It only points to one of them.

[expr.unary.op]/2 says "The result of the unary & operator is a pointer to its operand". Therefore, &u points to u, which is of type u (BTW, was it really necessary to name the object the same as the type?). &u does not point to u.i, u.s1 or u.s2. All of those are guaranteed to share the same address as &u, but &u itself only points to u.

So the question now becomes, what is the storage represented by &u? Well, per [intro.object]/1, we know that "An object occupies a region of storage". If &u points to the object u, that pointer must therefore represent the region of storage occupied by that object. Not the storage of any of its subobjects; it is the storage for that object. In its entirety.

Now, we get to new(&u) std::string{}. This expression creates an object of type std::string within the storage represented by &u. That reuses the storage of the object u, which, in accordance with [basic.life]/1.4, ends the lifetime of u, and with it the lifetime of its active member subobject.

So the answer to your question is that neither becomes active, because the object u doesn't exist anymore.

Bathsheba
Nicol Bolas
  • I know about pointer values. And I've already covered this in the question: _There is no requirement how an object is created in the storage associated with a member subobject_. Which means you don't need to «point to» a member subobject. Richard Smith agrees here: _The wording in [intro.object] and [basic.life] concern themselves with the storage address represented by a pointer, not the object to which it points_ https://groups.google.com/a/isocpp.org/d/msg/std-discussion/GHwA_pOc4CA/o1a_WlqQAAAJ – Language Lawyer Jan 17 '19 at 17:15
  • The most promising answer thus far in my opinion. I hope you don't mind my edit. I'm not convinced that @LanguageLawyer is confused: seems clued up to me. – Bathsheba Jan 17 '19 at 17:43
  • @LanguageLawyer: I don't agree with that interpretation of the standard. `[basic.life]` and everything about unions stops making sense under that interpretation, since by that reasoning, even `new(&u.s1) std::string` would apply to `u.s2`, so both subobjects would be activated, which is explicitly disallowed. So I choose the interpretation where the standard makes sense. – Nicol Bolas Jan 17 '19 at 17:52
  • @NicolBolas: That last comment makes perfect sense to me. I believe this answer to be correct. Have an upvote. – Bathsheba Jan 17 '19 at 17:54
  • _pointer must therefore represent the region of storage occupied by that object_ pointer value only [represents the address](https://timsong-cpp.github.io/cppwp/n4659/basic.compound#def:represents_the_address) of the first byte in storage. – Language Lawyer Jan 17 '19 at 17:56
  • @LanguageLawyer: I don't understand that. Would you mind explaining the relevance of that comment please? – Bathsheba Jan 17 '19 at 17:57
  • @LanguageLawyer: OK, but where exactly did "only" come from? That a pointer represents an address does not mean it doesn't represent more than that. Otherwise, as I pointed out, [expr.unary.op]/1 doesn't work. It should also be noted that it says the "value of the pointer", not the pointer itself. – Nicol Bolas Jan 17 '19 at 17:59
  • @Bathsheba I don't understand where the quoted text comes from. The only thing about pointers representing something I know is representing the address of the first byte. – Language Lawyer Jan 17 '19 at 17:59
  • @NicolBolas given your interpretation, it is impossible to start lifetime of a member using `new(&u.s1) std::string`, because `&u.s1` does not point to an object. No object exists or ever existed there. – Language Lawyer Jan 17 '19 at 18:03
  • @LanguageLawyer: [basic.life]/6 explains how you can use pointers to things that are about to become objects but aren't in their lifetime yet, or used to be within their lifetime but aren't anymore. `u.s1` is an object outside of its lifetime, so it applies. That is, all of the union members are always objects, but they're not always within their lifetimes. Just look at [class.union]; it frequently talks about the members as being objects even if they aren't active. – Nicol Bolas Jan 17 '19 at 18:07
  • @Bathsheba _The most promising answer thus far in my opinion_ This could be a solution of the problem, but it requires changing the standard. I was thinking similarly about how to solve this. But I was not thinking in terms of pointer values, only about syntactically nominating a member. Like in assignment lifetime starting rules https://timsong-cpp.github.io/cppwp/n4659/class.union#5 – Language Lawyer Jan 17 '19 at 18:08
  • @NicolBolas given your interpretation, it becomes hard to use arrays of `unsigned char`/`std::byte` to provide storage. Because what is usually given to placement new, is a pointer to an array member (representing only the storage under this member, in your opinion), not a pointer to the whole array. See examples here https://timsong-cpp.github.io/cppwp/n4659/intro.object#3 – Language Lawyer Jan 17 '19 at 18:13
  • @LanguageLawyer: Except that [intro.object]/3 *explicitly* lays out how byte arrays provide storage. A pointer to the first element of a byte array is undeniably pointing to storage "associated with another object e of type “array of N unsigned char ” or of type “array of N std::byte”". So it counts as providing storage; the C++ object model knows when a pointer points into an array. – Nicol Bolas Jan 17 '19 at 18:21
  • @NicolBolas _[intro.object]/3 explicitly lays out how byte arrays provide storage_ It requires an object to be created in a storage associated with an array. If I give a pointer to an array member, then the storage it represents is associated with this member, not the array. Right? – Language Lawyer Jan 17 '19 at 18:25
  • @LanguageLawyer: My point is this: a pointer points to an object. That object has storage. In the case of [intro.object]/3, pointing to an array element subobject points to a piece of storage from that array. Therefore, it is pointing to storage "associated with that array". It may only be pointing to one object in that storage, but the storage *itself* is associated with the array. It talks about "storage associated with" rather than "an array" precisely because arrays decay to pointers. – Nicol Bolas Jan 17 '19 at 18:33
  • @NicolBolas I understand why your POV could be attractive, but can't agree. Both `&u` and `&u.s1` represent the beginning of the same storage. And I don't see why if a storage of a subobject is associated with storage of the containing object, the storage of an object is not associated with the storage of its subobject. – Language Lawyer Jan 17 '19 at 18:39
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/186879/discussion-between-nicol-bolas-and-language-lawyer). – Nicol Bolas Jan 17 '19 at 18:39