58

cppreference states that:

Objects with trivial default constructors can be created by using reinterpret_cast on any suitably aligned storage, e.g. on memory allocated with std::malloc.

This implies that the following is well-defined code:

struct X { int x; };
alignas(X) char buffer[sizeof(X)];    // (A)
reinterpret_cast<X*>(buffer)->x = 42; // (B)

Three questions follow:

  1. Is that quote correct?
  2. If yes, at what point does the lifetime of the X begin? If on line (B), is it the cast itself that is considered acquiring storage? If on line (A), what if there were a branch between (A) and (B) that would conditionally construct an X or some other POD, Y?
  3. Does anything change between C++11 and C++1z in this regard?

Note that this is an old link. The wording was changed in response to this question. It now reads:

Unlike in C, however, objects with trivial default constructors cannot be created by simply reinterpreting suitably aligned storage, such as memory allocated with std::malloc: placement-new is required to formally introduce a new object and avoid potential undefined behavior.
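A minimal sketch of what the revised wording asks for, reusing the X from the snippet above. The function name is illustrative only and does not come from cppreference or the standard:

```cpp
#include <new>  // declares the placement form of operator new

struct X { int x; };

// Hypothetical helper: formally creates an X inside a suitably aligned
// char buffer with placement-new, then writes and reads it.
int write_through_placement_new() {
    alignas(X) char buffer[sizeof(X)];  // (A) storage only; no X exists yet
    X* p = new (buffer) X;              // a new-expression creates the X
    p->x = 42;                          // (B) p points to an X within its lifetime
    return p->x;                        // X is trivially destructible, so no destructor call is needed
}
```

Compared with the original snippet, the only change is that the pointer comes from a new-expression rather than from reinterpret_cast.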

Barry
  • I actually tried to figure out when the lifetime of those objects begins. I was not able to find a definitive answer in the standard, and I believe it is vague in this regard. As for the first question, I doubt the quote is correct, since there is an aliasing rule to pay attention to. – SergeyA Nov 29 '16 at 18:53
  • @SergeyA as long as the buffer is a char buffer, strict aliasing is not an issue. – Richard Hodges Nov 29 '16 at 19:01
  • 3
    No, and I thought we went over this multiple times already? [intro.object]/1 exhaustively enumerates which language constructs can create objects. – T.C. Nov 29 '16 at 19:13
  • 4
    @RichardHodges, nope. `char*` can alias *anything*, but *anything* can't alias `char*` – SergeyA Nov 29 '16 at 19:16
  • @SergeyA if that were true, it would not be allowable to alias the memory of a variant. – Richard Hodges Nov 29 '16 at 19:18
  • @T.C. Do you mind writing a good canonical answer for this? Help me, T.C., you're my only hope. – Barry Nov 29 '16 at 19:19
  • @RichardHodges, not sure what you mean by *variant* in this context. – SergeyA Nov 29 '16 at 19:23
  • @SergeyA `std::variant` or `boost::variant` for example. The storage can't be allocated with a union because there's no way to build a union from a type list. So you use a std::aligned_storage, which is simply an aligned char buffer that is at least as big and as aligned as the most restrictive type in the type list. – Richard Hodges Nov 29 '16 at 19:29
  • 4
    @RichardHodges Actually you can use a (recursive) union, and must use one if you want `constexpr`. – T.C. Nov 29 '16 at 19:30
  • @M.M That's because it just got fixed [a few minutes ago](http://en.cppreference.com/mwiki/index.php?title=cpp/language/default_constructor&diff=88595&oldid=86081) – T.C. Nov 29 '16 at 20:58
  • @M.M I fixed the question wording. – Barry Nov 29 '16 at 21:21

3 Answers

38

There is no X object, living or otherwise, so pretending that there is one results in undefined behavior.

[intro.object]/1 spells out exhaustively when objects are created:

An object is created by a definition ([basic.def]), by a new-expression ([expr.new]), when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary]).

With the adoption of P0137R1, this paragraph is the definition of the term "object".

Is there a definition of an X object? No. Is there a new-expression? No. Is there a union? No. Is there a language construct in your code that creates a temporary X object? No.
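As a rough illustration of the four constructs that paragraph lists (the names U, make_x and sum_of_created_objects are made up for this sketch):

```cpp
struct X { int x; };
union U { int i; float f; };

// Hypothetical factory whose call expression yields a temporary X.
X make_x() { return X{4}; }

// Touches one object created by each construct named in [intro.object]/1.
int sum_of_created_objects() {
    X a{1};                           // created by a definition
    X* b = new X{2};                  // created by a new-expression
    U u;                              // the union object itself comes from a definition
    u.i = 3;                          // assignment makes u.i the active member
    int from_temporary = make_x().x;  // member access on a temporary X
    int total = a.x + b->x + u.i + from_temporary;
    delete b;
    return total;                     // 1 + 2 + 3 + 4
}
```

The code in the question uses none of these to produce an X; the only definition in it is that of the char array.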

Whatever [basic.life] says about the lifetime of an object with vacuous initialization is irrelevant. For that to apply, you have to have an object in the first place. You don't.

C++11 has roughly the same paragraph, but doesn't use it as the definition of "object". Nonetheless, the interpretation is the same. The alternative interpretation - treating [basic.life] as creating an object as soon as suitable storage is obtained - means that you are creating Schrödinger's objects*, which contradicts N3337 [intro.object]/6:

Two objects that are not bit-fields may have the same address if one is a subobject of the other, or if at least one is a base class subobject of zero size and they are of different types; otherwise, they shall have distinct addresses.


* Storage with the proper alignment and size for a type T is by definition storage with the proper alignment and size for every other type whose size and alignment requirements are equal to or less than those of T. Thus, that interpretation means that obtaining the storage simultaneously creates an infinite set of objects with different types in said storage, all having the same address.
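The footnote's point can be checked at compile time. In this sketch Y and Z are hypothetical types invented for illustration, and fits_where_x_fits is not a standard facility:

```cpp
struct X { int x; };
struct Y { int y; };   // hypothetical: same size and alignment as X
struct Z { char c; };  // hypothetical: smaller and less strictly aligned

// True when storage sized and aligned for X is also sized and aligned for T.
template <class T>
constexpr bool fits_where_x_fits() {
    return sizeof(T) <= sizeof(X) && alignof(T) <= alignof(X);
}

static_assert(fits_where_x_fits<X>(), "storage for X fits X");
static_assert(fits_where_x_fits<Y>(), "storage for X also fits Y");
static_assert(fits_where_x_fits<Z>(), "storage for X also fits Z");
```

If merely obtaining such storage created objects, the same buffer would hold an X, a Y and a Z at one address, which is the contradiction described above.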

T.C.
  • So nothing in [basic.lval]/8 is relevant because there is no "type of the object" via which we're accessing because there's no object? – Barry Nov 29 '16 at 19:34
  • @Barry Well, there is an object - `buffer`, the `char` array. – T.C. Nov 29 '16 at 19:38
  • 2
    With the greatest respect (this is not tongue in cheek, your answers are very informative) if this were the case, then allocating an object via malloc would be undefined behaviour. Yet §3.8 explicitly allows it. There seems to be a disconnect in the wording of the standard. – Richard Hodges Nov 29 '16 at 19:45
  • 4
    @RichardHodges First, footnotes are non-normative. Second, that footnote pertains to the definition of "safely-derived pointer", which is completely unrelated - that concept is there for GC support. Third, it is fairly well-established that `malloc` alone is not sufficient to create an object under the current wording - P0137 explicitly refers to that as the status quo. – T.C. Nov 29 '16 at 19:50
  • @RichardHodges Where in §3.8? What version of the standard? – Yakk - Adam Nevraumont Nov 29 '16 at 19:56
  • 6
    @T.C. While I understand that, the inability to make using `auto p_int = (int*)malloc(sizeof(int));` defined behavior seems like a really bad idea. I get that making it defined behavior is hard, but the alternative is horrible. Reams and reams of legacy code gone from "in practice working" to "anathema". If the standard did not permit that, the standard was wrong; the way to fix it is to fix the standard, not make the standard's error more explicit. – Yakk - Adam Nevraumont Nov 29 '16 at 19:57
  • 1
    I think "it's there for GC" seems a little glib. The footnote specifically mentions "other languages" and "C". GC is not mentioned at all. If we accept that the X-pointer is not pointing to an X, but to memory of the correct alignment and size to accommodate an X, and an X is a POD, then I struggle to see how the code can possibly be UB. This would make interfacing with C libraries UB. Patently it is not. The standard seems to be contradicting itself. – Richard Hodges Nov 29 '16 at 20:18
  • @Yakk I lifted it from N4527, pages 68-69 – Richard Hodges Nov 29 '16 at 20:20
  • 2
    @Yakk Well, that's undefined anyway for reading an uninitialized object :) The current state of affairs is certainly suboptimal - the formal object model makes `std::vector` unimplementable in standard C++ - but making it work is nontrivial. – T.C. Nov 29 '16 at 20:20
  • 3
    @RichardHodges Airlifting a footnote out of context doesn't help your case. That footnote is attached to [basic.stc.dynamic.safety]/2.1, and by happenstance [basic.life] started on the same page in that particular version of the working draft. "safely-derived pointer" is only relevant on implementations with *strict pointer safety* (aka GC'd implementations), which is an empty set AFAIK. It sheds absolutely zero light on the meaning of [basic.life], because it is dealing with a completely different subject. – T.C. Nov 29 '16 at 20:27
  • @T.C. I'm not sure that I have a case to make, other than the sure and certain knowledge that a POD that has been malloced and then cast is safe to use and will yield expected behaviour in all cases. This is the foundation of C interoperability. In addition, there are std::aligned_storage et al., which are specifically there to allow this kind of gerrymandering. It simply is not correct to say that an object can only be born by definition, new, union or temporary. It can also be forced into existence through these means. I'm not saying you are wrong - the standard is. – Richard Hodges Nov 29 '16 at 20:42
  • 14
    @T.C. I'll be explicit: `auto p_int = (int*)malloc(sizeof(int)); *p_int = 0; std::cout << *p_int << "\n";` -- anything that doesn't make that standards compliant should be a non-starter. That is legacy C-style memory handling, and it exists in massive legacy code bases that have compiled and worked in C++ for 30+ years. If the C++ standard says "that isn't defined", it is a flaw in the standard. I get *why* it is hard, but leaving it ambiguous or poorly worded is better than explicitly stating that is undefined. – Yakk - Adam Nevraumont Nov 29 '16 at 20:44
  • 3
    @Yakk The relevant wording has been around in every standard. It's of course a problem, but a long-standing one. – T.C. Nov 29 '16 at 20:51
  • 7
    @Yakk IMO having the standard be clear is better than having it be ambiguous or poorly worded. Then discussion can at least move onto fixing it instead of having endless threads like this where people apply their own interpretation and we argue about whose interpretation is [better | was the intent | etc.] – M.M Nov 29 '16 at 20:57
  • @T.C. prior to 1776, "by the implementation (12.2) when needed." left a lot of latitude. "when needed". Other clauses referring to object lifetime would imply an object was needed. After 1776, the moment of object creation was pinned down. Prior ambiguity on when an object actually exists meant that the standard was ambiguous about whether `int_p` could be used; this change seems to make it explicitly illegal to use it as there is no object there. That seems wrong. Or am I reading it incorrectly? – Yakk - Adam Nevraumont Nov 29 '16 at 21:00
  • 3
    @M.M No; if it is unambiguously undefined behavior to use that `int_p`, some idiot on a compiler team might actually break code that uses it and get people on side (after all, the standard is clear!). If it requires convoluted reasoning that is ambiguously correct to justify the same, other people are more likely to smack them upside the head for being an idiot. Anything that "clarifies" that the `int_p` use is illegal is either changing the standard to be broken, or polishing a standard defect. – Yakk - Adam Nevraumont Nov 29 '16 at 21:01
  • @Yakk Not really, the cross-reference to 12.2 means it's only talking about the cases in that section ([class.temporary]). – T.C. Nov 29 '16 at 21:02
  • @Yakk compilers can offer extensions; if a compiler previously supported creating an object in this way,and now decides not to support that, that's a business decision on their part. Compilers are supposed to help their users to achieve programming goals, not break working code on purpose. – M.M Nov 29 '16 at 21:12
  • @M.M Yet that's what gcc 6 ended up doing with the null pointer check that broke Qt/Chromium. – Barry Nov 29 '16 at 21:15
  • 3
    @Barry Code that relies on the behaviour of dereferencing null pointers is a ticking timebomb, the problem would arise sooner or later anyway. I would argue that gcc never explicitly supported defined behaviour of dereferencing null pointers - what you got was just happenstance. In terms of the userbase, there's a conflict between those who want defined behaviour of dereferencing null, and those who don't want their code slowed down by runtime null pointer checks being inserted (etc.). But I don't see any similar conflict in this case. – M.M Nov 29 '16 at 21:30
  • 1
    @M.M Code that relies on UB is a ticking timebomb in general. I don't think there's anything in particular about one form of UB or another. – Barry Nov 29 '16 at 22:32
  • 1
    @GundolfGundelfinger I'd say that the behaviour of `mmap` (and the status of any memory "returned" by it) is outside of what is covered by the standard. In a vacuum the compiler would have to assume that it might have had objects created correctly in it. – M.M Dec 02 '16 at 01:17
  • 1
    Thank you for taking the time to update the answer. In the light of re-reading the draft standard, plus P0137 I have posted a new, extremely carefully worded, question - complete with compilable code. I would be truly grateful if you could give it a careful look. I believe an informed answer will be of benefit to the community. http://stackoverflow.com/questions/40930475/clarification-of-specifics-of-p0137 – Richard Hodges Dec 02 '16 at 13:18
  • "_Is there a language construct in your code that creates a temporary X object?_" yes, the definition of the buffer object. Anyway, this definition of an object is broken. – curiousguy Jan 13 '17 at 17:36
  • 1
    "_Thus, that interpretation means that obtaining the storage simultaneously creates an infinite set of objects with different types in said storage, all having the same address._" Yes, it does. Do you have a problem with that? – curiousguy Oct 21 '17 at 05:44
  • 1
    Please consider these four events: (1) an object comes into existence as per your answer ("you have to have an object") (2) an object is created as per intro.object (3) an object lifetime begins as per basic.life (4) storage suitable for an object is obtained as per basic.life. Which of these can be considered separate independent events? In what order are they sequenced? – n. m. could be an AI Nov 09 '17 at 11:59
  • 1
    @n.m.: 2 is the means by which 1 takes place. 4 happens before 3. So the order of operations is always 2, 4, 3. Now, 4&3 may happen simultaneously (acquiring the memory starts its lifetime), but you cannot start the lifetime of an object before you've acquired storage for it. After all, [basic.life]/1 says that vacuous initialization happens when you acquire storage for the object. So that has to already have happened. – Nicol Bolas Nov 24 '17 at 16:35
  • 2
    @T.C. It is now defined behavior in C++20. According to [[intro.object](https://eel.is/c++draft/intro.object#13)]/13, beginning a lifetime of an array of chars implicitly creates another object in the storage occupied by the array, provided that another object is of an implicit-lifetime type. It is time to update the answer. – kalaider May 22 '20 at 18:16
  • @kalaider: What must one do to cause a region of storage to revert to being an "array of chars", thus allowing the implicit creation of a new object within it? – supercat Jul 13 '22 at 23:03
7

Based on p0593r6, I believe the code in the OP is valid and should be well defined. The new wording, adopted as a DR that applies retroactively to all versions from C++98 onward, allows implicit object creation as long as the created object is well defined (tautology is sometimes the rescue for complicated definitions); see § 6.7.2.11 Object model [intro.object]:

implicitly-created objects whose address is the address of the start of the region of storage, and produce a pointer value that points to that object, if that value would result in the program having defined behavior [...]

See also: https://stackoverflow.com/a/61999151/2085626
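A small sketch of the kind of code that wording is meant to bless, assuming a compiler that implements P0593 (C++20, or the DR applied retroactively). The function name is illustrative:

```cpp
#include <cstdlib>  // std::malloc, std::free

struct X { int x; };  // trivial aggregate, so an implicit-lifetime type

// Hypothetical helper: relies on std::malloc implicitly creating an X
// in the returned storage, as described by P0593.
int write_through_malloc() {
    void* raw = std::malloc(sizeof(X));
    if (raw == nullptr) return -1;  // allocation failed; nothing was created
    X* p = static_cast<X*>(raw);    // points to the implicitly created X
    p->x = 42;                      // defined under the P0593 wording
    int result = p->x;
    std::free(raw);
    return result;
}
```

No placement-new appears here; whether an X exists is decided by whether its existence would give the program defined behavior.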

Amir Kirsh
3

This analysis is based on n4567, and uses section numbers from it.

§5.2.10/7: When a prvalue v of object pointer type is converted to the object pointer type “pointer to cv T”, the result is static_cast<cv T*>(static_cast<cv void*>(v)).

So, in this case, the reinterpret_cast<X*>(buffer) is the same as static_cast<X *>(static_cast<void *>(buffer)). That leads us to look at the relevant parts about static_cast:

§5.2.9/13: A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. The null pointer value is converted to the null pointer value of the destination type. If the original pointer value represents the address A of a byte in memory and A satisfies the alignment requirement of T, then the resulting pointer value represents the same address as the original pointer value, that is, A.

I believe that's enough to say that the original quote is sort of correct--this conversion gives defined results.
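The equivalence of the two cast spellings can be sketched as follows. The helper name is made up, and the code only compares pointer values without dereferencing them:

```cpp
struct X { int x; };

// Hypothetical check: both cast spellings yield the same pointer value,
// and that value is the address of the buffer.
bool casts_agree() {
    alignas(X) char buffer[sizeof(X)];
    X* via_reinterpret = reinterpret_cast<X*>(buffer);
    X* via_static = static_cast<X*>(static_cast<void*>(buffer));
    return via_reinterpret == via_static
        && static_cast<void*>(via_reinterpret) == static_cast<void*>(buffer);
}
```

This shows only that the conversion itself has a defined result; it says nothing about whether accessing an X through the resulting pointer is allowed.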

As to lifetime, it depends on what lifetime you're talking about. The cast creates a new object of pointer type--a temporary, which has a lifetime starting from the line where the cast is located, and ending whenever it goes out of scope. If you have two different conversions that happen conditionally, each pointer has a lifetime that starts from the location of the cast that created it.

Neither of these affects the lifetime of the object providing the underlying storage, which is still buffer, and has exactly the same lifetime, regardless of whether you create a pointer (of the same or converted type) to that storage or not.

Jerry Coffin
  • What's the conclusion though? Is your claim that the pointer to `X` is created legally, but that it can't actually be dereferenced (e.g., the `->x` is UB) because it doesn't point to a created `X` object? The relevance of the lifetime of the pointers themselves isn't clear to me, and it's hard to understand which side of the debate this answer comes down on. – BeeOnRope Oct 19 '17 at 22:12
  • 1
    Yes, creating the pointer has defined behavior, but dereferencing the pointer gives UB. I considered his question about lifetime somewhat ambiguous, so I pointed out the lifetime of every object in the code, even though I agree that the lifetime of the pointers themselves *probably* isn't what he cared about. He asked about the lifetime of the `X`, and there is no actual `X` involved, just a pointer to X initialized with the address of a buffer of `char`s. – Jerry Coffin Oct 19 '17 at 23:42
  • Right, but at the end, the code dereferences the pointer as if there _was_ an `X` - if that isn't going to work (the crux of the question, really), maybe point it out? – BeeOnRope Oct 19 '17 at 23:53
  • 1
    @BeeOnRope: I'm hesitant to say that. The reality is that it's officially undefined behavior, *but* it **will** work (for almost any reasonable definition of the word) on every known implementation, and I'd expect it to continue working essentially permanently. The simple fact is that breaking this breaks essentially all C compatibility, and I doubt there's even one compiler vendor that's willing to throw that away. – Jerry Coffin Oct 19 '17 at 23:56
  • 1
    Fair enough - it's that exact "problem" that caused me to come here, since I find it hard to believe (for example) that `memcpy`ing a trivially copyable type into suitable aligned uninitialized storage isn't allowed by the standard, but that seems to be the place we're in today :(. – BeeOnRope Oct 19 '17 at 23:59