
C++11 §3.8.1 declares that, for an object with a trivial destructor, I can end its lifespan by assigning to its storage. I am wondering if trivial destructors can prolong the object's lifespan and cause aliasing woes by "destroying an object" that I ended the lifespan of much earlier.

To start, something which I know is safe and alias-free:

void* mem = malloc(sizeof(int));
int*  asInt = (int*)mem;
*asInt = 1; // the object '1' is now alive, trivial constructor + assignment
short*  asShort = (short*)mem;
*asShort = 2; // the object '1' ends its life, because I reassigned to its storage
              // the object '2' is now alive, trivial constructor + assignment
free(mem);    // the object '2' ends its life because its storage was released

Now, for something which is not so clear:

{
    int asInt = 3; // the object '3' is now alive, trivial constructor + assignment
    short* asShort = (short*)&asInt; // just creating a pointer
    *asShort = 4; // the object '3' ends its life, because I reassigned to its storage
                  // the object '4' is now alive, trivial constructor + assignment
    // implicitly, asInt->~int() gets called here, as a trivial destructor
}   // the object '4' ends its life, because its storage was released

§6.7.2 states that objects of automatic storage duration are destroyed at the end of the scope, indicating that the destructor gets called. If there is an int to destroy, *asShort = 4 is an aliasing violation because I am dereferencing a pointer of unrelated type. But if the integer's lifespan ended before *asShort = 4, then I am calling an int destructor on a short.

I see several competing sections regarding this:

§3.8.8 reads

If a program ends the lifetime of an object of type T with static (3.7.1), thread (3.7.2), or automatic (3.7.3) storage duration and if T has a non-trivial destructor,39 the program must ensure that an object of the original type occupies that same storage location when the implicit destructor call takes place; otherwise the behavior of the program is undefined.

The fact that they call out types T with non-trivial destructor as yielding undefined behavior seems, to me, to indicate that having a different type in that storage location with a trivial destructor is defined, but I couldn't find anywhere in the spec that defined that.

Such a definition would be easy if a trivial destructor was defined to be a noop, but there's remarkably little in the spec about them.

§6.7.3 indicates that goto's are allowed to jump into and out of scopes whose variables have trivial constructors and trivial destructors. This seems to suggest a pattern where trivial destructors are allowed to be skipped, but the earlier section from the spec on destroying objects at the end of the scope mentions none of this.
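A minimal sketch of the jump that section permits, assuming the C++11 wording: a goto may bypass the declaration of a scalar variable as long as the declaration has no initializer. The function name `jump_past_declaration` is hypothetical and only serves the illustration.

```cpp
// Hypothetical helper illustrating a jump past a declaration.
int jump_past_declaration() {
    goto skip;   // allowed: x is a scalar declared without an initializer
    int x;       // with an initializer (int x = 0;) the jump would be ill-formed
skip:
    x = 5;       // x is in scope here; assign before reading it
    return x;
}
```

This only shows that the initialization of a trivial type may be bypassed; it does not by itself say what happens to the implicit destructor call at the end of the scope.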

Finally, there's the sassy reading:

§3.8.1 indicates that I am allowed to start an object's lifespan any time I want, if its constructor is trivial. This seems to indicate that I could do something like

{
    int asInt = 3;
    short* asShort = (short*)&asInt;
    *asShort = 4; // the object '4' is now alive, trivial constructor + assignment
    // I declare that an object in the storage of &asInt of type int is
    // created with an undefined value.  Doing so reuses the space of
    // the object '4', ending its life.

    // implicitly, asInt->~int() gets called here, as a trivial destructor
}

The only one of these readings that seems to suggest any aliasing issues is §6.7.2 on its own. It seems like, when read as part of a whole spec, the trivial destructor should not affect the program in any way (though for various reasons). Does anyone know what happens in this situation?
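For comparison, here is a sketch of the last reading in which every lifetime change is spelled out with placement new instead of a cast and an assignment. This is not the code from the question: the helper name `reuse_and_restore` is hypothetical, and the `static_assert` states the assumption that storage aligned for int is also suitably aligned for short.

```cpp
#include <new>  // placement new

// Hypothetical helper: makes each lifetime change explicit with placement new.
int reuse_and_restore() {
    static_assert(alignof(short) <= alignof(int),
                  "assumption: int storage is suitably aligned for short");
    int asInt = 3;
    short* asShort = new (&asInt) short(4);  // reuses the storage: the int's lifetime ends
    short seen = *asShort;                   // read through the type that was written
    int* restored = new (&asInt) int(seen);  // reuses it again: the short's lifetime ends
    return *restored;                        // an int occupies the storage at scope exit
}
```

Written this way, an object of the original type is back in place when the scope ends, so the question of what the trivial destructor sees does not arise. Whether the cast-and-assign form in the question is equivalent is exactly what is being asked.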

Cort Ammon
  • This issue arose as part of a related discussion on aliasing: http://stackoverflow.com/questions/18624449/shared-memory-buffers-in-c-without-violating-strict-aliasing-rules – Cort Ammon Sep 06 '13 at 22:15
  • Sorry for being dense, but what does this question have to do with aliasing? – Kerrek SB Sep 06 '13 at 22:27
  • @KerrekSB: Clearly `asInt` and `asShort` alias. – Ben Voigt Sep 06 '13 at 23:00
  • Depending on what the compiler does with the trivial destructor, I may have a case where I simultaneously refer to the same location as a short and as an int, as opposed to it being an int, and then a short. – Cort Ammon Sep 06 '13 at 23:01
  • @BenVoigt: I can't see the connection. Aliasing is "accessing the stored value" of an object through a glvalue of mismatching type. This isn't happening here. One object's lifetime ends, and another one's begins. I agree that this is an awkward part of the standard, but I would like to see a clearer description of the perceived problem or contradiction in the standard. – Kerrek SB Sep 06 '13 at 23:25
  • @KerrekSB: No, what you describe is "type-punning". [*Aliasing*](http://en.wikipedia.org/wiki/Pointer_aliasing) is simply two handles (pointer or reference doesn't matter) addressing overlapping memory regions. – Ben Voigt Sep 06 '13 at 23:34
  • @BenVoigt: Hm, I see. So what part of the C++ standard describes how and when this is a problem? – Kerrek SB Sep 06 '13 at 23:37
  • @GManNickG: I thought Ben said that's about type punning, not aliasing? – Kerrek SB Sep 07 '13 at 00:28
  • @KerrekSB: I bolded a section to make it more clear what I believe is the perceived problem. – Cort Ammon Sep 07 '13 at 00:36
  • @KerrekSB: Oh, I see. There's overlapping terminology here. Aliasing is indeed when two references or pointers refer to overlapping memory locations, the obvious case being something like `void foo(int&, int&); int x; foo(x, x);`; the two arguments will alias the same thing, no UB of course. The standard recognizes this terminology (in the context of optimization, for example). Type punning is not in the standard, but that's when two aliases (as previously defined) exist to the same object but through different types. The standard calls this UB, and the violation is that of *strict* aliasing. – GManNickG Sep 07 '13 at 00:52
  • @KerrekSB: GMan has explained it well. Of course, violations of strict aliasing are not the only ways that aliasing can cause problems. For example, you must not pass parameters to `memcpy` which alias, even if you respect the strict aliasing requirement. – Ben Voigt Sep 07 '13 at 01:27
  • Can someone define UB for me? I've seen that acronym quite a few times, but it's still not occurring to me what UB actually stands for – Cort Ammon Sep 07 '13 at 03:55
  • @CortAmmon: Undefined behavior. Essentially the behavior of your program can be anything, even for things which have an "obvious" definition. For example, reading from an uninitialized variable is undefined behavior. So in `int foo(bool b) { switch (b) { case true: return 0; case false: return 1; default: return -1; } } int main() { bool b; return foo(b); }`, the program could return 0, 1, or even -1 (or even 111605; anything goes!). It doesn't matter that `bool` is "obviously" either `true` or `false`, the default case is still possible and has happened in GCC with optimizations. – GManNickG Sep 07 '13 at 05:57
  • @BenVoigt: Right, I think we've been on the same page all along. What I fail to see is how `int x; short * p = (short*)(&x); *p = 2;` causes any problems as far as the standard is concerned, whether it be aliasing, type punning or object-lifetime-wise. – Kerrek SB Sep 07 '13 at 10:49
  • I suppose a related question would be whether there is anything wrong with `((int*)malloc(sizeof(int)))->~int();` In both cases, there is an int destructor being called on something that may not be an int object. – Cort Ammon Sep 07 '13 at 16:03
  • @Kerrek: That doesn't cause a write to `x`, as far as the compiler's dependency analysis is concerned. – Ben Voigt Sep 07 '13 at 19:27

2 Answers


In your second code snippet:

{
    int asInt = 3; // the object '3' is now alive, trivial constructor + assignment
    short* asShort = (short*)&asInt; // just creating a pointer
    *asShort = 4; 
    // Violation of strict aliasing. Undefined behavior. End of.
}

The same applies to your first code snippet. It is not "safe", but it will generally work because (a) there's no particular reason for a compiler to be implemented such that it doesn't work, and (b) in practice compilers have to support at least a few violations of strict aliasing or else it would be impossible to implement the memory allocator using the compiler.

The thing that I know can and does provoke compilers to break this kind of code is reading asInt afterwards: the compiler's data-flow analysis (DFA) is allowed to "detect" that asInt is not modified (since it's modified only by the strict-alias violation, which is UB), and move the initialization of asInt after the write to *asShort. That's UB by either of our interpretations of the standard though -- in my interpretation because of the strict aliasing violation and in your interpretation because asInt is read after the end of its lifetime. So we're both happy for that not to work.

However I don't agree with your interpretation. If you consider that assigning to part of the storage of asInt ends the lifetime of asInt, then that's a direct contradiction of the statement that the lifetime of an automatic object is its scope. OK, so we might accept that this is an exception to the general rule. But that would mean that the following is not valid:

{
    int asInt = 0;
    unsigned char *asChar = (unsigned char*)&asInt;
    *asChar = 0; // I've assigned the storage, so I've ended the lifetime, right?
    std::cout << asInt; // using an object after end of lifetime, undefined behavior!
}

Except that the whole point of allowing unsigned char as an aliasing type (and of defining that all-bits-0 means "0" for integer types) is to make code like this work. So I'm very reluctant to adopt any interpretation of the standard which implies that this doesn't work.
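A small sketch of that unsigned char case as a function whose result can be checked. The name `clear_through_bytes` is hypothetical; the only facts it relies on are the ones stated above, namely that unsigned char may alias any object and that all bits zero represents 0 for integer types.

```cpp
#include <cstddef>

// Hypothetical helper: clears an int one byte at a time through unsigned char.
int clear_through_bytes(int value) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof value; ++i) {
        bytes[i] = 0;  // access through unsigned char is on the 3.10/10 list
    }
    return value;      // the int is read back through its own type
}
```

If writing through the unsigned char pointer ended the lifetime of the int, the final read would be undefined, which is the consequence this answer argues against.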

Ben gives another interpretation in comments below, that the *asShort assignment simply doesn't end the lifetime of asInt.

Steve Jessop
  • Can you expand on (b)? I can't think of a scenario, but it's probably just slipping my mind. – GManNickG Sep 06 '13 at 22:42
  • @GManNickG: Haven't checked C++11, but C++03 doesn't actually say that you can access a `char` object or array using an lvalue of type `int`, it only says the opposite. Maybe I'm mistaken, but I think that threatens the foundations. – Steve Jessop Sep 06 '13 at 22:53
  • The object lifetime rules are beautiful legalese which unfortunately contradict each other. The whole "an object with trivial initialization begins its lifetime as soon as storage (of appropriate size and alignment) is obtained" is especially problematic, because it says that all types with trivial initialization are spammed all over your memory space at all times. – Ben Voigt Sep 06 '13 at 22:55
  • @Steve: According to the lifetime rules, every char array of at least size `sizeof (int)` IS a living `int` object. – Ben Voigt Sep 06 '13 at 22:56
  • @BenVoigt: or maybe a conforming compiler has time-warping capabilities, so that only the one you actually use it as exists and has existed all along ;-) But if it's true that every big enough sequence of bytes is an `int`, then it's untrue that re-using the storage ends that lifetime. So there's still a statement in the standard that requires a bit of "sympathetic reading", the question is which one... – Steve Jessop Sep 06 '13 at 22:56
  • What if we replace that rule by "an object of type `T` with trivial initialization begins its lifetime whenever an lvalue of type (possibly *cv-qualified*) `T&` is formed in reference to storage of appropriate size and alignment". That would prevent the overlapping objects, and define exactly when the old object is killed off through reuse of storage, which currently is a very underdefined concept. – Ben Voigt Sep 06 '13 at 22:58
  • Oh, but that would also kill off aliasing using `char` and `unsigned char` :( – Ben Voigt Sep 06 '13 at 23:00
  • @SteveJessop: Oh interesting, never noticed (C++11 contains the same caveat). Though you could use placement new to get a `T` from the memory. – GManNickG Sep 06 '13 at 23:01
  • @BenVoigt: hmm, I think re-writing the standard is a bit above my pay grade, especially as I've been drinking. In an ideal world all of this code is legal: maybe the trick would be to say that trivially-destructible objects can be written to by a strict-alias-violating type, and thereafter hold an indeterminate value. Since that's what actually happens when compilers optimize -- if I overwrite a float via an `int*` and end up with the float initialization re-ordered after the `int` write, I never actually see those nasal demons, I just get unexpected values when viewing that memory. – Steve Jessop Sep 06 '13 at 23:03
  • In the case of char*, 3.10.10 indicates that I may access the value as a char* without causing undefined behavior. In this case, the int value could be treated as living the entire time, and being accessed through a char*, rather than creating a char at that memory location. However, I do concede that there seems to be no syntax indicating which "reading" of your code the compiler is required to follow. I intentionally used non-char cases in my example because 3.10.10 gives them so much leeway – Cort Ammon Sep 06 '13 at 23:07
  • Oh, and automatic objects do not live the life of the scope. Their storage exists for the entire scope, they are initialized at the start, and destroyed at the end. For non-trivial ctor or dtor, this means they have to live for the entire scope (or at least have to be of that type at dtor time...). For those with trivial ctor/dtor, there are other ways to start and end lifespans besides initialization and destruction, as per 3.8.1 – Cort Ammon Sep 06 '13 at 23:12
  • @CortAmmon: probably the answer to your actual question, about trivial destructors, is that they don't do anything and in particular they don't require an object to exist at the point they nominally are executed. This is as good as explicit in your quotation of 3.8.8: it says that non-trivially destructible types *do* require an object, so it can be understood that trivial ones don't. Standardese is sometimes complex, but we're allowed to assume that it is not deliberately misleading :-) – Steve Jessop Sep 06 '13 at 23:17

I cannot say I have all the answers, as this is a part of the standard that I have tried hard to digest and it is non-trivial (euphemism for really complicated). Still, since I disagree with the answer by Steve Jessop, here is my take.

#include <iostream>
#include <new>      // placement new

void f() {
   alignas(alignof(int)) char buffer[sizeof(int)];
   int *ip = new (buffer) int(1);                 // 1
   std::cout << *ip << '\n';                      // 2
   short *sp = new (buffer) short(2);             // 3
   std::cout << *sp << '\n';                      // 4
}

The behavior of that function is well defined and guaranteed by the standard. There is no problem with the strict aliasing rules at all. The rules determine when it is safe to read the value written to a variable. In the code above, the read in [2] extracts the value written in [1] through an object of the same type. The placement new in [1] reuses the memory of the chars and terminates their lifetime, so an object of type int comes into existence in the space previously taken by the chars. The strict aliasing rules don't have a problem with that since the read is with a pointer of the same type. In [3], a short is written over the memory previously occupied by the int, reusing the storage. The int is gone and a short starts its lifetime. Again the read in [4] is through a pointer of the same type that was used to store the value, and is perfectly fine by the aliasing rules.

The key at this point is the first sentence of the aliasing rules: 3.10/10 If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

Regarding the lifetime of objects, and in particular when the lifetime of an object ends, the quote you provide is not complete. It is perfectly fine for a destructor not to run as long as the program does not depend on the destructor being run. This only matters to some extent, but I think it is important to make it clear. While not explicitly stated as such, the fact is that a trivial destructor is a no-op (this can be derived from the definition of what a trivial destructor is). [See edit below]. The quote in 3.8/8 means that if you have an object with a trivial destructor, for example any of the fundamental types with static storage, you can reuse the memory as shown above and this won't cause undefined behavior (by itself). The premise is that since the destructor for the type is trivial, it is a no-op and what is currently living on that location is not important for the program. (At this point, if the type of what was stored over that location is trivially destructible, or if the program does not depend on its destructor being run, the program will be well defined; if the program behavior depends on the destructor of the overwriting type being run, well, tough luck: UB.)


Trivial destructor

The standard (C++11) defines a destructor as trivial in 12.4/5:

A destructor is trivial if it is not user-provided and if:

— the destructor is not virtual,

— all of the direct base classes of its class have trivial destructors, and

— for all of the non-static data members of its class that are of class type (or array thereof), each such class has a trivial destructor.

The requirements can be rewritten as: the destructor is implicitly defined and not virtual, none of the subobjects has a non-trivial destructor. The first requirement means that dynamic dispatch is not needed for the destructor call, and that makes the value of the vptr not needed to start the destruction chain.

An implicitly defined destructor won't do anything at all for any non-class type (fundamental types, enums), but will call the destructors of the class members and bases. This means that none of the data stored in the complete object will be touched by the destructors, since after all everything is composed of members of fundamental types. From this description it could seem that a trivial destructor is a no-op since no data is touched. But that is not the case.

The detail that I misremembered is that the requirement is not that there are no virtual functions at all, but rather that the destructor is not virtual. So a type can have a virtual function and also a trivial destructor. The implication is that, at least conceptually, the destructor is not a no-op, since the vptr (or vptrs) present in the complete objects are updated during the chain of destruction as the type changes. Now, while a trivial destructor may conceptually not be a no-op, the only side effect of evaluating the destructor would be the modification of the vptrs, which is not visible. Following the as-if rule, the compiler can therefore make the trivial destructor an effective no-op (i.e. it may generate no code at all), and that is what compilers actually do: a trivial destructor won't have any generated code.
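The 12.4/5 conditions can be checked with the standard trait `std::is_trivially_destructible` from `<type_traits>`. The sketch below uses three hypothetical types, introduced only for this illustration, to show that a virtual function alone does not make the destructor non-trivial, while a virtual destructor or a member with a non-trivial destructor does.

```cpp
#include <string>
#include <type_traits>

// Hypothetical types, used only to query the trait.
struct WithVirtualFunction {
    virtual void f() {}                  // virtual function, implicit non-virtual destructor
};
struct WithVirtualDestructor {
    virtual ~WithVirtualDestructor() {}  // user-provided and virtual
};
struct WithStringMember {
    std::string s;                       // member whose destructor is non-trivial
};

static_assert(std::is_trivially_destructible<int>::value,
              "fundamental types are trivially destructible");
static_assert(std::is_trivially_destructible<WithVirtualFunction>::value,
              "a virtual function does not make the destructor non-trivial");
static_assert(!std::is_trivially_destructible<WithVirtualDestructor>::value,
              "a virtual, user-provided destructor is not trivial");
static_assert(!std::is_trivially_destructible<WithStringMember>::value,
              "a member with a non-trivial destructor propagates");
```

The checks run at compile time, so a translation unit containing them compiles only if all four hold.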

David Rodríguez - dribeas
  • That answer makes a lot of sense, but do you happen to know what section they define a trivial destructor? I kept searching the spec and coming up dry. – Cort Ammon Sep 07 '13 at 03:54
  • @CortAmmon: I am reading the definition of what a trivial destructor is and it is not like what I remembered (also cross-checked with C++03). The above statement that a trivial destructor is a no-op is *false*, will update the answer with this info. – David Rodríguez - dribeas Sep 07 '13 at 04:01
  • @CortAmmon: Updated the answer with the *trivial destructor* info. – David Rodríguez - dribeas Sep 07 '13 at 04:41
  • I think you're (in effect) saying that my misreading of the standard is quite simple. I think that writing an object accesses its stored value ("write" being a form of "access", the other form being "read"). You think it doesn't. You may well be correct, but I'm too lazy to grep the standard for anything that might clarify. In particular, if there's a distinction between accessing a value and accessing the storage containing that value, then you're right that 3.10/10 refers only to the former, whereas writing is the latter. – Steve Jessop Sep 07 '13 at 08:43
  • Also: in your answer you carefully use placement new to ensure that you are unambiguously re-using the memory to create a new object with a new lifetime. Typical code doesn't do that, it does what the questioner's code does, so it would be useful to know whether your arguments apply in full to the questioner's code rather than just your own. – Steve Jessop Sep 07 '13 at 08:48
  • @Steve: Well, I meant to comment on the placement new, but then forgot. All this is treading on a rope over a cliff. I wanted to avoid `reinterpret_cast`, since that is very loosely defined in the standard, but I believe that if you add to the code above another set through the `ip` pointer and a read through it (plain write, no placement new) it is still well defined. Regarding what *access* means, I do interpret that to be read, since it is not dealing with the variable but the value: *access the **stored value*** (not variable) sounds like *read* to me. – David Rodríguez - dribeas Sep 08 '13 at 02:02