
I am curious how the following situation is interpreted under the current C++ standard, especially with respect to object lifetimes. Is it undefined behaviour?

First, let's start with a definition: a relocatable object is an object that is invariant with respect to its actual memory location; that is, its state stays the same regardless of the value of the pointer this. Assume that we have a relocatable type Relocatable (its definition is irrelevant for the example).

Then we have the following code (C++17):

typedef std::aligned_storage_t<sizeof(Relocatable)> Storage;

// construct an instance of a relocatable within a storage
auto storage0 = new Storage();
new(storage0) Relocatable(...);

{ 
  // obj is a valid reference
  // should use std::launder() here, but clang doesn't have it yet
  Relocatable& obj = *reinterpret_cast<Relocatable*>(storage0);
}

// move the storage
auto storage1 = new Storage();
memcpy(storage1, storage0, sizeof(Storage));
delete storage0;

{ 
  // ?????? what does the standard say about this?
  Relocatable& obj = *reinterpret_cast<Relocatable*>(storage1);
}

This works as expected with both GCC and Clang (the object simply continues to exist in the new storage). However, I am not entirely sure whether the standard is OK with this. Technically, the lifetime of the object has not ended (the destructor has not been called), and there has been no access to the object in the old location after the memcpy() call. Also, no references or pointers to the old location exist. Still, given that C++ seems to treat object identity and object storage as the same thing most of the time, there might be a reason why this is prohibited. Thanks in advance for all the insightful comments.

Edit: It has been suggested that this question is a duplicate of "Why would the behavior of std::memcpy be undefined for objects that are not TriviallyCopyable?". I am not sure it is. First of all, I am memcpying the storage, not the object instance. Second, std::is_trivially_copyable<Relocatable>::value actually evaluates to true for all practically relevant applications.

P.S. There is actually a good practical reason why I am asking this. Sometimes it is useful to have objects that can only exist within their container: they are neither copyable nor movable. For instance, I am currently designing an optimized tree data structure with such properties. Tree nodes can only exist within the tree storage; they can't be moved out or copied, and all operations on them are carried out via short-lived references. To prevent programmer mistakes (accidental copies or moves), I am deleting both the copy and the move constructor. This has the rather unfortunate consequence that the nodes can't be stored within a std::vector. Placement new and explicitly managed storage can be used to bypass this limitation, but of course I wouldn't want to do something that is not OK according to the standard.
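For illustration, here is a minimal sketch of the kind of "pinned" node type described above, assuming C++17. The name Node, its fields, and the helper node_is_pinned() are hypothetical and not taken from the real tree implementation; the point is only to show what deleting the copy and move operations does to the type traits.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical node type: the name and fields are illustrative only.
// Children are referenced by offsets into the container, not by pointers.
struct Node {
    std::uint32_t left_offset = 0;
    std::uint32_t right_offset = 0;
    int payload = 0;

    Node() = default;
    // Deleting these prevents accidental copies or moves out of the container.
    Node(const Node&) = delete;
    Node(Node&&) = delete;
    Node& operator=(const Node&) = delete;
    Node& operator=(Node&&) = delete;
};

// Hypothetical helper: true if Node can be neither copied nor moved.
constexpr bool node_is_pinned() {
    return !std::is_copy_constructible_v<Node> &&
           !std::is_move_constructible_v<Node> &&
           !std::is_copy_assignable_v<Node> &&
           !std::is_move_assignable_v<Node>;
}

static_assert(node_is_pinned(), "Node must not be copyable or movable");
```

Operations that grow a std::vector need to move or copy the existing elements, which is why a type like this does not work with them.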

MrMobster
  • Possible duplicate of [Why would the behavior of std::memcpy be undefined for objects that are not TriviallyCopyable?](https://stackoverflow.com/questions/29777492/why-would-the-behavior-of-stdmemcpy-be-undefined-for-objects-that-are-not-triv) – Passer By Mar 15 '18 at 11:13
  • There is no such thing as relocatable in C++ (for now anyways). There is only trivially copyable. Destructors are also not a necessary condition for an object's lifetime to end – Passer By Mar 15 '18 at 11:13
  • *An object is a region of storage*. Yes, the standard treats an object's address as its identity. The object does not continue to exist at a new address; that is impossible, and anything at a new address is a new object. Why is this ever interesting or important? Your tree nodes cannot live in a resizable array if there are pointers to them in other nodes, because a resizable array moves its contents around and pointers don't track their pointees. This is true regardless of language or what fine semantics you ascribe to notions of moving and copying. – n. m. could be an AI Mar 15 '18 at 11:39
  • @n.m: of course they can live in resizable storage. It's obvious I wouldn't use actual pointers; node references are stored as compressed offsets (which also improves performance by ~10%). – MrMobster Mar 15 '18 at 12:50
  • Then what's the problem with actually using move constructors to move them? Without *hideous* calls to `reinterpret_cast` or `memcpy`? – n. m. could be an AI Mar 15 '18 at 12:56
  • @n.m: because a) it would technically allow the API user to move the entry out of the container (mostly cosmetic, but mistakes happen) and b) a move constructor would be trivial anyway, and page remapping is faster than copying. Writing a loop that invokes the move constructor for millions of objects just to tell the compiler that the object is alive sounds a bit wasteful... – MrMobster Mar 15 '18 at 12:59
  • @MrMobster You are not asking about the language when you deliberately use UB. Telling the compiler the object is alive is not "wasteful", it is _necessary_, as it allows some classes of optimizations – Passer By Mar 15 '18 at 13:05
  • Your edit didn't explain why it isn't a dupe. You can't _"memcpy the storage and not the object instance"_, you always `memcpy` the underlying bytes. If a type is trivially copyable, then you may `memcpy` it, if not, it will eventually be UB. __There is no relocatable in this statement, it is irrelevant as far as the language is concerned__. – Passer By Mar 15 '18 at 13:10
  • Once the users have their hands on your container, it is also technically possible for them to mess with the nodes with `memcpy`. Perhaps some abstraction and information hiding are in order. – n. m. could be an AI Mar 15 '18 at 13:10
  • @PasserBy I am certainly not arguing with that! What I am asking about is whether there is a better way of telling the compiler that the object is alive than invoking the move constructor of millions of objects (if not, it's a serious language deficiency IMO). As to trivially copyable... the standard library tells me that it is. If I look at the definition of TriviallyCopyable, it's not (since all constructors and assignment operators are deleted). So at least when clang and libc++ are considered, things are already quite awkward :) – MrMobster Mar 15 '18 at 13:14
  • If it is indeed the case that the library traits disagree with the definition, you might want to ask another question, but check carefully first. – Passer By Mar 15 '18 at 13:17

1 Answer


So, as with all of these kinds of questions, objects are only created in four situations:

An object is created by a definition ([basic.def]), by a new-expression, when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary]).
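As a rough illustration of those four mechanisms, here is a small sketch assuming C++17. The union IntOrFloat and the function four_ways_total() are made-up names used only for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical union used only to show the "active member" case.
union IntOrFloat {
    int i;
    float f;
};

// Touches one object created by each of the four mechanisms and sums them.
inline int four_ways_total() {
    int by_definition = 1;            // 1. created by a definition
    int* by_new = new int(2);         // 2. created by a new-expression

    IntOrFloat u;
    u.i = 3;                          // i becomes the active member
    u.f = 4.0f;                       // 3. assignment changes the active member,
                                      //    implicitly creating the float object

    // 4. std::string("abcde") is a temporary object that lives until the
    //    end of this full-expression.
    int from_temporary = static_cast<int>(std::string("abcde").size());

    int total = by_definition + *by_new + static_cast<int>(u.f) + from_temporary;
    delete by_new;
    return total;  // 1 + 2 + 4 + 5
}
```

Note that nothing in this list covers copying bytes into a buffer.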

This code:

auto storage1 = new Storage();
memcpy(storage1, storage0, sizeof(Storage));

gives you an object of type Storage at storage1, but no object of type Relocatable is ever created there. Hence, this:

Relocatable& obj = *reinterpret_cast<Relocatable*>(storage1);

is undefined behavior. Period.
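For a type that really is trivially copyable, one way to stay within the C++17 rules is to first create a Relocatable in the destination with placement new and only then copy the bytes into it. The sketch below assumes a stand-in Relocatable that is trivially copyable and trivially default constructible; relocate_to() and relocate_demo() are hypothetical names, and stack buffers replace the heap-allocated Storage of the question to keep the example short.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical stand-in for the question's Relocatable type.
struct Relocatable {
    int id;
    double weight;
};

static_assert(std::is_trivially_copyable_v<Relocatable>);
static_assert(std::is_trivially_default_constructible_v<Relocatable>);

// Creates a new Relocatable in the storage at `to`, then copies the bytes
// of `from` into it. Returns a pointer to the new object.
inline Relocatable* relocate_to(void* to, const Relocatable& from) {
    // The new-expression starts the lifetime of a Relocatable at `to`.
    // Default-initialization of this type performs no work at run time.
    Relocatable* dst = ::new (to) Relocatable;
    // Copying bytes between two distinct objects of a trivially copyable
    // type gives the destination the same value as the source.
    std::memcpy(dst, &from, sizeof(Relocatable));
    return dst;
}

inline int relocate_demo() {
    alignas(Relocatable) unsigned char storage0[sizeof(Relocatable)];
    alignas(Relocatable) unsigned char storage1[sizeof(Relocatable)];

    Relocatable* original = ::new (static_cast<void*>(storage0)) Relocatable{42, 1.5};
    Relocatable* moved = relocate_to(storage1, *original);
    return moved->id;
}
```

Because the default-initialization is trivial, a loop over many elements written this way should compile down to little more than the memcpy itself, though that depends on the optimizer.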


In order to define behavior for that, we need a fifth mechanism to create an object, such as what is proposed in P0593:

We propose that at minimum the following operations be specified as implicitly creating objects: [...]

  • A call to memmove behaves as if it

    1. copies the source storage to a temporary area

    2. implicitly creates objects in the destination storage, and then

    3. copies the temporary storage to the destination storage.

    This permits memmove to preserve the types of trivially-copyable objects, or to be used to reinterpret a byte representation of one object as that of another object.

  • A call to memcpy behaves the same as a call to memmove except that it introduces an overlap restriction between the source and destination.
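To make the proposed semantics concrete, here is a sketch of what the relocation could look like if memcpy implicitly created objects as described above. This assumes an implementation that follows the P0593 rules; under the C++17 wording discussed in this answer it is still undefined behavior. Relocatable is a hypothetical stand-in type, and relocate_implicit() and relocate_implicit_demo() are made-up names.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>

// Hypothetical stand-in for the question's Relocatable type.
struct Relocatable {
    int id;
    double weight;
};

static_assert(std::is_trivially_copyable_v<Relocatable>);

// ASSUMPTION: memcpy implicitly creates a Relocatable in the destination,
// as P0593 proposes. Without that rule, no Relocatable exists at `to`.
inline Relocatable* relocate_implicit(void* to, const void* from) {
    std::memcpy(to, from, sizeof(Relocatable));
    // std::launder obtains a pointer to the object that now lives at `to`.
    return std::launder(static_cast<Relocatable*>(to));
}

inline int relocate_implicit_demo() {
    alignas(Relocatable) unsigned char storage0[sizeof(Relocatable)];
    alignas(Relocatable) unsigned char storage1[sizeof(Relocatable)];

    ::new (static_cast<void*>(storage0)) Relocatable{7, 2.5};
    Relocatable* moved = relocate_implicit(storage1, storage0);
    return moved->id;
}
```

The only difference from the question's code is that the language rule, rather than an explicit new-expression, is what brings the destination object into existence.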

This proposal (or something like it) would be necessary to give your code defined behavior.

Barry
  • Great, exactly the kind of information I have been looking for! For now, I am implementing two code versions, one using move constructors and another assuming that realloc does initialisation (which seems to be the case with all major compilers as of now). Of course, one can argue that it's all moot, since loops involving move/copy constructors are optimised to memcpy anyway. However, there are scenarios where you can do virtual memory remapping, which can grow containers without any memory copy. I think what C++ needs is a collection of "unsafe" barriers (std::launder is a first step). – MrMobster Mar 18 '18 at 13:48