11

I am exploring the possibility of implementing true (partially) immutable data structures in C++. As C++ does not seem to distinguish between a variable and the object that variable stores, the only way to truly replace the object (without an assignment operation!) is to use placement new:

auto var = Immutable(state0);
// the following is illegal as it requires assignment to
// an immutable object
var = Immutable(state1);
// however, the following would work as it constructs a new object
// in place of the old one
new (&var) Immutable(state1);

Assuming that there is no non-trivial destructor to run, is this legal in C++ or should I expect undefined behaviour? If it's standard-dependent, what are the minimal and maximal standard versions where I can expect this to work?

Addendum: since it seems people still read this in 2019, a quick note: this pattern is legal in modern C++ (C++17 and later) using std::launder().
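The std::launder route mentioned in the addendum might look roughly like the sketch below. It assumes C++17 or later. The raw `storage` buffer and the function name `replace_and_read` are illustrative choices that do not appear in the original code; the buffer is used instead of a named variable so the example stays clear of the rules about reusing the name of the old object.

```cpp
#include <new>  // placement new and std::launder (C++17)

struct Immutable {
    const int x;
    Immutable(int val) : x(val) {}
};

// Hypothetical helper: builds an object, replaces it in place, reads the new one.
int replace_and_read() {
    // Raw storage with the size and alignment of Immutable (assumption of this sketch).
    alignas(Immutable) unsigned char storage[sizeof(Immutable)];

    // First object lives in the buffer.
    ::new (static_cast<void*>(storage)) Immutable(0);

    // Reusing the storage ends the first object's lifetime and starts a new one.
    // Immutable has a trivial destructor, so nothing needs to run beforehand.
    ::new (static_cast<void*>(storage)) Immutable(1);

    // std::launder yields a pointer that refers to the object currently
    // living at this address, even though it has a const member.
    Immutable* current = std::launder(reinterpret_cast<Immutable*>(storage));
    return current->x;
}
```

Calling `replace_and_read()` reads the `x` of the second object. Keeping the pointer returned by placement new, as the answers below suggest, avoids the need for std::launder altogether.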

Deduplicator
MrMobster
  • Can you clarify what you mean by "distinguish between a variable and the object that variable stores"? Because objects are certainly distinct from variables in general (e.g. objects that don't correspond to variables, or indirection via pointers). – Oliver Charlesworth Mar 24 '17 at 10:48
  • Why do you want to reassign to `var`? Why not have it as a pointer, which can be set to the address of different instances of `Immutable` as needed? – Peter Mar 24 '17 at 10:48
  • If you need to change it, it's not immutable. – Pete Becker Mar 24 '17 at 10:50
  • @PeteBecker this is not correct. There is a difference between mutability of the variable (binding) and mutability of the object itself. It is often desirable, of considerations of safety and efficiency, to prevent mutability of the object, but still allow the variable itself to be rebound to a different object. Note that I am not mutating the object, I am completely discarding the old object and binding a new object. – MrMobster Mar 24 '17 at 11:35
  • C++ reference semantics are pointers or references. Variables are values. And how you implement immutability matters to answer. – Yakk - Adam Nevraumont Mar 24 '17 at 13:32
  • @MrMobster If you use the same variable name, it's the same object. – curiousguy Mar 25 '17 at 23:24

3 Answers

12

What you wrote is technically legal but almost certainly useless.

Suppose

struct Immutable {
  const int x;
  Immutable(int val):x(val) {}
};

for our really simple immutable type.

auto var = Immutable(0);
::new (&var) Immutable(1);

this is perfectly legal.

And useless, because you cannot use var to refer to the state of the Immutable(1) you stored within it after the placement new. Any such access is undefined behavior.

You can do this:

auto var = Immutable(0);
auto* pvar1 = ::new (&var) Immutable(1);

and access to *pvar1 is legal. You can even do:

auto var = Immutable(0);
auto& var1 = *(::new (&var) Immutable(1));

but under no circumstance may you ever refer to var after you placement new'd over it.

Actual const data in C++ is a promise to the compiler that you'll never, ever change the value. This is in contrast to references to const or pointers to const, which are just a suggestion that you won't modify the data through them.

Members of structures declared const are "actually const". The compiler will presume they are never modified, and won't bother to prove it.

Creating a new instance in the spot where an old one lived in effect violates this assumption.

You are permitted to do this, but you cannot use the old names or pointers to refer to it. C++ lets you shoot yourself in the foot. Go right ahead, we dare you.

This is why this technique is legal, but almost completely useless. A good optimizer using static single assignment already knows that you stop using var at that point, so if you instead write

auto var1 = Immutable(1);

it could very well reuse the storage.


Calling placement new on top of another variable is usually defined behaviour. It is usually a bad idea, and it is fragile.

Doing so ends the lifetime of the old object without calling the destructor. References and pointers to the old object, as well as its name, refer to the new one only if some specific conditions hold (exact same type, no const problems).

Modifying data declared const, or a class containing const fields, results in undefined behaviour at the drop of a hat. This includes ending the lifetime of an automatic storage field declared const and creating a new object at that location. The old names and pointers and references are not safe to use.

[Basic.life 3.8]/8:

If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:

  • (8.1) the storage for the new object exactly overlays the storage location which the original object occupied, and

  • (8.2) the new object is of the same type as the original object (ignoring the top-level cv-qualifiers), and

  • (8.3) the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type, and

  • (8.4) the original object was a most derived object (1.8) of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).

In short, if your immutability is encoded via const members, using the old name or pointers to the old content is undefined behavior.

You may use the return value of placement new to refer to the new object, and nothing else.


The possibility of exceptions makes it extremely difficult to avoid code that either executes undefined behaviour or has to exit summarily.

If you want reference semantics, either use a smart pointer to a const object or an optional const object. Both handle object lifetime. The first requires heap allocation but permits move (and possibly shared references); the second permits automatic storage. Both move manual object lifetime management out of business logic. Both are nullable, but robustly avoiding that is difficult when you manage lifetimes manually anyhow.
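A minimal sketch of both options, assuming C++17 for std::optional. The `Immutable` struct mirrors the one at the top of this answer; the function names are made up for illustration.

```cpp
#include <memory>
#include <optional>
#include <utility>

struct Immutable {
    const int x;
    Immutable(int val) : x(val) {}
};

// Option 1: a smart pointer to a const object. Rebinding replaces the
// pointee; the old object is destroyed by the smart pointer.
int rebind_with_unique_ptr() {
    std::unique_ptr<const Immutable> p = std::make_unique<Immutable>(0);
    p = std::make_unique<Immutable>(1);  // no assignment to Immutable itself
    return p->x;
}

// Option 2: an optional const object in automatic storage. emplace()
// destroys the old value and constructs a new one in the same slot.
int rebind_with_optional() {
    std::optional<const Immutable> slot(std::in_place, 0);
    slot.emplace(1);
    return slot->x;
}
```

In both cases the object lifetime bookkeeping is done by the library type, so the calling code never has to reason about which name or pointer is still valid after the replacement.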

Also consider copy on write pointers that permit logically const data with mutation for efficiency purposes.
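One way a copy-on-write pointer could be sketched is shown below. `CowPtr`, `read`, `write`, and `cow_demo` are hypothetical names, and the sketch is single-threaded only: the `use_count()` check is not a safe sharing test when several threads are involved.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical copy-on-write handle: copies share one value until a writer
// asks for mutable access, at which point that writer gets a private copy.
template <class T>
class CowPtr {
    std::shared_ptr<T> data_;
public:
    explicit CowPtr(T value) : data_(std::make_shared<T>(std::move(value))) {}

    // Read access never copies.
    const T& read() const { return *data_; }

    // Write access clones the value first if it is currently shared.
    T& write() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<T>(*data_);
        }
        return *data_;
    }
};

// Small demonstration: modifying one handle leaves the other untouched.
bool cow_demo() {
    CowPtr<std::string> a(std::string("hello"));
    CowPtr<std::string> b = a;   // shares storage with a
    b.write() += " world";       // b detaches before mutating
    return a.read() == "hello" && b.read() == "hello world";
}
```

Readers see a logically constant value, while the occasional writer pays for a copy only when the value is actually shared.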

Yakk - Adam Nevraumont
  • Fully agree with your first paragraph. But given my particular use case (this is for implementing library-internal fat iterator/accessor pattern that can abstract underlaying storage), using object replacement of this sort greatly simplifies the semantics and actually makes the code more safe. In particular, iterators cannot be implicitly copied and can only be moved under very restricted set of conditions. No reference to an iterator can exist. This all ensures that the compiler can use aggressive optimisations (and I see improvements of 10+% after moving to this pattern). – MrMobster Mar 24 '17 at 14:01
  • @MrMobster Except by having members that are `const` you are almost certainly doing undefined behavior via your "create a new object" plan. The optimizations you are seeing is the compiler assuming that const data **never changes**; you are changing it, violating those assumptions. Standard quote added. That is the only clause that lets you use the name of the variable you created a new value in after you destroy the old one. Like I said, bad idea. – Yakk - Adam Nevraumont Mar 24 '17 at 14:34
  • @MrMobster I have explicitly and directly addressed what I suspect your use case is, and why your use case leads to undefined behavior while your question in the OP does not. – Yakk - Adam Nevraumont Mar 24 '17 at 14:44
  • Ok, thanks for the clarification! If I understand it correctly, its the point 8.3 that makes what I want to use illegal. So there is no way to have mutable bindings with immutable objects after all in C++. Its a shame, but I can live with it. – MrMobster Mar 24 '17 at 14:53
  • the rules [Basic.life 3.8]/8 you quoted have changed now in C++, so it is fine to use the old pointer even if the members are const/reference. – minex Jul 20 '23 at 06:58
  • @minex Ah, and see https://stackoverflow.com/questions/59298904/undead-objects-basic-life-8-why-is-reference-rebinding-and-const-modificat for discussion about why and when it happened. – Yakk - Adam Nevraumont Jul 20 '23 at 13:25
2

From the C++ standard draft N4296:

3.8 Object lifetime
[...]
The lifetime of an object of type T ends when:
(1.3) — if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
(1.4) — the storage which the object occupies is reused or released.
[...]
4 A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor. For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.

So yes, you can end the lifetime of an object by reusing its memory, even one with a non-trivial destructor, as long as you don't depend on the side effects of the destructor call.
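When the destructor's side effects do matter, the conservative approach is to call the destructor explicitly before reusing the storage. The sketch below illustrates that; `Tracked`, its `destroyed` counter, and the function name are invented for the example.

```cpp
#include <new>

// Hypothetical type whose destructor has an observable side effect.
struct Tracked {
    static int destroyed;
    int id;
    explicit Tracked(int i) : id(i) {}
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

int reuse_with_explicit_destruction() {
    alignas(Tracked) unsigned char storage[sizeof(Tracked)];

    Tracked* a = ::new (static_cast<void*>(storage)) Tracked(1);
    a->~Tracked();  // end the first lifetime explicitly, side effect runs

    // The storage is now free to hold a new object.
    Tracked* b = ::new (static_cast<void*>(storage)) Tracked(2);
    int id = b->id;
    b->~Tracked();  // placement-new objects are never destroyed implicitly
    return id;
}
```

Skipping the first `a->~Tracked()` call would still end the lifetime of the first object, but its destructor would never run.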

This applies when you have non-const instances of objects like struct ImmutableBounds { const void* start; const void* end; }

alain
  • Modifying things which are actually const has specific rules – Yakk - Adam Nevraumont Mar 24 '17 at 13:29
  • Thanks for your comment, @Yakk. I interpreted OP's immutable objects as being objects with const members like he [commented here](http://stackoverflow.com/questions/42997440/is-it-legal-to-use-placement-new-on-initialised-memory/42997720?noredirect=1#comment73087087_42997785). For these I think the normal rules for object lifetime apply. Of course for a `const T t` this doesn't work, but I think the OP understands this and is not asking about it. – alain Mar 24 '17 at 13:43
  • @alain No, the member variables being `const` doesn't make it legal to use the variable name again. Well, the OP did nothing wrong, except they can never ever mention `var` in their code after the call to `new`, which seems like the next thing they'd want to do. – Yakk - Adam Nevraumont Mar 24 '17 at 14:36
  • Yep, somehow I missed that reusing `var` was the point and focused only on "reusing initialized storage". You answer the "broader" question very nicely, and it's good that the OP accepted your answer :-) – alain Mar 24 '17 at 15:45
0

You've actually asked 3 different questions :)

1. The contract of immutability

It's just that - a contract, not a language construct.

In Java, for instance, instances of the String class are immutable. But that means that all methods of the class have been designed to return new instances of the class rather than modifying the instance.

So if you would like to make Java's String into a mutable object, you couldn't, without having access to its source code.

Same applies to classes written in C++, or any other language. You have an option to create a wrapper (or use a Proxy pattern), but that's it.
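A rough C++ illustration of immutability by contract: the class below exposes no mutating member functions, and "changes" produce a new instance. `Point`, `with_x`, and `demo_rebinding` are hypothetical names chosen for this sketch.

```cpp
// Immutable by contract: state is private and every member function is const.
class Point {
    int x_;
    int y_;
public:
    Point(int x, int y) : x_(x), y_(y) {}

    int x() const { return x_; }
    int y() const { return y_; }

    // Returns a modified copy instead of changing *this.
    Point with_x(int new_x) const { return Point(new_x, y_); }
};

int demo_rebinding() {
    Point p(1, 2);
    Point original = p;   // independent copy of the old value
    p = p.with_x(10);     // the variable now holds a new value
    return p.x() * 100 + original.x();
}
```

Because the members are not declared const, the implicitly generated assignment operator is still available, so the variable can hold a different value later without any placement new.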

2. Using placement new to construct into an already allocated area of memory.

That's actually what it was created to do in the first place. The most common use case for placement new is memory pools: you preallocate a large memory buffer, and then you construct your objects in it.
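A very small pool along those lines could be sketched as follows. `Arena`, `create`, and `arena_demo` are hypothetical names, the 256-byte capacity is arbitrary, and the sketch only suits trivially destructible types because it never runs destructors.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

// Hypothetical bump allocator: one preallocated buffer, objects are
// constructed into it with placement new and never freed individually.
class Arena {
    alignas(std::max_align_t) unsigned char buffer_[256];
    std::size_t used_ = 0;
public:
    template <class T, class... Args>
    T* create(Args&&... args) {
        // Round the current offset up to the alignment T requires.
        std::size_t start = (used_ + alignof(T) - 1) / alignof(T) * alignof(T);
        if (start + sizeof(T) > sizeof(buffer_)) {
            return nullptr;  // out of space
        }
        used_ = start + sizeof(T);
        return ::new (static_cast<void*>(buffer_ + start)) T(std::forward<Args>(args)...);
    }
};

bool arena_demo() {
    Arena arena;
    int* i = arena.create<int>(7);
    double* d = arena.create<double>(2.5);
    if (i == nullptr || d == nullptr) return false;
    bool aligned = reinterpret_cast<std::uintptr_t>(d) % alignof(double) == 0;
    return *i == 7 && *d == 2.5 && aligned;
}
```

Here the buffer is allocated but holds no live objects until `create` is called, which is a different situation from constructing over an object that already exists.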

So yes - it is legal, and nobody will mind.

3. Overwriting class instance's contents using a placement allocator.

Don't do that.

There's a special construct that handles this type of operation, and it's called a copy constructor.

Piotr Trochim
  • 1. `const` is a language construct. It is another issue that in C++ immutability is usually delivered by a contract (i.e. using a mutable interior object with only giving user access to exterior immutable-like wrapper API). 2. I think you are confusing initialised with allocated. In your example the memory has been allocated but not initialised. 3. How can a copy constructor solve my problem? It would allow me to copy an object, but not reassign an object — that is where one needs assignments operators. And assignment operator is what I want to avoid explicitly. – MrMobster Mar 24 '17 at 11:39
  • 1. const applies to a method or a variable - not an instance of an object. And it establishes and helps to enforce the immutability contract - but it's still just a contract. 2 - you don't need to initialize the memory you're allocating into, it can contain garbage. You need to allocate it though. 3 - I was under impression you were attempting to change an area in memory where an instance resides. instead of overriding it with a new object instance using placement-new, why not simply copy the values ( you have more control over what you can do then ) – Piotr Trochim Mar 24 '17 at 11:42
  • `struct ImmutableBounds { const void* start; const void* end; }` How would you mutate this? But maybe we simply have different understanding of what falls under the notion of contract. – MrMobster Mar 24 '17 at 11:48
  • Now I see what you're getting at - would be good to add that definition to the question. If the creator marked those fields as 'const', he had a good reason not to have them changed. So the real question is - why do you want them mutated? That's the contract I was referring to. And without stripping those 'conts' off the language won't let you modify their value. – Piotr Trochim Mar 24 '17 at 11:49
  • Piotr, thats is exactly the point: I do not want them mutated! And that it what I meant with C++ not properly distinguishing between the variable and the object. I do not want to mutate the object itself, but I do want to bind a different object to the variable. Looking at the `ImmutableBounds` structure, imagine that you have an algorithm that needs to make a selection of which bounds to use (and the only way to do it is iteratively). You don't actually want to change any of the bounds. Traditionally, this is a use case for pointers, only that I want to avoid reference semantics. – MrMobster Mar 24 '17 at 12:43
  • P.S. There are languages that do clearly distinguish variable-level and object-level mutability, e.g. Swift (and I think Rust as well, bit I don't know Rust well enough). – MrMobster Mar 24 '17 at 12:46
  • Good that you mentioned that you considered using pointers - that would be my choice here. I'm under impression that your problem is really with syntax - having to use the */& characters for pointers/references. In C++ that's the way it is - type definitions are really strict, and you need to specify if something's a pointer or not. I don't know Swift nor Rust , but for instance in c# or python, everything's an object, and therefore the language saves you the trouble of extra syntax. In C++ - you have to live with it. – Piotr Trochim Mar 24 '17 at 12:53
  • "_but I do want to bind a different object to the variable_" then you probably want to program in a language that is not C or C++: they have value semantic – curiousguy Mar 25 '17 at 23:43