19

There seems to be some agreement that you can't willy-nilly point an int* into a char array because of the C++ aliasing rules.

From this other question -- Generic char[] based storage and avoiding strict-aliasing related UB -- it seems that it is allowed to (re-)use storage through placement new.

alignas(int) char buf[sizeof(int)];

void f() {
  // turn the memory into an int: (??) from the POV of the abstract machine!
  ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)

  // access storage:
  *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}

So, is the above legal C++ and is the placement new actually needed to make it legal?

Martin Ba
  • Related: http://stackoverflow.com/questions/38862092/is-it-legal-to-alias-a-char-array-through-a-pointer-to-int – Martin Ba Jan 12 '17 at 23:09
  • https://godbolt.org/g/k2nVI9 – Martin Ba Jan 12 '17 at 23:10
  • Highly relevant, potential dupe: https://stackoverflow.com/questions/40873520/reinterpret-cast-creating-a-trivially-default-constructible-object – Baum mit Augen Jan 12 '17 at 23:14
  • suggest changing the wording of the title to make it clear that you mean accessing the char array via an `int` lvalue. You can put an int into a char array by other means, e.g. `memcpy` it in and out. – M.M Jan 13 '17 at 02:20
  • In C (not sure in C++) a pointer to char is excluded from aliasing rules (i.e. it's allowed to alias). To work this in reverse, you normally use a type-punning union with int and char members. Of course, the remaining issue is the alignment, but I assume the underlying architecture allows unaligned access and/or the pointer is properly aligned. – Alexandre Pereira Nunes Jan 26 '17 at 10:42
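The `memcpy` route mentioned in the comments above avoids the aliasing question entirely, because the `int` is only ever read and written as bytes. A minimal sketch, assuming the same kind of buffer as in the question; the names `bytes`, `put_int` and `get_int` are made up for illustration:

```cpp
#include <cstring>  // std::memcpy

// Same shape as the question's buffer; the name is hypothetical.
alignas(int) char bytes[sizeof(int)];

// Copy the object representation of an int into the char array.
void put_int(int value) {
    std::memcpy(bytes, &value, sizeof value);
}

// Copy the bytes back out into a real int object and return it.
int get_int() {
    int value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

No `int` lvalue ever refers to the array here, so neither placement new nor `std::launder` comes into play.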

3 Answers

14

Yes, the placement new is necessary, otherwise you'd violate strict aliasing (assignment is access).

Is the above legal? Almost (although it will work on virtually all implementations). The pointer you've created through the cast does not point to the object, because the (now destroyed) array and the int object are not pointer-interconvertible; use std::launder((int*)buf), or better yet, use the placement new's return value.
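A minimal sketch of the two fixes named above, assuming C++17 (for `std::launder`). The function names are hypothetical; `buf` is declared as in the question:

```cpp
#include <new>  // placement new, std::launder (C++17)

alignas(int) char buf[sizeof(int)];

// Option 1: keep the pointer that placement new returns.
int store_via_new_result() {
    int* p = ::new (buf) int;  // starts the lifetime of an int in buf
    *p = 42;                   // p points to the new int object
    return *p;
}

// Option 2: create the object, then launder the pointer obtained by the cast.
int store_via_launder() {
    ::new (buf) int;
    int* p = std::launder(reinterpret_cast<int*>(buf));
    *p = 42;
    return *p;
}
```

Option 1 is the simpler of the two, since it never needs to recover a pointer to the `int` from a pointer into the array.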

Columbo
  • Strict aliasing isn't violated by `*reinterpret_cast<int*>(buf) = 42` because it accesses the new `int` object as an lvalue of type `int`. Pointer-interconvertible is applicable when converting one "live" pointer to another, not when converting a "dead" pointer-to-non-object to a live one. Edit: I'm not sure whether `launder` is needed there; the rules were still in flux last I knew. – Potatoswatter Jan 13 '17 at 11:38
  • @Potatoswatter Which new int object? I was talking about the case in which there was no prior placement new to create any such object. And I read pointer-interconvertibility as applicable in general when reinterpreting pointers. Regardless, the pointer created by the op using the cast does not point to the new object as defined by P0137, so `launder` should be necessary. – Columbo Jan 13 '17 at 13:11
  • [basic.lval] forbids accessing an object except by lvalues of certain types. It doesn't address the question of whether an object yet exists. [basic.life]/6 in C++14 allows assignment through `reinterpret_cast` to start a lifetime; this is removed by P0137 so C++17 will (likely) make placement-new a requirement. You're right about pointer interconvertibility; P0137 connects it to `reinterpret_cast` via `static_cast` from `void*`. So `launder` is necessary to access the live object (but never `launder` storage to a non-object). – Potatoswatter Jan 13 '17 at 14:25
  • @Potatoswatter I don't think that's right. Assignment never created objects, except when assigning to inactive union members, no? – Columbo Jan 13 '17 at 15:01
  • C++14 [basic.life]/6: "such a glvalue refers to allocated storage (3.7.4.2), and using the properties of the glvalue that do not depend on its value is well-defined." Then, lvalue-to-rvalue conversion is forbidden but write access is not. [basic.life]/1 said that obtaining storage creates an object, which implied that every suitably-aligned memory location was a potentially live object of every type. This is one hole that P0137 is patching. – Potatoswatter Jan 13 '17 at 15:53
  • \* every *trivially-constructible* type – Potatoswatter Jan 13 '17 at 15:58
  • Where is the requirement that the objects are pointer-interconvertible? – curiousguy Jan 13 '17 at 17:23
  • @curiousguy The code obtains a pointer through `reinterpret_cast` (indirectly via the C-style cast). This pointer is only valid if the corresponding pointees are pointer-interconvertible, as elucidated by the cited paragraph. – Columbo Jan 14 '17 at 14:50
  • @curiousguy Your comments seem pretty naive. Optimization works by imposing certain restrictions on the language (which translate to assumptions for an optimizer). If the optimizer has a simple set of rules to determine which objects could be accessed by a routine and which can't be, it can perform [alias optimizations](http://www.compileroptimizations.com/category/alias_optimization_address.htm). – Columbo Jan 14 '17 at 14:56
  • @Columbo Actually, your comment is naive. C/C++ exist. Compilers can't have fun breaking code that uses casts. The language is absolutely not specified, relying on vague wording that doesn't help. Please see my pointers questions http://stackoverflow.com/q/32100245/963864 – curiousguy Jan 14 '17 at 15:00
  • @curiousguy That cited paragraph has been deleted for a reason. The language must be designed with respect to its practical implementation and usage (in this case with a strong focus on optimization), not vice versa. The code that has been broken through this change exploits pointer arithmetic in an unsound way (I have seen discussions about whether or not past-the-end pointers can be used as pointers to the next array's element as far back as last decade). – Columbo Jan 14 '17 at 15:09
  • @Columbo So you are saying that C++ decided to disallow any low level pointer arithmetic. This is craziness. They never had a mandate to break that. – curiousguy Jan 14 '17 at 15:11
  • @curiousguy "They" are compiler developers that act in your interest: making *sound* code run faster. I personally care little about code that employs weird pointer hacks, because it's probably broken in the first place. – Columbo Jan 14 '17 at 15:12
  • @Columbo Can you please describe their "weird" hacks. I don't think so. Compilers shouldn't break valid code. "_If the optimizer has a simple set of rules to determine which objects could be accessed by a routine and which can't be, it can perform alias optimizations._" Well of course, these are distinct variables! – curiousguy Jan 14 '17 at 15:15
  • @Columbo: Compilers are used for a variety of tasks, which require doing different things. An optimization that assumes a program won't do X will be useful for tasks that don't involve doing X, but at best counter-productive for tasks that could be performed most readily by doing X. The fact that the Standard allows implementations intended for tasks that don't involve doing X to assume that programs won't do X doesn't imply any judgment as to whether all implementations should make such assumptions, nor that programs that don't uphold such assumptions are "unsound". – supercat Mar 08 '22 at 17:19
1

This has since changed with the introduction of implicit-lifetime types by P0593R6 (as a defect report, so this applies to all C++ versions).

alignas(int) char buf[sizeof(int)]; starts the lifetime of a char array char[sizeof(int)]. This will also implicitly start the lifetime of an int object that you access in the expression *((int*)buf) = 42.

Since C++17, you also need to launder the pointer: *std::launder((int*)buf) = 42.
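A minimal sketch of what this answer describes, with one labeled deviation: it uses `unsigned char` instead of the question's `char`. That choice rests on my assumption that later wording limits implicit object creation to arrays of `unsigned char` and `std::byte`; check the current draft before relying on plain `char`. The names `storage` and `store_implicit` are hypothetical:

```cpp
#include <new>  // std::launder (C++17)

// Assumption: unsigned char is used so the array can provide storage
// for implicitly created objects under the current wording.
alignas(int) unsigned char storage[sizeof(int)];

int store_implicit() {
    // No placement new here: starting the array's lifetime is taken to
    // have implicitly created the int that the following lines access.
    int* p = std::launder(reinterpret_cast<int*>(storage));
    *p = 42;
    return *p;
}
```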

Artyer
  • "*you also need to launder the pointer*" Note that this would not be necessary if you had used `malloc` or `operator new` directly. – Nicol Bolas Feb 01 '23 at 21:33
  • Will clang and gcc ever use an abstraction model that would be capable of correctly handling the concept of objects having implicit lifetimes? – supercat Feb 05 '23 at 03:21
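For comparison, the `malloc` case from the first comment above might look like the sketch below. It assumes, as that comment states, that `std::malloc` hands back a pointer that can be used for the implicitly created object without `std::launder`; the function name and the `-1` error value are hypothetical:

```cpp
#include <cstdlib>  // std::malloc, std::free

int store_in_malloc() {
    void* raw = std::malloc(sizeof(int));
    if (raw == nullptr) {
        return -1;  // allocation failed; error value chosen for illustration
    }
    // Per the comment above, no std::launder is applied to this pointer.
    int* p = static_cast<int*>(raw);
    *p = 42;
    int result = *p;
    std::free(raw);
    return result;
}
```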
-4
*((int*)buf) = 42;

writes an int with an int lvalue, so there is no aliasing issue in the first place.

curiousguy
  • *no aliasing issue in the first place.* ... except for the original array members in static storage which are definitely of type `char`. The strict-aliasing exception that allows you to point `(unsigned char*)` at the bytes of any other object only goes one way; it doesn't make it legal to point an `int*` at objects that are definitely `char`. (It would be fine if this was anonymous memory, e.g. dynamically allocated with `mmap` or something that returned a `void*`, and was only ever accessed through `char*` and `int*`.) – Peter Cordes Dec 31 '21 at 04:17