I could not find many resources that cover all of the edge cases of the strict aliasing rule. If I understand correctly, in C++ it is UB to access an object of a type that does not exist at a given location (for example, casting a uint32_t* to float* and dereferencing it is illegal because no float object is alive at that address). The only way to start the lifetime of such an object is placement new; a plain assignment does not suffice. This brings me to my question:

// CASE 1
uint64_t arr[8] = {}; //dynamic type of each element now uint64_t?

for(int ii = 0; ii < 8; ++ii){
    new (arr+ii) double(3.14); // dynamic type now double?
}

arr[0] = 10; // access as uint64_t while the dynamic type is double?
             // so this violates strict aliasing, I assume?


// CASE 2
uint64_t* arr = new uint64_t[8]; //dynamic type of each element now uint64_t?

// CASE 3
uint64_t* mal_arr = (uint64_t*)malloc(8*sizeof(uint64_t)); // dynamic type now void? 

mal_arr[0] = 10; // if it is indeed void, this must be illegal, correct?
                                  
arc31
  • _"dynamic type of each element now uint64_t?"_ - what do you mean by "dynamic type"? Each element _is_ a `uint64_t`. It's not dynamic. – Ted Lyngmo Jul 03 '23 at 21:49
  • @tadman placement new does not allocate anything... – arc31 Jul 03 '23 at 21:54
  • @arc31 Your use here is extremely confusing, especially considering what is going on in the rest of this code with all kinds of utterly bizarre things happening. – tadman Jul 03 '23 at 21:54
  • I'm aware that types don't change. I just couldn't find a better word. The type might not change but what the compiler thinks is contained at a memory location however does. – arc31 Jul 03 '23 at 21:56
  • @tadman I don't get what your point here is? The purpose of placement new is type punning. A statement like "The compiler does as instructed" is just wrong when you have UB. – arc31 Jul 03 '23 at 22:02
  • @tadman Umm no. It's not "going to do it": it can do whatever it pleases. Also, reinterpret_cast does not type pun. That would be bit_cast. Something like `int var = 0; float f = *reinterpret_cast<float*>(&var);` is UB – arc31 Jul 03 '23 at 22:07
  • @tadman *"reinterpret_cast because that will fail if it can't do it"* No? It's the same UB either way, it won't give you any extra safety. – HolyBlackCat Jul 03 '23 at 22:08
  • @tadman Will it? I think it only refuses to strip cv-qualifiers. – HolyBlackCat Jul 03 '23 at 22:15
  • @tadman What do you mean by "losing the allocation"? No memory is deallocated. – arc31 Jul 03 '23 at 22:16
  • @tadman What are you talking about? Why in the world would case 1 leak memory? The array is automatically allocated. Once the scope/function is left the stack pointer is simply decremented. I think you are confusing placement new with new. Placement new is simply a call to the constructor, no more. – arc31 Jul 03 '23 at 22:23
  • @tadman It's fine if you find a few lines of code that confusing. But the point of the question is to understand the standard. The question has nothing to do with reinterpret_cast – arc31 Jul 03 '23 at 22:42
  • @tadman OP's terminology is correct, I think. "Dynamic type" isn't only used for polymorphic types. (Though for other types there's no way to determine it from just looking at the memory, similarly to how it's impossible to tell if an object lifetime has ended from looking at the memory.) – HolyBlackCat Jul 03 '23 at 23:07
  • @tadman No, I think it's you who has to start from the ground up. You say things like "overwriting an automatically allocated array will cause a memory leak" and pretend to have some sort of expertise. I made it clear to you multiple times that I know types don't change. I used the word dynamic as a synonym for the C concept of "effective type". I'm not new to C++ by any means. And clearly the people actually answering the question know what the question means. Yet there is disagreement among them on what is UB and what is not. So I think the question is perfectly valid as is. – arc31 Jul 03 '23 at 23:13
  • All I see is a bunch of downvoted and deleted answers, so this question is bringing nothing but conflict and misery. I'm out. – tadman Jul 03 '23 at 23:33
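
As an aside to the reinterpret_cast versus bit_cast point raised in the comments above, here is a minimal sketch of type punning that copies the object representation instead of dereferencing a mismatched pointer. The helper names `float_bits`, `bits_to_float` and `demo` are made up for this illustration, and the code assumes that float is 32 bits wide. It uses `std::bit_cast` when the standard library provides it and falls back to `std::memcpy` otherwise.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>  // std::memcpy
#if __has_include(<bit>)
#include <bit>      // std::bit_cast, when the library provides it
#endif

// float_bits is a hypothetical helper name used only for this illustration.
// It returns the object representation of a float as a 32-bit integer.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "sizes must match");
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<std::uint32_t>(f);
#else
    std::uint32_t out;
    std::memcpy(&out, &f, sizeof out);  // copies bytes, no aliasing pointer involved
    return out;
#endif
}

// The opposite direction, written with memcpy only.
inline float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "sizes must match");
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}

// Small self-check: the value survives a round trip through its bits.
inline void demo() {
    assert(bits_to_float(float_bits(3.14f)) == 3.14f);
}
```

Neither function ever forms a `float*` that points at an integer (or the reverse), so no access through an lvalue of the wrong type takes place.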

1 Answer

Case 1:

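Firstly, I'll assume that alignof(double) <= alignof(uint64_t) (and that a double fits into a uint64_t slot), so that storage aligned for uint64_t is also suitably aligned for double. Otherwise the very first placement-new is UB.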

Next, from what I can tell, arr+ii is UB from the second iteration onward, because + can only move a pointer between elements of the same array, and the lifetime of the array ends once you placement-new the first double on top of it. This is mostly a theoretical issue, though: + should work correctly in practice.

Then arr[0] = 10; is an outright strict aliasing violation, causing UB: it accesses storage that now holds a double through an lvalue of type uint64_t.
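
For comparison, here is a minimal sketch of one way to get the effect of Case 1 without these problems, under the assumption that the storage does not have to be a uint64_t array. It creates the doubles in an unsigned char buffer (which is allowed to provide storage for other objects) and only accesses them through the pointers returned by placement new. The names `kCount` and `fill_and_sum` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <new>  // placement new

constexpr std::size_t kCount = 8;  // hypothetical element count, matching the question

// Hypothetical helper: constructs kCount doubles in a local byte buffer
// and returns their sum.
inline double fill_and_sum(double value) {
    // An unsigned char array may provide storage for other objects,
    // and alignas guarantees suitable alignment for double.
    alignas(double) unsigned char storage[kCount * sizeof(double)];

    double* elems[kCount];
    for (std::size_t i = 0; i < kCount; ++i) {
        unsigned char* slot = storage + i * sizeof(double);
        elems[i] = new (slot) double(value);  // starts the lifetime of a double
        // Placement new returns the address it was given.
        assert(static_cast<void*>(elems[i]) == static_cast<void*>(slot));
    }

    double sum = 0.0;
    for (double* p : elems) {
        sum += *p;  // access only through pointers to the double objects
    }
    return sum;     // double is trivially destructible, so no destructor calls are needed
}
```

For example, `fill_and_sum(1.5)` adds eight copies of 1.5. The buffer is never read or written through any type other than double after the objects are created.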

Case 3:

This used to be UB (the placement-new is missing), but C++20 gave us implicit-lifetime types: malloc and some other operations implicitly create objects of such types, and uint64_t is one of them.
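
A minimal sketch of Case 3 as it reads under the C++20 rules, with the function name `malloc_and_sum` invented for illustration. Scalar types such as uint64_t are implicit-lifetime types, so the plain assignments below are fine without any placement new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>      // std::malloc, std::free
#include <type_traits>

// Hypothetical helper: allocates n integers with malloc, fills them with
// 1..n by plain assignment, and returns their sum (0 if allocation fails).
inline std::uint64_t malloc_and_sum(std::size_t n) {
    // Scalar types are implicit-lifetime types.
    static_assert(std::is_scalar_v<std::uint64_t>, "expected a scalar type");
    assert(n > 0 && "expects a non-empty array");

    auto* p = static_cast<std::uint64_t*>(std::malloc(n * sizeof(std::uint64_t)));
    if (p == nullptr) {
        return 0;
    }

    for (std::size_t i = 0; i < n; ++i) {
        p[i] = i + 1;  // assignment to implicitly created objects
    }

    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += p[i];
    }

    std::free(p);
    return sum;
}
```

The same pattern does not carry over to types with non-trivial constructors or destructors; those still need an explicit placement new.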

HolyBlackCat
  • https://eel.is/c++draft/basic#life-1.5 says "_The lifetime of an object o of type T ends when_": It doesn't say that it stops existing and I thought pointer arithmetic was fine even outside the lifetime. But that's (part of) the language lawyering part I mentioned in my comment under the other answer. – user17732522 Jul 03 '23 at 22:25
  • @user17732522 By "stops existing" I meant "lifetime ends". I've fixed the wording. *"arithmetic was fine"* I'll look it up. – HolyBlackCat Jul 03 '23 at 22:26
  • @user17732522 [`[basic.life]/6`](http://eel.is/c++draft/basic.life#6) says you can only use the pointer *"as if the pointer were of type `void*`"*, which sounds like no arithmetic. Also the wording for [`+`](http://eel.is/c++draft/expr.add#4.2) sounds like it only applies to alive arrays. – HolyBlackCat Jul 03 '23 at 22:32
  • I don't think that works out. I think "_and using the pointer as if the pointer were of type void* is well-defined_" doesn't mean that that's _exclusively_ how it can be used. None of the explicitly listed UB cases apply. If your interpretation was correct, then you couldn't do arithmetic on an array providing storage either, because the lifetime of the array _elements_ is still ended by the storage reuse and so `arr` in `arr + ii` after array-to-pointer decay results in a pointer to the array _element_ outside its lifetime regardless. – user17732522 Jul 03 '23 at 22:43
  • @user17732522 I stand corrected, `[basic.life]/6` doesn't seem to ban it. But I still think that `[expr.add]/4.2` doesn't allow it. *"lifetime of the array elements is still ended"* Ha, this is clever! I think the intent is that not only the array survives providing storage for an other object, but also the individual chars do. (Perhaps because of http://eel.is/c++draft/basic.types.general#4) – HolyBlackCat Jul 03 '23 at 22:54
  • I think ending the lifetime is intended. If it didn't, then reading and writing to it would still be allowed and there isn't really anything in the standard describing how access to such an element object should work, i.e. how it correlates with the value of a nested object. It also doesn't say anywhere that it is related to the object representation (but even if it was, the same issue would apply). – user17732522 Jul 03 '23 at 22:57
  • @HolyBlackCat Are arrays treated as one single object though? If not, then the lifetime of the entire array wouldn't end just because an element was overwritten – arc31 Jul 04 '23 at 00:16
  • @arc31 They are single objects. – HolyBlackCat Jul 04 '23 at 05:46
  • @HolyBlackCat Ok, I think I understand now, thanks. As a related question, do you have any idea how the following could be done? `nonImplicit_t* arr = (nonImplicit_t*)malloc(sizeof(nonImplicit_t)*10)`, then use placement new to construct in place, `for(int ii = 0; ii < 10; ++ii){ new(arr+ii) nonImplicit_t();}`, and then not have to use launder every time an array member is accessed? I mean, since the original pointer may not be used to access the members (only the return value of placement new) – arc31 Jul 04 '23 at 18:24
  • @arc31 You have to launder on every access. Why don't you want to? – HolyBlackCat Jul 04 '23 at 18:41
  • @HolyBlackCat Was just wondering whether the language has a less verbose way of doing something that's relatively common (first allocate array and then construct as needed and then be able to use the array as normal). `std::launder(arr+5)->foo()` is annoying to write out compared to `arr[5].foo()` – arc31 Jul 04 '23 at 18:41
  • @arc31 You should have a wrapper class around this anyway, with RAII to automatically free memory and so on. The wrapper can `std::launder` as needed. Or you could have `std::vector>` with slightly more overhead, but everything working out of the box. – HolyBlackCat Jul 04 '23 at 18:42
  • @HolyBlackCat: Should programming languages exist to serve the needs of programmers, or should programming languages be designed to make programmers jump through gratuitous hoops? Is there any reason a non-garbage dialect of C++ shouldn't provide a means for a programmer to launder the pointer to a sequence of objects once and be done with it? – supercat Jul 05 '23 at 16:41
  • @supercat Presumably "launder once" would interfere with strict aliasing optimizations, or we would already have it. As for hoops, I don't mind them as much when they reside in low-level library code. OP is trying to write low-level library code (perhaps unknowingly). An application-level programmer would use `std::vector>`. – HolyBlackCat Jul 05 '23 at 17:03
  • @HolyBlackCat: Which impedes optimization more: providing a launder-once facility, or having programmers use `-fno-strict-aliasing` because there's no other reliable way to accomplish what needs to be done? The only compilers that wouldn't be able to process a "launder once" facility are those that have never processed the current rules correctly anyhow (except when using `-fno-strict-aliasing`) – supercat Jul 05 '23 at 17:26
  • @HolyBlackCat: If use of a certain directive would guarantee correct semantics, but at the cost of reduced execution efficiency, who should be expected to be best able to judge in any particular situation whether the cost of the efficiency would be more or less acceptable than the cost of reworking the program to avoid needing the directive: the Committee, a compiler writer, or a programmer who is using the compiler? Is there any reason people who oppose such directives because they would "reduce efficiency" shouldn't be recognized as being guilty of premature optimization? – supercat Jul 05 '23 at 17:44
  • @supercat Ok, [here's an example](https://en.cppreference.com/w/cpp/utility/launder#Example) where `std::launder` changes the generated assembly ([godbolt](https://gcc.godbolt.org/z/evc1oe7Yd)). (I would expect those to be easier to come by, but didn't find anything else. Anyhow...) Note that even `-fno-strict-aliasing` doesn't fix it. Would you propose that we shouldn't devirtualize calls on local variables after their address is passed to some opaque function? – HolyBlackCat Jul 05 '23 at 18:31
  • @HolyBlackCat: Non-blittable/non-trivial/standard-layout objects are not subject to implicit creation and destruction, but instead must be created and destroyed at specifically-defined times. What I'm arguing against is the notion that objects that can be implicitly created and destroyed should be viewed as having a lifetime distinct from the storage containing them, as opposed to recognizing that any storage which is owned by a program and isn't used by non-trivial objects should be viewed as simultaneously holding objects of all types that will fit, even if not all are always accessible. – supercat Jul 05 '23 at 20:15
  • @supercat Yeah, I can agree with that. Not sure if it can be problematic in practice or not, I wonder if any compilers can actually break that while optimizing. – HolyBlackCat Jul 05 '23 at 20:19
  • @HolyBlackCat I'm aware that this is low level, but I like the C style sometimes. It's still bewildering that before C++20 the standard was so hostile to completely ordinary C code (malloc and assign). – arc31 Jul 07 '23 at 18:46
  • @arc31 Mhm, arguably it's still broken in many ways regarding low-level memory manipulation. (As in, doesn't match what compilers are actually enforcing.) *"I like the C style"* This will pass. Either once you prove to yourself that you can do it, and/or once you get one of those very difficult to debug memory bugs. Or once you realize you have 10 colleagues to babysit, incapable of writing or maintaining such code. – HolyBlackCat Jul 07 '23 at 18:51
  • @HolyBlackCat I mean, yeah it would be stupid to write such code at work. But for personal projects, I keep coming back to the C way of things especially for data structure/algorithm manipulation. – arc31 Jul 07 '23 at 19:06
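
The wrapper suggested in the comments above could look roughly like the following sketch. `RawArray` and all of its members are hypothetical names, not an existing library type. It assumes the element type needs no more alignment than malloc provides, and it keeps things minimal (fixed capacity, no copying). The point is only that `std::launder` is written once, inside `operator[]`, so that callers can use ordinary indexing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>  // std::malloc, std::free
#include <new>      // placement new, std::launder, std::bad_alloc
#include <string>
#include <utility>  // std::forward

// Hypothetical fixed-capacity array that constructs elements on demand.
template <typename T>
class RawArray {
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "malloc only guarantees fundamental alignment");

public:
    explicit RawArray(std::size_t capacity)
        : data_(static_cast<unsigned char*>(std::malloc(capacity * sizeof(T)))),
          capacity_(capacity) {
        if (data_ == nullptr) throw std::bad_alloc{};
    }

    RawArray(const RawArray&) = delete;
    RawArray& operator=(const RawArray&) = delete;

    ~RawArray() {
        // Destroy constructed elements in reverse order, then release the storage.
        for (std::size_t i = size_; i > 0; --i) (*this)[i - 1].~T();
        std::free(data_);
    }

    // Constructs a new element in the next free slot.
    template <typename... Args>
    T& emplace_back(Args&&... args) {
        assert(size_ < capacity_);
        void* slot = data_ + size_ * sizeof(T);
        T* p = ::new (slot) T(std::forward<Args>(args)...);
        ++size_;
        return *p;
    }

    // The only place that needs std::launder.
    T& operator[](std::size_t i) {
        assert(i < size_);
        return *std::launder(reinterpret_cast<T*>(data_ + i * sizeof(T)));
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    unsigned char* data_;
    std::size_t capacity_;
    std::size_t size_ = 0;
};
```

With this in place, a caller can write `RawArray<std::string> a(3); a.emplace_back("x"); a[0].size();` and the destructor takes care of destroying the elements and freeing the memory.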