
This question follows up on an earlier one.

Let's consider this example code:

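#include <new> // std::launder (C++17)

struct sso {
  union {
    struct {
      char* ptr;
      char size_r[8];
    } large_str;
    char short_str[16];
  };

  bool is_short_str() const {
    // short_str is not necessarily the active member here
    return *std::launder(short_str + 15) == '\0'; // UB?
  }
};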

If short_str is not the active member, dereferencing the pointer without std::launder would be UB. Assume that the ABI is well specified and that we know size_r[7] is at the same address as short_str[15]. Does std::launder(short_str+15) return a pointer to size_r[7] when short_str is not the active member of the union?


Note: I think this is the case because of [ptr.launder]/3:

A byte of storage is reachable through a pointer value that points to an object Y if it is within the storage occupied by Y, an object that is pointer-interconvertible with Y, or the immediately-enclosing array object if Y is an array element.
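
The layout assumption in the question (size_r[7] sharing an address with short_str[15]) can be checked at compile time. The following is only a sketch: `large_rep` and `last_size_byte` are hypothetical names introduced here, and the check assumes that every member of a union starts at offset 0 and that there is no padding between `ptr` and `size_r`, which holds on common ABIs.

```cpp
#include <cstddef>

// Hypothetical named stand-in for the unnamed struct behind large_str.
struct large_rep {
    char* ptr;
    char  size_r[8];
};

// Byte offset of size_r[7] from the start of the struct (and so, by
// assumption, from the start of the union that would contain it).
constexpr std::size_t last_size_byte = offsetof(large_rep, size_r) + 7;

// With 8-byte pointers the offset is 15, the index of the last element of
// short_str[16]. With 4-byte pointers it would be 11, and the assumption
// made in the question would not hold.
static_assert(sizeof(char*) != 8 || last_size_byte == 15,
              "size_r[7] does not line up with short_str[15]");
```

This only establishes that the two bytes share an address; it says nothing about whether a pointer to one may be used to access the other, which is what the question asks.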

timrau
Oliv
  • @AndyG Because the storage associated with short_str is [reachable](https://timsong-cpp.github.io/cppwp/n4659/ptr.launder#3) through the storage associated with `size_r`, and `size_r` is within its lifetime. I am almost sure of this. Unfortunately, I am often wrong when I am sure! – Oliv Jan 10 '18 at 14:21
  • @Oliv: Indeed, `large_str` and `short_str` have a common initial sequence, but initializing all of `large_str` will only initialize 9 bytes, so any access to the 10th byte and beyond of `short_str` is outside of the common initial sequence, and I was under the impression that this entered UB territory (though I'm often wrong, too :-)) – AndyG Jan 10 '18 at 14:39
  • I think the standard snippet you referenced is primarily aimed at usage of placement new in an `aligned_storage` – AndyG Jan 10 '18 at 14:42
  • @AndyG, actually I find that this use case fits well with the intent of std::launder as explained by the C++ standard editor (R. Smith) in https://groups.google.com/a/isocpp.org/forum/#!topic/std-discussion/ko5ceM4szIE – Oliv Jan 10 '18 at 15:36
  • @AndyG: Why would initializing all of `large_str` only initialize 9 bytes? It's a struct with a `char*` followed by `char[8]`; the pointer would be 4-8 bytes (depending on architecture), the `char[8]` eight bytes, so full initialization should initialize 12-16 bytes. – ShadowRanger Jan 10 '18 at 15:42
  • @ShadowRanger: Crap, you're right. I miscalculated the size of `large_str`. – AndyG Jan 10 '18 at 15:43
  • @AndyG, the rule about the common initial sequence does not apply here because it is "descending" (the first member must match, then the second, ...) and does not apply to arrays. See [\[class\]/20](https://timsong-cpp.github.io/cppwp/n4659/class.mem#20). – Oliv Jan 10 '18 at 15:52
  • @AndyG Actually, "active" really means "within its lifetime": [\[class.union\]](https://timsong-cpp.github.io/cppwp/n4659/class.union#1) *In a union, a non-static data member is active if its name refers to an object whose lifetime has begun and has not ended.* "Active" does not mean more than "the living object of a living union". – Oliv Jan 10 '18 at 15:55
  • @Oliv: So it's probably undefined behavior because you're not inspecting a common initial sequence when you access `short_str` if `large_str` is active. – AndyG Jan 10 '18 at 16:49
  • @AndyG Without `std::launder` this is indeed undefined behavior. But with `std::launder`, since `large_str` is within its lifetime and all the storage of `large_str` is reachable through the storage of `short_str` (which is the enclosing array of short_str[0]), it should not be UB. This seems to match exactly the definition of `std::launder` given in the standard. I believe that `launder(&short_str)` returns a pointer to `large_str` if it is the active member. (In practice, at assembly level, I am sure this is exactly what GCC is going to generate.) – Oliv Jan 10 '18 at 16:55
  • @Oliv: I think I follow what you're saying, and your logic makes sense. In my mind, whether the code is valid hinges on whether `short_str` is alive (within its lifetime), and since it does not share a common initial sequence with `large_str`, then if `large_str` is the active member, I would argue that `short_str` is not within its lifetime. – AndyG Jan 10 '18 at 17:22
  • @AndyG So basically, you are saying that std::launder does not change anything. I would like an explanation of why this would be the case. – Oliv Jan 10 '18 at 18:22
  • @Oliv: I don't believe myself to be versed well enough in std::launder yet to provide a definitive answer. It's my speculation that this is undefined behavior because you're accessing the non-active member of a union that doesn't share a common initial sequence with the active member. std::launder is good for informing the compiler that we have a different object in place than before and does not implicitly activate otherwise-inactive members of a union. – AndyG Jan 10 '18 at 18:38
  • @AndyG Activeness implies whether an object is within its lifetime. After staring at [basic.life] for some time, the snippet doesn't seem to violate anything there. – Passer By Jan 11 '18 at 08:10
  • @PasserBy: But only one member of a union can be active at a time, and OP is saying that `short_str` is not active. – AndyG Jan 11 '18 at 12:30
  • @AndyG, this is why I use std::launder: to get a pointer to a living object from a pointer to a dead object. – Oliv Jan 11 '18 at 13:26
  • @Oliv: I don't think that `std::launder` can implicitly activate a member of a union. – AndyG Jan 11 '18 at 13:38
  • @AndyG Neither do I; the point is **not to activate** a union member... This is about getting **access** to an already active object. You probably have a misconception about C++17 pointers: C++17 drastically changed the meaning of pointers. See this [Q&A](https://stackoverflow.com/questions/48062346/is-a-pointer-with-the-right-address-and-type-still-always-a-valid-pointer-since). Actually, `std::launder` can be used to get back some of the pre-C++17 pointer semantics, but in a somewhat restricted fashion. – Oliv Jan 11 '18 at 16:53
  • @AndyG Before C++17, it would have been just fine without std::launder. Because the pointer short_str+15 has the right address and type, it could be used to access size_r[7]. But since C++17 this is no longer the case. Now the compiler actually tracks which object a pointer points to, regardless of whether another object resides at that address. To change the object pointed to by the pointer (even if this does not change the pointer's value and type), one should use std::launder. – Oliv Jan 11 '18 at 17:04
  • @Oliv Why can't we just `return *(reinterpret_cast<const char*>(this) + 15) == '\0';` ? Is the logic behind `bool is_short_str() const` only to check out a byte representation at certain position? – sandthorn Aug 08 '18 at 17:40
  • @sandthorn Indeed, this is not UB. My question aimed at understanding std::launder; my intent was not to find a way to implement the short string optimization without UB. – Oliv Aug 09 '18 at 08:08
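
The last two comments point at inspecting the object representation instead of going through either union member. A minimal sketch of that idea follows, assuming the flag byte lives at offset 15 as in the question. It copies the bytes out with `std::memcpy`, a deliberately conservative variant of the `reinterpret_cast` one-liner above; the named type `large_rep` is hypothetical and only replaces the unnamed struct from the question.

```cpp
#include <cstring>

// Hypothetical named stand-in for the unnamed struct in the question.
struct large_rep {
    char* ptr;
    char  size_r[8];
};

struct sso {
    union {
        large_rep large_str;
        char      short_str[16];
    };

    bool is_short_str() const {
        // Copy the object representation and inspect byte 15 directly,
        // without naming either union member. sso is trivially copyable,
        // so copying its bytes with memcpy is permitted.
        unsigned char bytes[sizeof(sso)];
        std::memcpy(bytes, this, sizeof bytes);
        return bytes[15] == 0;
    }
};
```

Which member is active does not matter for this check, so `std::launder` is not involved at all.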

1 Answer


Let's consider that the ABI is well specified and that we know that size_r[7] is at the same address as short_str[15]

It depends entirely on what that guarantee means exactly.

A compiler is free to guarantee that

Sso.short_str[15]

can be accessed and modified and everything even when Sso.large_str is currently active, and get exactly the semantics you expect.

Or it is free not to give that guarantee.

There is no restriction on the behavior of programs that are ill-formed or exhibit undefined behavior.

As there is no object there, &Sso.short_str[15] isn't pointer-interconvertible with anything. An object that isn't there doesn't have the "same address" as another object.

std::launder is defined in terms of a pointer to a pre-existing object. That object is then destroyed, and a new object with the same address is created (which is well defined). std::launder then lets you take the pointer to the object that no longer exists and get a pointer to the existing object.
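
The pattern described in the previous paragraph can be sketched as follows. This is modeled on the example given in [ptr.launder] of the C++17 draft, where the const member prevents the old pointer from automatically referring to the new object; the function name `replaced_value` is made up for this illustration.

```cpp
#include <new>

struct X {
    const int n;
};

int replaced_value() {
    X* p = new X{3};  // p points to the original object
    new (p) X{5};     // reusing the storage ends the original object's
                      // lifetime and creates a new X at the same address
    // Under the C++17 rules, reading p->n here would be undefined because
    // X::n is const. std::launder(p) yields a pointer to the new object.
    const int c = std::launder(p)->n;
    delete std::launder(p);
    return c;
}
```

Here `p` was obtained while an object existed at that address, which is the situation the answer contrasts with `short_str + 15` formed while `short_str` is inactive.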

What you are doing is not that. If you took &short_str[15] when it was engaged, you'd have a pointer to an object. And the ABI could say that this was at the same address as size_r[7]. And now std::launder would be in the domain of validity.

But the compiler could just go a step further and define that short_str[15] refers to the same object as size_r[7] even if it isn't active.

The weakest ABI guarantee that I could see being consistent with your statement would only work if you took the address of short_str[15] when it was active; later, you would engage large_str, and then you could launder from &short_str[15] to &size_r[7]. The strongest ABI guarantee that is consistent with your statement makes the call to std::launder unnecessary. Somewhere in the middle, std::launder would be required.

Yakk - Adam Nevraumont
  • I would like an argument supporting the paragraph starting with "As there is no object, ..." because [basic.types] does not say that a pointer must be obtained while the object is within its lifetime in order to be *pointer-interconvertible*. – Oliv Sep 10 '18 at 20:51