struct header
{
    int a1;
    int a2;
    // ...;

    std::byte * get_data_bytes()
    {
        return align_up<data>( // make sure alignment requirements are met
                    reinterpret_cast<std::byte *>(this) + sizeof(*this)); 
       // maybe std::launder around the reinterpret_cast (only) is needed?
    }

    data & get_data()
    {
        return *std::launder(reinterpret_cast<data *>(get_data_bytes()));
    }

    void use_data()
    {
        get_data().use();
    }
};

void example()
{
    alignas(header) std::byte storage[/* plenty of space*/]; 
    
    auto h = new (storage) header;
    new (h->get_data_bytes()) data;
    
    h->use_data(); // Does this eventually cause UB?
}

Is this possible without UB? If not, is there an alternative? The requirements are that `data` is not a subobject of `header` and that `header` holds no pointer or reference to `data`, so that no additional indirection is needed. This might be possible with a flexible array member, but I don't think those are part of standard C++.
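The snippet above calls `align_up<data>` without defining it. A minimal sketch of what such a helper might look like follows. The name and signature are taken from the call site, but the body is an assumption. It relies on the implementation-defined (though near-universal) mapping between pointers and `std::uintptr_t`, and it leaves it to the caller to make sure the result still lies inside the same byte array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the question uses align_up<data> but does not show it.
// Returns the first address at or after p that is suitably aligned for T.
// Precondition (not checked): the result is still inside the array p points into.
template <typename T>
std::byte* align_up(std::byte* p)
{
    assert(p != nullptr);
    constexpr std::uintptr_t alignment = alignof(T);
    // Implementation-defined conversion; assumed to yield a flat address.
    const auto address = reinterpret_cast<std::uintptr_t>(p);
    // Number of bytes to skip; 0 when p is already aligned for T.
    const std::uintptr_t padding = (alignment - address % alignment) % alignment;
    return p + padding; // arithmetic on a pointer into a std::byte array
}
```

The padding is computed on the integer value, but the returned pointer is derived from `p` by ordinary pointer arithmetic, so it keeps pointing into the original byte array.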

  • Can you clarify a little more about the type requirements of `data`? I think that you are dancing around aliasing violations here. – AndyG Oct 14 '21 at 17:33
  • @AndyG I intentionally didn't provide any details of `data`, but it's fine to start the discussion with just a trivial type with a few integers inside. – CppNerd13373 Oct 14 '21 at 17:36
  • It's called "shooting yourself in the foot". Take the starting address of the structure, add the offset of the item you want to access, then use a C-style cast to cast the address (pointer) to whatever you want. Completely undefined behavior. Remember that you may have to account for padding in your offset calculation. – Thomas Matthews Oct 14 '21 at 17:51
  • @CppNerd13373 AFAIK no, the issue is that in the generic sense you're trying to do the struct hack. The issue is that, in an array, the compiler can add more padding than the object's `sizeof` suggests. This means that even though the object has a `sizeof` of 12, it could be aligned to 16, etc. There is a CppCon video on it if you're curious – Mgetz Oct 14 '21 at 17:52
  • @ThomasMatthews As much as this example looks like someone would like to shoot themselves in the foot, it is almost everywhere that you have to do things like this when communicating with low level code or hardware. It's very often laid out as a structure with an empty array in the end to be able to access arbitrary data after your header object. If you are sure this is undefined behavior, please can you explain why and provide an alternative? – CppNerd13373 Oct 14 '21 at 17:53
  • I'm not sure about `storage` being `alignas(header)`. Should probably be `alignas(header) alignas(data) std::byte storage[...]` – AndyG Oct 14 '21 at 17:54
  • Does this answer your question? [Are flexible array members valid in C++?](https://stackoverflow.com/questions/4412749/are-flexible-array-members-valid-in-c) – Mgetz Oct 14 '21 at 17:55
  • @Mgetz I think I know this video but would be glad to make sure I am thinking of the exact one. But in any case didn't I align it properly when doing this? – CppNerd13373 Oct 14 '21 at 17:56
  • @CppNerd13373: It's unclear where the UB is supposed to be coming from here. – Nicol Bolas Oct 14 '21 at 17:57
  • @AndyG `alignas(header) alignas(data) std::byte storage[...]` why is it important to `alignas(data)` where storage initially only has to be aligned for `header` when constructing the header at the beginning? The `data` appears after the `header` – CppNerd13373 Oct 14 '21 at 17:57
  • @CppNerd13373 then you'll recall the presenter's conclusion that it's fine for any specific platform you're required to use it on, but shouldn't be considered portable – Mgetz Oct 14 '21 at 17:58
  • @CppNerd13373 If `data` has stricter alignment requirements than `header`, then you won't be able to store a whole number's worth of both `data` and `header` in your storage – AndyG Oct 14 '21 at 17:59
  • @NicolBolas there are a lot of requirements in the standard that I reviewed, and I was not able to confirm whether they lead to UB here or not. For instance, casting to a `std::byte` pointer in `get_data_bytes` is allowed, but is it also allowed to perform pointer arithmetic on it? (i.e., where is the array here?) – CppNerd13373 Oct 14 '21 at 18:00
  • @AndyG I see, but I would like to focus more on the legality rather than optimization – CppNerd13373 Oct 14 '21 at 18:01
  • @CppNerd13373: "*where is the array here*" Have you considered that `storage` is an array of `std::byte`? – Nicol Bolas Oct 14 '21 at 18:01
  • @CppNerd13373: The alignment requirements are not a matter of optimization, but whether you have UB by stepping off the end of your storage in order to allocate a `data` instance. – AndyG Oct 14 '21 at 18:02
  • @NicolBolas I understand that `storage` is the array that is providing storage for header and data, but when I take the address of `header` by using `this`, I get a pointer of type `header` which is pointing to a non-array object. I am not sure that the cast to std::byte means to the compiler that we are now pointing at the enclosing std::byte array which makes the pointer arithmetic legal. – CppNerd13373 Oct 14 '21 at 18:04
  • @AndyG I assume there is plenty of space in storage just to avoid that question, but if that's important we can assume it has enough space just for one type of header and one type of data plus twice the max alignment. – CppNerd13373 Oct 14 '21 at 18:05
  • @Mgetz That target doesn't seem correct to me. OP is not asking about flexible array members specifically. – cigien Oct 14 '21 at 20:24
  • As far as I remember there's a special rule somewhere in the standard saying every object can, as far as pointer arithmetic is concerned, be considered as an array of length 1, so `this + 1` would be a valid one-past-the-end pointer. – Aconcagua Oct 14 '21 at 23:42
  • @Aconcagua I agree, but where does it say that `reinterpret_cast<data *>(this + 1)` or even `reinterpret_cast<std::byte *>(this + 1)` is really pointing to the object adjacent to our header object here? Actually you potentially need another alignment offset in between, so that's why there was a need to convert to `std::byte *` and add that alignment with pointer arithmetic, which in turn can itself be undefined. – CppNerd13373 Oct 15 '21 at 00:15
  • Can you add to your question that you are working with very low level machine code? AFAIK, that would help to better identify the correct solution that wouldn't trigger UB. Changing the title to reflect that would also help. – Braiam Oct 15 '21 at 14:55
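
The size and alignment points raised in the comments can be made concrete with a little arithmetic. The sketch below is only an illustration: the two struct definitions, `round_up`, and the constant names are assumptions, chosen so that `data` has a stricter alignment than `header`. It contrasts a buffer aligned for `header` only, where the padding in front of `data` is not known until run time, with the `alignas(header) alignas(data)` variant suggested above, where the offset of `data` is a compile-time constant.

```cpp
#include <cassert>
#include <cstddef>

struct header { int a1; int a2; };          // hypothetical stand-in
struct alignas(16) data { int values[4]; }; // hypothetical, stricter alignment

// Round n up to the next multiple of a power-of-two alignment.
constexpr std::size_t round_up(std::size_t n, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (n + alignment - 1) & ~(alignment - 1);
}

// Buffer aligned for header only: up to alignof(data) - 1 padding bytes
// may be needed before data, so reserve room for the worst case.
constexpr std::size_t worst_case_size =
    sizeof(header) + (alignof(data) - 1) + sizeof(data);

// Buffer aligned for both types: the offset of data is fixed.
constexpr std::size_t data_offset = round_up(sizeof(header), alignof(data));
constexpr std::size_t exact_size = data_offset + sizeof(data);

static_assert(data_offset % alignof(data) == 0);
static_assert(exact_size <= worst_case_size);

// With several alignas specifiers, the strictest one applies.
alignas(header) alignas(data) std::byte storage[exact_size];
```

Either sizing keeps the placement of `data` inside the buffer; the second one simply trades a stricter buffer alignment for a known layout.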

1 Answer


The only thing that is even hypothetically UB is whether `reinterpret_cast<std::byte *>(this)` yields a pointer to one of the bytes in the array. But you can `std::launder` it to ensure that it does. Indeed, laundering to and from byte pointers is useful enough that you could write template functions for it:

#include <cstddef> // std::byte
#include <new>     // std::launder

template<typename T>
std::byte *to_byte_ptr(T *ptr)
{
  return std::launder(reinterpret_cast<std::byte*>(ptr));
}

template<typename T>
T *from_byte_ptr(std::byte *ptr)
{
  return std::launder(reinterpret_cast<T*>(ptr));
}

Everything else is perfectly fine. You created `header` in storage provided by a byte array, so there are `std::byte` objects there, accessible via `std::launder`. And since those `std::byte` objects are in an array, you can do pointer arithmetic on them. And while pointers to those `std::byte` objects don't actually point to the objects the array provides storage for, if you have a pointer with the same address as the desired object, you can launder it to retrieve a pointer to that object.
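
Putting the pieces together, the approach described above might look like the following sketch. It is an illustration under stated assumptions, not a definitive implementation: the members of `data`, the 64-byte buffer, and `run_example` are made up for the example, and the alignment step is written inline with an implementation-defined conversion to `std::uintptr_t`. It also takes the reading of `std::launder` given above for granted; whether the reachability precondition is actually met for the cast from `this` is exactly what the follow-up comment asks about.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical payload type; the question leaves data unspecified.
struct data
{
    int x;
    int y;
    int sum() const { return x + y; }
};

template <typename T>
std::byte* to_byte_ptr(T* ptr)
{
    return std::launder(reinterpret_cast<std::byte*>(ptr));
}

template <typename T>
T* from_byte_ptr(std::byte* ptr)
{
    return std::launder(reinterpret_cast<T*>(ptr));
}

struct header
{
    int a1;
    int a2;

    std::byte* get_data_bytes()
    {
        // First byte after the header, viewed as an element of the byte array.
        std::byte* end = to_byte_ptr(this) + sizeof(*this);
        // Skip padding so that the result is aligned for data.
        const auto address = reinterpret_cast<std::uintptr_t>(end);
        const std::size_t padding =
            (alignof(data) - address % alignof(data)) % alignof(data);
        return end + padding;
    }

    data& get_data() { return *from_byte_ptr<data>(get_data_bytes()); }
};

int run_example()
{
    alignas(header) std::byte storage[64]; // assumed to be plenty of space
    header* h = new (storage) header{1, 2};
    new (h->get_data_bytes()) data{3, 4};
    // Both types are trivially destructible, so no explicit destructor calls.
    return h->get_data().sum();
}
```

Calling `run_example()` constructs both objects in the buffer and reads `data` back through the header alone, without `header` storing any pointer to it.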

Nicol Bolas
  • That's exactly what I was thinking, so thanks for the insight. However intuitive this sounds, I could not find anything in the standard that confirms that `this` or `this + 1`, once laundered, really does point to an element of the `std::byte` array. Also, when reading the requirements of `std::launder`, I was puzzled by the reachability requirement. Could you walk me through that? – CppNerd13373 Oct 15 '21 at 09:30