11

To work around alignment issues, I need to memcpy into a temporary. What type should that temporary be? GCC complains that the following reinterpret_cast breaks strict-aliasing rules:

template <typename T>
T deserialize(char *ptr) {
    static_assert(std::is_trivially_copyable<T>::value, "must be trivially copyable");
    alignas(T) char raw[sizeof(T)];
    memcpy(raw, ptr, sizeof(T));
    return *reinterpret_cast<T *>(raw);
}

(e.g. when T is "long").

I don't want to define a T, since I don't want to construct a T before overwriting it.

In a union, doesn't writing one member then reading another count as undefined behavior?

template<typename T>
T deserialize(char *ptr) {
    union {
        char arr[sizeof(T)];
        T obj;
    } u;

    memcpy(u.arr, ptr, sizeof(T));   // Write to u.arr
    return u.obj;   // Read from u.obj, even though arr is the active member.
}
Kerrek SB
Martin C. Martin
  • Have some [cereal](https://uscilab.github.io/cereal/) and don't worry about it. – nwp Sep 02 '16 at 14:16
  • I have a [language lawyer answer on pointers and aliasing](http://stackoverflow.com/a/12615861/726300), but as a caveat this is a grey area in the Standard. It's supposed to be improved in the future, but I have no idea in what direction. Your program can be [tweaked](http://coliru.stacked-crooked.com/a/1207492fe8779748) to arguably follow the rules to the letter, but I can't say whether compilers will agree or not. At the very least GCC doesn't complain anymore, but that may just be because we confused its aliasing analysis. Sadly I don't have the time to make a proper answer. – Luc Danton Sep 02 '16 at 15:10
  • @LucDanton: I don't think even [`std::launder`](http://en.cppreference.com/w/cpp/utility/launder) provides the facility the OP desires. – Kerrek SB Sep 02 '16 at 15:53
  • @KerrekSB yeah, that’s for a different set of rules. If or when the Standard decides to be conservative and clamp down on the 'creative' interpretations that the current wording allows, then there is indeed nothing to salvage. – Luc Danton Sep 02 '16 at 19:14

2 Answers

7

What you want is this:

T result;
char * p = reinterpret_cast<char *>(&result);   // or std::addressof(result) !

std::memcpy(p, ptr, sizeof(T));                 // or std::copy!!

return result;

No aliasing violation. If you want a T, you need to have a T. If your type is trivially copyable, then hopefully it is also trivially constructible and there is no cost. In any event, you have to copy the return operand out into the function return value, and that copy is elided, so there's really no extra cost here.
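A minimal, self-contained sketch of the snippet above, assuming `ptr` refers to at least `sizeof(T)` readable bytes. The second `static_assert` and the helper `demo_unaligned_read` are illustrative additions, not part of the original snippet. The assert turns the "hopefully it is also trivially constructible" remark into a compile-time check, so that `T result;` performs no initialization work.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

template <typename T>
T deserialize(const char *ptr) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    // Assumption added for illustration: default construction does nothing.
    static_assert(std::is_trivially_default_constructible<T>::value,
                  "T must be trivially default constructible");

    T result;                              // a real T object exists from here on
    std::memcpy(&result, ptr, sizeof(T));  // overwrite its object representation
    return result;                         // no aliasing through a foreign pointer type
}

// Hypothetical usage: read a long from a deliberately misaligned offset.
inline long demo_unaligned_read() {
    const long original = 0x1234L;
    char buffer[sizeof(long) + 1] = {};
    std::memcpy(buffer + 1, &original, sizeof original);  // bytes start at offset 1
    const long copy = deserialize<long>(buffer + 1);
    assert(copy == original);
    return copy;
}
```

Because the bytes are copied into an object whose declared type is `T`, the source buffer's alignment does not matter and no `reinterpret_cast` to `T *` is needed.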

Kerrek SB
  • The question asked explicitly about doing it without constructing a `T`. – nwp Sep 02 '16 at 14:21
  • Is there any point in casting the address of `result` to `char*`? – eerorika Sep 02 '16 at 14:21
  • @user2079303: Oh, not for `memcpy` I suppose. But that way it works with `std::copy`, too. – Kerrek SB Sep 02 '16 at 14:22
  • I wouldn't say the question is wrong, I'd say the correct answer is "this is not possible". The answers should adapt to the question, not the other way around. – nwp Sep 02 '16 at 14:22
  • Does anyone have a reference to the parts of the standard that disallow both of the options I posted? In particular, I suspect that if one of the union elements is a char array, then you *can* write to that one and access the other one, because char arrays are handled specially, much like they are for pointer casts. But I can't seem to find it in the spec. – Martin C. Martin Sep 02 '16 at 14:25
0

You want to use the std::aligned_storage class template. It was designed to handle this exact problem. Here's a sample solution, with some SFINAE based on the check in your question.

template<class T>
typename std::enable_if<std::is_trivially_copyable<T>::value, T>::type deserialize(const char *data) {
    typename std::aligned_storage<sizeof(T), alignof(T)>::type destination;
    std::memcpy(&destination, data, sizeof(T));
    return reinterpret_cast<T &>(destination);
}
Andrew
  • That's still UB, right? The purpose of `aligned_storage` is a different one. – Kerrek SB Sep 02 '16 at 15:52
  • It shouldn't be. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf – Andrew Sep 02 '16 at 15:55
  • Why would one return a reference to a non-const temporary when the return type is not a reference? `reinterpret_cast<T &>(destination)` -> T – sandthorn Jan 30 '18 at 12:49
  • This causes undefined behaviour. You have not created any `T` objects at any point. (memcpy and reinterpret_cast do not create objects). [See here, including the accepted answer](https://stackoverflow.com/questions/40873520/reinterpret-cast-creating-a-trivially-default-constructible-object) – M.M Aug 02 '18 at 11:36
  • @sandthorn the intent is to initialize the result object by copying from a `T` object at the location of `destination`. (But causes UB since there is no such object) – M.M Aug 02 '18 at 11:39
  • @M.M I guess fortunately we now have a new defense lawyer in C++17 named `std::launder`. If one modifies the return statement to something [*like*](https://godbolt.org/g/eeuGBp) `return *std::launder(reinterpret_cast<T *>(&destination));`, I'm positive that our use case would be in good shape, ***if and only if*** the address of the argument contains an object of type T when it's called. Moreover, the aliasing restriction rules do allow accessing the byte representation of an object, don't they? So `std::memcpy` cannot be considered UB whatsoever; it blindly accesses the byte representation. – sandthorn Aug 02 '18 at 17:43
  • @M.M With `std::launder` magic, perhaps we can even do this one-liner: `return *std::launder(reinterpret_cast<const T *>(data));`. – sandthorn Aug 02 '18 at 17:53
  • @sandthorn `launder` does not get around strict aliasing – M.M Aug 02 '18 at 20:34
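
The objection in the comments above is that neither `memcpy` nor `reinterpret_cast` creates a `T` object. One hedged way to keep a raw aligned buffer and still satisfy that objection is to begin the lifetime of a `T` in the buffer with placement new, and only then copy the bytes in. This is a sketch under stated assumptions, not code proposed by anyone in this thread. The names `deserialize_in_place` and `demo_in_place` are hypothetical, and `T` is assumed to be trivially default constructible, so the `new` expression performs no initialization.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>

template <typename T>
T deserialize_in_place(const char *data) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    static_assert(std::is_trivially_default_constructible<T>::value,
                  "T must be trivially default constructible");

    alignas(T) unsigned char storage[sizeof(T)];

    // Placement new starts the lifetime of a T inside the buffer.
    // Default-initialization of a trivial type writes nothing.
    T *object = ::new (static_cast<void *>(storage)) T;

    std::memcpy(object, data, sizeof(T));  // give the object its value
    return *object;                        // read through the pointer that new returned
}

// Hypothetical usage: the source bytes sit at a misaligned offset.
inline int demo_in_place() {
    const int original = 42;
    char buffer[sizeof(int) + 1] = {};
    std::memcpy(buffer + 1, &original, sizeof original);
    const int copy = deserialize_in_place<int>(buffer + 1);
    assert(copy == original);
    return copy;
}
```

This still technically constructs a `T`, as the first answer says is required. For trivial types, though, it should generate the same code as declaring a local `T` and copying into it.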