
When working with binary input/output, we often see or write code similar to this:

in_file.read(reinterpret_cast<char*>(&obj), sizeof(obj));
out_file.write(reinterpret_cast<const char*>(&obj), sizeof(obj));

Now, is there any guarantee within the standard that these operations do not lead to undefined behavior? A clarification or explanation of why it works would be appreciated.

Rationale for the question:

According to cppreference (I hope I understood it correctly), casting T* to char*, unsigned char* and std::byte* and de-referencing the result should be legal. It explicitly states

AliasedType is std::byte, (since C++17) char, or unsigned char: this permits examination of the object representation of any object as an array of bytes.

Now what is not clear to me is that the sizeof(obj) part, as well as "examination of the object representation as an array of bytes", implies that there is pointer arithmetic involved, which raises the question: is that pointer arithmetic permitted by the standard? (In the case of binary I/O, I haven't looked at the actual implementations, as they are a bit unreadable to me at this point. If an alternative method is used behind the scenes, feel free to clarify.)
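For comparison, one way to avoid asking the pointer-arithmetic question at all is to copy the bytes with std::memcpy, which the standard allows for trivially copyable types (copying into a char/unsigned char buffer and back preserves the value). The following is only a minimal sketch: `Sample`, `to_bytes`, `from_bytes` and `round_trip_ok` are hypothetical names made up for illustration, not anything from the standard library.

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Hypothetical example type; any trivially copyable type would do.
struct Sample {
    int id;
    double value;
};

// Copy the object representation of obj into a byte buffer.
template <class T>
std::array<unsigned char, sizeof(T)> to_bytes(const T& obj) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "byte-wise copy is only guaranteed for trivially copyable types");
    std::array<unsigned char, sizeof(T)> bytes{};
    std::memcpy(bytes.data(), &obj, sizeof(T));
    return bytes;
}

// Rebuild an object from a byte buffer produced by to_bytes.
template <class T>
T from_bytes(const std::array<unsigned char, sizeof(T)>& bytes) {
    static_assert(std::is_trivially_copyable_v<T>,
                  "byte-wise copy is only guaranteed for trivially copyable types");
    static_assert(std::is_default_constructible_v<T>,
                  "this sketch needs a default-constructed target object");
    T obj{};
    std::memcpy(&obj, bytes.data(), sizeof(T));
    return obj;
}

// Usage sketch: round-trip a value through the byte buffer.
inline bool round_trip_ok() {
    const Sample original{42, 3.5};
    const Sample copy = from_bytes<Sample>(to_bytes(original));
    return copy.id == original.id && copy.value == original.value;
}
```

The static_assert makes the trivially-copyable assumption explicit, so a type with virtual functions or a non-trivial copy constructor fails to compile instead of silently misbehaving.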

In case it does lead to undefined behavior, should bit_cast to std::array be preferred when doing binary IO?

I mean something like this:

auto binary_rep = std::bit_cast<std::array<const char, sizeof(obj)>>(obj);
out_file.write(&binary_rep[0], sizeof(obj));

for writing and

std::array<char, sizeof(obj)> binary_rep;
in_file.read(&binary_rep[0], sizeof(obj));
obj = std::bit_cast<decltype(obj)>(binary_rep);

for reading?

  • This won't work as soon as you have a pointer (or container, etc.) in `obj`. – lorro Jul 24 '22 at 21:02
  • It *can* lead to undefined behaviour, depending on the type of `obj` (e.g. is it a trivial type, etc?). The layout of non-trivial class types (e.g. the class has non-trivial constructor/destructor, virtual functions, etc) is implementation defined and objects written using your approach aren't guaranteed to be read in correctly (particularly if the code that reads the object is built with a different compiler than the code which writes the object). If the object contains pointers, writing and reading the pointer in different processes doesn't write/read the pointed-to data. – Peter Jul 24 '22 at 21:04
  • you mean `bit_cast` part? I think `reinterpret_cast` has the same flaw. – Myrddin Krustowski Jul 24 '22 at 21:04
  • Your code is fine. The pointer arithmetic might not be [formally allowed](https://stackoverflow.com/a/62341088/2752075) here, but this is a standard defect, nothing more. – HolyBlackCat Jul 24 '22 at 21:06
  • It takes very little effort to write proper serialization functions for your class, that, if your type is a simple type, will get properly optimized by the compiler into the code you would expect, but also has the benefit of making your code easier to maintain. – Taekahn Jul 24 '22 at 22:00
  • Casting an object pointer to a `char*` pointer for the purpose of accessing the object's raw bytes is well-defined behavior in the C++ standard. On the other hand, performing I/O on an object that is not a trivial type with standard layout is undefined behavior. – Remy Lebeau Jul 25 '22 at 02:37
  • @RemyLebeau - would you mind giving me a link? I could not find this part in the standard... – Myrddin Krustowski Jul 25 '22 at 04:39
