Let's consider the following (simplified) code for reading the contents of a binary file:
struct Header
{
    char signature[8];
    uint32_t version;
    uint32_t numberOfSomeChunks;
    uint32_t numberOfSomeOtherChunks;
};

void readFile(std::istream& stream)
{
    // find total size of the file, in bytes:
    stream.seekg(0, std::ios::end);
    const std::size_t totalSize = stream.tellg();

    // allocate enough memory and read entire file
    std::unique_ptr<std::byte[]> fileBuf = std::make_unique<std::byte[]>(totalSize);
    stream.seekg(0);
    stream.read(reinterpret_cast<char*>(fileBuf.get()), totalSize);

    // get the header and do something with it:
    const Header* hdr = reinterpret_cast<const Header*>(fileBuf.get());
    if(hdr->version != expectedVersion) // <- Potential UB?
    {
        // report the error
    }

    // and so on...
}
The way I see this, the following line:
if(hdr->version != expectedVersion) // <- Potential UB?
contains undefined behavior: we're reading the version member of type uint32_t, which is overlaid on top of an array of std::byte objects, and the compiler is free to assume that a uint32_t object does not alias anything else.
The question is: is my interpretation correct? If yes, what can be done to fix this code? If not, why is there no UB here?
Note 1: I understand the purpose of the strict aliasing rule (allowing the compiler to avoid unnecessary loads from memory). Also, I know that in this case using std::memcpy would be a safe solution, but using std::memcpy would mean that we have to do additional memory allocations (on the stack, or on the heap if the size of an object is not known).
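For reference, the std::memcpy approach mentioned in Note 1 could be sketched roughly as follows. This is only an illustration under a few assumptions: copyHeader is a hypothetical helper name that does not appear in the code above, and the copy lands in a separate Header object (here supplied by the caller, typically on the stack), which is the extra storage the note refers to.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Header
{
    char signature[8];
    std::uint32_t version;
    std::uint32_t numberOfSomeChunks;
    std::uint32_t numberOfSomeOtherChunks;
};

// Hypothetical helper (not part of the original code): copies the header
// bytes out of the raw buffer into a real Header object instead of
// reinterpreting the buffer in place.
// Returns false if the buffer is too small to contain a Header.
bool copyHeader(const std::byte* buf, std::size_t size, Header& out)
{
    if (size < sizeof(Header))
        return false;

    // Header is trivially copyable, so copying its bytes with memcpy
    // into an existing Header object is well-defined.
    std::memcpy(&out, buf, sizeof(Header));
    return true;
}
```

Inside readFile, this would be used as something like `Header hdr; if (copyHeader(fileBuf.get(), totalSize, hdr) && hdr.version != expectedVersion) { ... }`, so the comparison reads from an actual Header object rather than through a reinterpreted pointer.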