We are initializing (large) arrays of trivially_copyable objects from secondary storage, and questions such as this or this leave us with little confidence in our implemented approach.
Below is a minimal example that illustrates the "worrying" parts of the code. It is also available on Godbolt.
Example
Let's have a trivially_copyable but not default_constructible user type:
struct Foo
{
    Foo(double a, double b) :
        alpha{a},
        beta{b}
    {}

    double alpha;
    double beta;
};
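The two properties claimed for Foo can be checked at compile time. The following is a minimal sketch that repeats the definition above and adds static_assert checks; the assertion messages are our own wording, and the standard-layout check is an extra we added for illustration.

```cpp
#include <type_traits>

struct Foo
{
    Foo(double a, double b) :
        alpha{a},
        beta{b}
    {}

    double alpha;
    double beta;
};

// No user-provided copy/move operations or destructor, so Foo is trivially copyable.
static_assert(std::is_trivially_copyable<Foo>::value,
              "Foo is expected to be trivially copyable");

// The user-declared two-argument constructor suppresses the implicit default constructor.
static_assert(!std::is_default_constructible<Foo>::value,
              "Foo is expected not to be default constructible");

// Extra check (an addition for illustration): all members are public and non-virtual.
static_assert(std::is_standard_layout<Foo>::value,
              "Foo is expected to be standard layout");
```

If any of these assumptions were wrong, the translation unit would fail to compile rather than misbehave at run time.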
Trusting cppreference:
Objects of trivially-copyable types that are not potentially-overlapping subobjects are the only C++ objects that may be safely copied with std::memcpy or serialized to/from binary files with std::ofstream::write()/std::ifstream::read().
Now, we want to read a binary file into a dynamic array of Foo. Since Foo is not default constructible, we cannot simply:
std::unique_ptr<Foo[]> invalid{new Foo[dynamicSize]}; // Error, no default ctor
Alternative (A)
Use an uninitialized unsigned char array as storage.
std::unique_ptr<unsigned char[]> storage{
    new unsigned char[dynamicSize * sizeof(Foo)] };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << reinterpret_cast<Foo *>(storage.get())[index].alpha << "\n";
Is this undefined behaviour, given that objects of actual type Foo are never explicitly created in storage?
Alternative (B)
The storage is explicitly typed as an array of Foo.
std::unique_ptr<Foo[]> storage{
    static_cast<Foo *>(::operator new[](dynamicSize * sizeof(Foo))) };
input.read(reinterpret_cast<char *>(storage.get()), dynamicSize * sizeof(Foo));
std::cout << storage[index].alpha << "\n";
This alternative was inspired by this post. Yet, is it better defined? There still seems to be no explicit creation of objects of type Foo.
It notably gets rid of the reinterpret_cast when accessing the Foo data member (this cast might have violated the type aliasing rule).
Overall Questions
Is either of these alternatives well defined by the standard? Do they actually differ?
- If not, is there a correct way to implement this (without first initializing all Foo instances to values that will be discarded immediately afterwards)?
Does the undefined behaviour differ between versions of the C++ standard? (In particular, please see this comment with regard to C++20.)