5

When implementing certain data structures in C++ one needs to be able to create an array that has uninitialized elements. Because of that, having

buffer = new T[capacity];

is not suitable, as new T[capacity] initializes the array elements, which is not always possible (if T does not have a default constructor) or desired (as constructing objects might take time). The typical solution is to allocate memory and use placement new.

For that, if the number of elements is known (or at least we have an upper bound) and we allocate on the stack, then, as far as I am aware, one can use an aligned array of bytes or chars, and then use std::launder to access the members.

alignas(T) std::byte buffer[sizeof(T) * capacity];
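
To make the stack case concrete, here is a minimal sketch, assuming C++17 and std::string as the element type. The function name stack_buffer_demo and the capacity of 4 are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper: constructs two strings inside an aligned byte buffer
// on the stack and returns their combined length.
inline std::size_t stack_buffer_demo() {
    using T = std::string;
    constexpr std::size_t capacity = 4;

    // Raw storage only: no T object exists in it yet.
    alignas(T) std::byte buffer[sizeof(T) * capacity];

    // Placement new starts the lifetime of a T in slot 0 and in slot 1.
    T* first = ::new (static_cast<void*>(buffer)) T("hello");
    T* second = ::new (static_cast<void*>(buffer + sizeof(T))) T("world!");

    // A T* re-derived from the byte buffer goes through std::launder.
    T* again = std::launder(reinterpret_cast<T*>(buffer + sizeof(T)));
    assert(again == second);

    std::size_t total = first->size() + again->size();

    // The buffer does not know about the elements, so destroy them by hand.
    second->~T();
    first->~T();
    return total;
}
```

Only the slots that were actually constructed are destroyed; the remaining two slots stay raw bytes for the whole lifetime of the buffer.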

However, this solves the problem only for stack allocations, not for heap allocations. For those, I assume one needs to use aligned new, and write something like this:

auto memory =  ::operator new(sizeof(T) * capacity, std::align_val_t{alignof(T)});

and then cast it either to std::byte* or unsigned char* or T*.

// not sure what the right type for reinterpret cast should be
buffer = reinterpret_cast<pointer_type>(memory);

However, there are several things that are not clear to me.

  1. The result of reinterpret_cast<T*>(ptr) is defined if ptr points to an object that is pointer-interconvertible with T (see this answer or https://eel.is/c++draft/basic.types#basic.compound-3 for more detail). I assume that converting it to T* is not valid, as T is not necessarily pointer-interconvertible with the result of new. However, is it well defined for char* or std::byte*?
  2. When converting the result of new to a valid pointer type (assuming it is not implementation defined), is it treated as a pointer to first element of array, or just a pointer to a single object? While, as far as I know, it rarely (if at all) matters in practice, there is a semantic difference: an expression of the form pointer_type + integer is well defined only if the pointed-to element is an array member and the result of the arithmetic points to another element of the same array or one past its end (see https://eel.is/c++draft/expr.add#4).
  3. As far as lifetimes are concerned, an object of type array of unsigned char or std::byte can provide storage for the result of placement new (https://eel.is/c++draft/basic.memobj#intro.object-3); however, is this defined for arrays of other types?
  4. As far as I know, T::operator new and T::operator new[] expressions call ::operator new or ::operator new[] behind the scenes. Since the result of the built-in new is void*, how is the conversion to the right type done? Is this implementation-defined, or are there well-defined rules for it?
  5. When freeing the memory, should one use
::operator delete(static_cast<void*>(buffer), sizeof(T) * capacity, std::align_val_t{alignof(T)});

or is there another way?

PS: I'd probably use the standard library for these purposes in real code, however I try to understand how things work behind the scenes.

Thanks.

  • "*as `new T[]` initializes the array elements*" No, it doesn't. `new T[]()` would, but not `new T[]`. I mean, it will default initialize them, so if a default constructor exists, it will be called. But if `T` is a trivial type, it will be left uninitialized. So what exactly do you mean by "uninitialized" here? Do you mean that there are no actual `T`s, or do you want `T`s to exist but have uninitialized values? – Nicol Bolas Mar 14 '21 at 15:28
  • I am interested in having space for instances of T without constructing them. Since they might be destructed later, then 'no actual T' is the correct term. I corrected the `new T` statement. – Myrddin Krustowski Mar 14 '21 at 19:14
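
The distinction discussed in these comments can be sketched as follows, assuming C++17. The type Counted and the three helper functions are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical type that counts how often its default constructor runs.
struct Counted {
    static inline int constructed = 0;
    Counted() { ++constructed; }
};

// Number of constructor calls made by `new Counted[n]` (default-initialization).
inline int constructions_by_array_new(std::size_t n) {
    Counted::constructed = 0;
    Counted* p = new Counted[n];
    int calls = Counted::constructed;
    delete[] p;
    return calls;
}

// Number of constructor calls made by a raw allocation of the same size.
inline int constructions_by_raw_allocation(std::size_t n) {
    Counted::constructed = 0;
    void* raw = ::operator new(sizeof(Counted) * n);
    int calls = Counted::constructed;  // no Counted object exists in `raw`
    ::operator delete(raw);
    return calls;
}

// `new int[n]()` value-initializes, so every element is zero.
// Plain `new int[n]` would leave the values indeterminate instead.
inline bool value_init_zeroes(std::size_t n) {
    int* p = new int[n]();
    bool all_zero = true;
    for (std::size_t i = 0; i < n; ++i) {
        all_zero = all_zero && (p[i] == 0);
    }
    delete[] p;
    return all_zero;
}
```

The raw allocation is the "no actual T" case: it reserves space but runs no constructor, so nothing needs to be destroyed before the memory is released.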

2 Answers

1

pointer-interconvertibility

Regarding pointer-interconvertibility, it doesn't matter if you use T * or {[unsigned] char|std::byte} *. You will have to cast it to T * to use it anyway.

Note that you must call std::launder (on the result of the cast) to access the pointed T objects. The only exception is the placement-new call that creates the objects, because they don't exist yet. The manual destructor call is not an exception.

The lack of pointer-interconvertibility would only be a problem if you didn't use std::launder.

When converting the result of new to a valid pointer type (assuming it is not implementation defined), is it treated as a pointer to first element of array, or just a pointer to a single object?

If you want to be extra safe, store the pointer as {[unsigned] char|std::byte} * and reinterpret_cast it after performing any pointer arithmetic.
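
A minimal sketch of that approach, assuming C++17. The helpers slot_address and element_at are hypothetical names, and int is used as the element type only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper: address of the i-th T-sized slot. The arithmetic is
// done on std::byte*, i.e. inside the byte array itself.
template <class T>
void* slot_address(std::byte* storage, std::size_t i) {
    return storage + i * sizeof(T);
}

// Hypothetical helper: pointer to a T that was already constructed in slot i.
// The cast happens only after the arithmetic, followed by std::launder.
template <class T>
T* element_at(std::byte* storage, std::size_t i) {
    return std::launder(reinterpret_cast<T*>(storage + i * sizeof(T)));
}

inline int byte_arithmetic_demo() {
    alignas(int) std::byte storage[sizeof(int) * 4];
    ::new (slot_address<int>(storage, 2)) int(42);  // construct in slot 2 only
    return *element_at<int>(storage, 2);
}
```

No T* is ever incremented here, so the question of whether the T objects form an array does not come up.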

an object of type array of unsigned char or std::byte can provide storage for the result of placement new

The standard doesn't say anywhere that "providing storage" is required for placement-new to work. I think this term is defined solely to be used in definitions of other terms in the standard.

Consider [basic.life]/example-2 where operator= uses placement-new to reconstruct an object in place, even though type T doesn't "provide storage" for the same type T.
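
A sketch modeled on that pattern (not a verbatim copy of the standard's example), assuming a simple class with no const or reference members. The names C and reconstruct_demo are chosen for illustration.

```cpp
#include <cassert>
#include <new>

// Hypothetical class whose assignment operator rebuilds the object in place.
struct C {
    int i;
    explicit C(int v) : i(v) {}
    C(const C&) = default;

    C& operator=(const C& other) {
        if (this != &other) {
            this->~C();                                 // end the old object's lifetime
            ::new (static_cast<void*>(this)) C(other);  // new C in the same storage
        }
        return *this;
    }
};

inline int reconstruct_demo() {
    C a(1);
    C b(2);
    a = b;       // a is destroyed and re-created as a copy of b
    return a.i;
}
```

Here the storage comes from the old C object itself, not from an array of unsigned char or std::byte.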

Since the result of the built-in new is void*, how is the conversion to the right type done?

Not sure what the standard has to say about it, but what else can it be other than reinterpret_cast?

freeing the memory

Your approach looks correct, but I think you don't have to pass the size.
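
For illustration, a minimal sketch of a matching allocate/free pair, assuming C++17 and using the aligned deallocation overload that takes no size. RawBuffer and heap_buffer_demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical RAII wrapper around raw, suitably aligned heap storage
// for `capacity` objects of type T. It never constructs or destroys a T.
template <class T>
class RawBuffer {
public:
    explicit RawBuffer(std::size_t capacity)
        : memory_(::operator new(sizeof(T) * capacity,
                                 std::align_val_t{alignof(T)})) {}

    RawBuffer(const RawBuffer&) = delete;
    RawBuffer& operator=(const RawBuffer&) = delete;

    ~RawBuffer() {
        // Same alignment as in the allocation call; the size is not passed.
        ::operator delete(memory_, std::align_val_t{alignof(T)});
    }

    void* data() const { return memory_; }

private:
    void* memory_;
};

inline std::size_t heap_buffer_demo() {
    RawBuffer<std::string> raw(8);
    std::string* s = ::new (raw.data()) std::string("abc");  // slot 0 only
    std::size_t n = s->size();
    std::destroy_at(s);  // elements are destroyed before the storage is freed
    return n;
}
```

The alignment passed to ::operator delete has to be the one used for the allocation, which is why both calls spell it as std::align_val_t{alignof(T)}.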

HolyBlackCat
  • I agree that I need `std::launder` on the result, but theoretical point is pointer arithmetic. – Myrddin Krustowski Mar 14 '21 at 19:33
  • I am not sure that I interpret this correctly, but new(... ) implicitly constructs an array of `byte` or `char` while it does not implicitly construct an array of `T`. As for 'providing storage', my wording is confusing. What I mean is that using placement new or explicit destruction will not "mess" an array of unsigned ints, while it might "mess" an array of `T`. – Myrddin Krustowski Mar 14 '21 at 19:34
  • After calling `(TArray + i)->~T()` I have an array in invalid state, which is not an array any longer, hence pointer arithmetic theoretically is not defined, and calculations for position of placement new are possibly not defined. – Myrddin Krustowski Mar 14 '21 at 19:37
  • @RazielMagius That's why I suggest doing arithmetic on `char *` instead. – HolyBlackCat Mar 14 '21 at 19:37
  • "Not sure what the standard has to say about it, but what else can it be other than reinterpret_cast." A very clumsy solution would be to allocate `char` array at the location created by operator `new` as it would guarantee the proper alignment, but I am sure it is not the way it is supposed to be. – Myrddin Krustowski Mar 14 '21 at 19:49
  • there are two delete operators that take alignment as parameter, one with size and another one without. Since I used size when allocating using new, I thought that correct way to delete memory would also require providing size? – Myrddin Krustowski Mar 14 '21 at 19:51
  • @RazielMagius *"allocate char array at the location created by operator new as it would guarantee the proper alignment"* What do you mean by "allocating an array" there, and what does it have to do with alignment? If the type is overaligned, the overload of the allocation function with the alignment parameter is used, so the returned pointer must already be properly aligned. – HolyBlackCat Mar 14 '21 at 19:54
  • @RazielMagius *"Since I used size when allocating"* There's no way to not use it. *"thought that correct way to delete memory would also require providing size"* Cppreference is not entirely clear on this, but it says that `delete`-expression calls the size-less overload, so you doing it should be fine too. – HolyBlackCat Mar 14 '21 at 19:55
  • If I am writing a generic collection, it is right assumption that I do not know anything about alignment. I am not sure, but `::operator new(sizeof(T) * capacity, std::align_val_t{alignof(T)})` is supposed to work both for overaligned and "normal" types? I did provide both size and alignment when allocating, what do you mean by _"There is no way to use it"_? – Myrddin Krustowski Mar 14 '21 at 20:06
  • @RazielMagius Yes, it should work for non-overaligned types too. *"what do you mean by ..."* I said there's no way to **not** use it. How would you allocate memory without specifying size? – HolyBlackCat Mar 14 '21 at 20:43
  • Missed the **not** part, my bad. – Myrddin Krustowski Mar 14 '21 at 22:11
0

I think your premise may be incorrect. If T is a class the default constructor should be called. However that can be blank and if your class contains all POD (plain old data) then nothing will be initialized. I actually count on this all the time because I often don't want things initialized for performance reasons.

I believe there are a few caveats with this for global data and so forth, where some things are zero-initialized. But in general heap memory isn't. You can test it and you will find there's a bunch of garbage in memory, at least when compiled in release mode. Some compilers will initialize memory in debug mode, but that's done outside constructors.

For instance you can set data in a custom placement new function and if it's POD it will still be there in the constructor. Some people will argue this is UB but I think the standard says "nothing is done" for POD, which implies no initialization.