
Arrays of any type are implicit-lifetime objects, and it is possible to begin the lifetime of an implicit-lifetime object without beginning the lifetime of its subobjects.

As far as I am aware, the ability to create arrays without beginning the lifetime of their elements, in a way that doesn't result in UB, was one of the motivations for implicit-lifetime objects; see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html.

Now, what is the proper way to do it? Is allocating memory and returning a pointer to an array enough? Or is there something else one needs to be aware of?

Namely, is this code valid and does it create an array with uninitialized members, or do we still have UB?

// implicitly creates an array of size n and returns a pointer to it
auto arrPtr = reinterpret_cast<T(*)[]>(::operator new(sizeof(T) * n, std::align_val_t{alignof(T)}) );
// is there a difference between reinterpret_cast<T(*)[]> and reinterpret_cast<T(*)[n]>?
auto arr = *arrPtr; // de-reference of the result in previous line.

The question can be restated as follows.

According to https://en.cppreference.com/w/cpp/memory/allocator/allocate, the allocate function creates an array of type T[n] in the storage and starts its lifetime, but does not start the lifetime of any of its elements.

A simple question: how is it done? (I am ignoring the constexpr part, but I wouldn't mind if the constexpr part is explained in the answer as well.)

PS: The provided code is valid (assuming it is correct) for c++20, but not for earlier standards as far as I am aware.

I believe that an answer to this question should answer two similar questions I have asked earlier as well.

  1. Arrays and implicit-lifetime object creation.
  2. Is it possible to allocate an uninitialized array in a way that does not result in UB.

EDIT: I am adding a few code snippets to make my question clearer. I would appreciate an answer explaining which ones are valid and which ones are not.

PS: feel free to replace malloc with aligned version, or ::operator new variation. As far as I am aware it doesn't matter.

Example #1

T* allocate_array(std::size_t n)
{
    return reinterpret_cast<T*>( malloc(sizeof(T) * n) ); 
    // does it return an implicitly constructed array (as long as 
    // subsequent usage is valid) or a T* pointer that does not "point"
    // to a T object that was constructed, hence UB
    // Edit: if we take n = 1 in this example, and T is not an implicit-lifetime 
    // type, then we have a pointer to an object that has not yet been
    // constructed and doesn't have implicit lifetime - which is bad
}

Example #2.

T* allocate_array(std::size_t n)
{
    // malloc implicitly constructs - reinterpret_cast should return a pointer to a
    // suitably created object (a T array), hence, no UB here. 
    T (*array_pointer)[] = reinterpret_cast<T(*)[]>(malloc(sizeof(T) * n));
    // The pointer in the previous line is a pointer to valid array, de-reference
    // is supposed to give me that array
    T* array = *array_pointer;
    return array;
}

Example #3 - same as 2 but size of array is known.

T* allocate_array(std::size_t n)
{
    // malloc implicitly constructs - reinterpret_cast should return a pointer to a
    // suitably created object (a T array), hence, no UB here. 
    // (for the type T(*)[n] to compile, n would have to be a compile-time constant)
    T (*n_array_pointer)[n] = reinterpret_cast<T(*)[n]>(malloc(sizeof(T) * n));
    // The pointer in the previous line is a pointer to valid array, de-reference
    // is supposed to give me that array
    T* n_array = *n_array_pointer;
    return n_array;
}

Are any of these valid?


The answer

While the wording of the standard is not 100% clear, after reading the paper more carefully, the motivation is to make casts to T* legal and not casts to T(*)[] (see "Dynamic construction of arrays" in the paper). Also, the changes to the standard made by the authors of the paper imply that the cast should be to T* and not to T(*)[]. Hence, I am accepting the answer by Nicol Bolas as the correct answer to my question.

  • I see C++ continuously drifts from simple to the land of WTF. – user14063792468 Mar 21 '21 at 12:37
  • Yes. First, almost everything related to pointer arithmetic and casts, turns out to be UB, breaking the simple model of "address in memory" (even though, almost all implementations work with this model, if not all of them). Then, rules upon rules are added to work around these limitations... – Myrddin Krustowski Mar 21 '21 at 13:03
  • I wonder what caused this change. What I mean is that a computer still follows some standard model of linear memory + processing. Can you give a link to the change in pointer arithmetic? – user14063792468 Mar 21 '21 at 13:30
  • @user14063792468: The "change" he's talking about has existed since C++03. It's not new. Pointer arithmetic is defined only in the context of arrays of objects (with a single live object being counted as a 1-element array). If you just allocated some memory, there aren't any objects in it, so you can't just do pointer arithmetic on it. – Nicol Bolas Mar 21 '21 at 13:35
  • @NicolBolas Should I quit `C++`? I got some codebase with address arithmetic ;) – user14063792468 Mar 21 '21 at 16:37
  • @user14063792468: You are in a comment thread discussing one of the rules that probably makes your code OK. Also, if you haven't quit C++ in the nearly 20 years since the notions in question were added, I don't know why you'd suddenly decide to now. – Nicol Bolas Mar 21 '21 at 16:58
  • Basically, one needs pointer arithmetic within allocated memory and pointer casting. These should not be UB in the first place, at least when the usage is sane. What we end up with is extra ceremony to do the same thing, which does not make the code safer either: if you converted pointers in an old code base and it worked as intended, suddenly you need to launder them in many cases. And if it did not, then you still end up with the same UB. It is not like these errors now get caught at compile time or run time. Maybe I am missing something, but it seems this way. – Myrddin Krustowski Mar 21 '21 at 18:18
  • "*Arrays of any type are implicit-lifetime objects*". No, an array is an implicit-lifetime object only if its elements are implicit-type ([basic.life/1](https://eel.is/c++draft/basic.life#1)). "*we still have UB*" If `T` is not an implicit-lifetime type, then it's UB. Otherwise I'll leave it to the language lawyers who have a copy of the actual C++20 standard handy. For more background about the linked P0593R6, see [this](https://stackoverflow.com/questions/8720425/) SO question, and this [not-a-defect](https://cplusplus.github.io/EWG/ewg-closed.html#68) note. – dxiv Mar 21 '21 at 18:19
  • @dxiv - Arrays are implicit-lifetime objects. https://eel.is/c++draft/basic.types "*Scalar types, implicit-lifetime class types ([class.prop]), array types, and cv-qualified versions of these types are collectively called implicit-lifetime types*". It says array types, and says nothing about _vacuous initialization_. The notion of implicit-lifetime is new to the C++20 standard, while _vacuous initialization_ is not. They are not the same. Note that an implicit-lifetime object (an array) can have subobjects that are not implicit-lifetime objects https://eel.is/c++draft/intro.object#note-3. – Myrddin Krustowski Mar 21 '21 at 18:32
  • @dxiv "_Some operations are described as implicitly creating objects within a specified region of storage. For each operation that is specified as implicitly creating objects, that operation implicitly creates and starts the lifetime of zero or more objects of implicit-lifetime types ([basic.types]) in its specified region of storage if doing so would result in the program having defined behavior_" ... "_Such operations do not start the lifetimes of subobjects of such objects that are not themselves of implicit-lifetime types_". – Myrddin Krustowski Mar 21 '21 at 18:33
  • @dxiv Please see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html#affected-types. Aggregate types (**arrays with any element type**, aggregate classes with any members). – Myrddin Krustowski Mar 21 '21 at 18:37
  • @dxiv: Note that some of the answers to that question are no longer valid in C++20. – Nicol Bolas Mar 21 '21 at 19:31
  • @NicolBolas Right, that second part was more for historical context/background. – dxiv Mar 21 '21 at 20:25

1 Answer


The whole point of implicit object creation is that it is implicit. That is, you don't do anything to get it to happen. Once IOC occurs on a piece of memory, you may use the memory as if the object in question exists, and so long as you do that, your code works.

When you get your T* back from allocator_traits<>::allocate, if you add 1 to the pointer, then the function has returned an array of at least 1 element (the new pointer could be the past-the-end pointer for the array). If you add 1 again, then the function has returned an array of at least 2 elements. Etc. None of this is undefined behavior.

If you do something inconsistent with this (casting to a different pointer type and acting as though there is an array there), or if you act as though the array extends beyond the size of the storage that IOC applies to, then you get UB.

So allocator_traits::allocate doesn't really have to do anything, so long as the memory that the allocator allocated implicitly creates objects.
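As a rough illustration of the point above (not something taken from the standard's wording), here is a minimal sketch of using storage obtained from allocator_traits<>::allocate. The helper name demo_allocate and the choice of std::string as the element type are assumptions made for the example, and it assumes n >= 2 so that the first and last slots are distinct.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper (assumes n >= 2): allocates storage for n strings,
// constructs only the first and the last element, and returns the sum of
// their lengths.
std::size_t demo_allocate(std::size_t n)
{
    std::allocator<std::string> alloc;
    using traits = std::allocator_traits<std::allocator<std::string>>;

    // Storage for n elements; no std::string is alive in it yet.
    std::string* p = traits::allocate(alloc, n);

    // Pointer arithmetic inside the allocation does not need live elements.
    std::string* last = p + (n - 1);

    // Element lifetimes are started explicitly before the elements are used.
    traits::construct(alloc, p, "first");
    traits::construct(alloc, last, "last");

    std::size_t total = p->size() + last->size();

    // End the lifetimes that were started, then release the storage.
    traits::destroy(alloc, last);
    traits::destroy(alloc, p);
    traits::deallocate(alloc, p, n);
    return total;
}
```

The elements in between are never constructed and never accessed; only the pointer arithmetic relies on the array itself.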


// does it return an implicitly constructed array (as long as 
// subsequent usage is valid) or a T* pointer that does not "point"
// to a T object that was constructed, hence UB

Neither. It returns a pointer (to type T) to storage into which objects may have been implicitly created already. Which objects have been implicitly created depends on how you use this storage. And merely doing a cast doesn't constitute "using" the storage.

It isn't the reinterpret_cast that causes UB; it's using the pointer returned by an improper reinterpret_cast that's the problem. And since IOC works based on the operation that would have caused UB, IOC doesn't care what you cast the pointer to.

Part and parcel of the IOC rules is the corollary "suitably created object" rule. This rule says that certain operations (like malloc and operator new) return a pointer to a "suitably created object". Essentially it's back to quantum superposition: if IOC retroactively creates an object to make your code work, then these functions retroactively return a pointer to whichever created object makes your code work.
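For an element type that is itself implicit-lifetime, the rule can be sketched as follows. This is only an illustration under the C++20 rules described here; the helper name sum_via_malloc is made up for the example.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: writes 0..n-1 into a malloc'ed buffer and sums it.
// Returns -1 if the allocation fails.
int sum_via_malloc(std::size_t n)
{
    // Under C++20, malloc implicitly creates objects. The code below treats
    // the storage as an array of int, so that is what is taken to exist.
    int* p = static_cast<int*>(std::malloc(sizeof(int) * n));
    if (p == nullptr) return -1;

    for (std::size_t i = 0; i < n; ++i)
        p[i] = static_cast<int>(i);   // assigns to an implicitly created int

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += p[i];

    std::free(p);
    return sum;
}
```

No placement new is needed here because int is an implicit-lifetime type; that would not be the case for a type such as std::string.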

So if your code uses the pointer as a T* and does pointer arithmetic on that pointer, then malloc returned a pointer to the first element of an array of Ts. How big is that array? That depends: how big was the allocation, and how far did you do your pointer arithmetic? Does it have live Ts in them? That depends: do you try to access any Ts in the array?
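The same idea with an element type that is not implicit-lifetime might look like the sketch below. It follows the reading given in this answer (cast to T*, do pointer arithmetic, create elements only where needed); the helper name demo_sparse, the capacity of 8 and the slot index 5 are arbitrary choices for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical helper: allocates room for 8 strings, creates only the one
// at index 5, and returns its length.
std::size_t demo_sparse()
{
    constexpr std::size_t n = 8;

    // operator new implicitly creates objects; the array is taken to exist
    // because the code below uses the storage as an array of std::string.
    void* raw = ::operator new(sizeof(std::string) * n);
    std::string* arr = static_cast<std::string*>(raw);

    // No element is alive yet. Only arithmetic is done on arr until an
    // element has been created with placement new.
    std::string* slot = ::new (static_cast<void*>(arr + 5)) std::string("hello");

    std::size_t len = slot->size();

    // End the one lifetime that was started, then release the storage.
    std::destroy_at(slot);
    ::operator delete(raw);
    return len;
}
```

The other seven slots are never read or written, so whether a std::string ever lives in them does not matter.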

Nicol Bolas
  • Would you mind to clarify it a bit with code examples (with new and malloc) and not built-in `allocate` function? I am interested how to produce similar behavior. I will add some code snippets to make my question more clear. – Myrddin Krustowski Mar 21 '21 at 18:40
  • @RazielMagius: The whole point of IOC is to make the code you have now that "works" actually work within the object model, but without making the object model not be an object model. I find your cast to a `T(*)[]` dubious (you should just be casting to a `T*`), but I try to avoid understanding details of C-style arrays and how they relate to pointers. – Nicol Bolas Mar 21 '21 at 18:42
  • There is something that bothers me. `malloc` or `new` should return a pointer to a suitably created object. When I return `reinterpret_cast<T*>(malloc(... ))`, what is the suitably created object here? It points to the first element of the array, which has not been created yet (and might not be created at all, in case I want to use the "uninitialized array" to implement something like a hash set (unless it is invalid usage, and I am not sure about it, and I should just use a ). – Myrddin Krustowski Mar 21 '21 at 20:53
  • (Like I might create array[13] and array[37] later, depending on the hash function, but never create array[0]). Does implicit lifetime cover this case as well? Or it is only valid when elements are stored contiguously? – Myrddin Krustowski Mar 21 '21 at 20:53
  • @RazielMagius: Your question has now become "how does implicit object creation work", which is answered in the linked answer. – Nicol Bolas Mar 21 '21 at 20:54
  • As for the example in the linked answer - again, we have implicit construction of a float and an int. There is no question about what is constructed. – Myrddin Krustowski Mar 22 '21 at 01:58
  • Interpreting the result of malloc or new as a pointer to an array gives me defined behavior (as it is a pointer to a valid object)... treating it as the address of the first element - I am not sure, since the result is not a pointer to a valid object - unless "storage of array element 0" is a valid object, regardless of whether an actual object exists there or not. – Myrddin Krustowski Mar 22 '21 at 02:01
  • @RazielMagius: "*The trouble is that arrays are slightly different beasts than the ones I see in the examples*" No, they're not. Let's say `malloc` returned a `void*` which is a pointer to an array of `T`. If you `reinterpret_cast` that to a `T*`, this is fine because that's how pointers to arrays work. If instead `malloc` returned a `void*` pointing to the first `T` in that array, and you cast it to a `T*`, that's also fine. So it doesn't matter which one gets returned; you're right either way. – Nicol Bolas Mar 22 '21 at 02:14
  • If it returns a pointer to an array, shouldn't its type be `T(*)[]` (possibly with the size inside, `[n]`)? But casting `T(*)[]` to `T*` is not legal - an array and the first element of that array are not pointer-interconvertible. – Myrddin Krustowski Mar 22 '21 at 05:50
  • There is one problem. At least one implicitly constructed object needs to exist at the moment you use reinterpret_cast I think. as you can see in all the examples so far. It doesn't work with `T` if `T` doesn't have implicit lifetime. – Myrddin Krustowski Mar 22 '21 at 08:07
  • `T* arr = (T*) malloc(sizeof(T) * n)`. If `n=1`, we implicitly construct an object that doesn't necessarily have implicit lifetime. On the other hand, `reinterpret_cast (malloc(sizeof(T) ) )` implicitly creates an array of `T` of size `1`, which is allowed. – Myrddin Krustowski Mar 22 '21 at 08:08
  • _Your question has now become "how does implicit object creation work", which is answered in the linked answer._ Yes and no. The trouble is that arrays are slightly different beasts than the ones I see in the examples, so the ambiguity is not completely resolved. The thing is that if you compare with the example in the standard, eel.is/c++draft/intro.object#example-3, then the result of malloc is a pointer to an object. Analogously, if my object is an array, the result should be a pointer to an array and not to its first element - there is room for ambiguity. – Myrddin Krustowski Mar 22 '21 at 08:32
  • When I think about it a bit more, as long as I do not dereference the obtained pointer, (at least not before creating objects) and use it for purposes of arithmetic (within reasonable bounds), I do not get UB, even if there is no _object_ at the beginning of the uninitialized array. – Myrddin Krustowski Mar 22 '21 at 13:06
  • Even if the wording of the standard is not clear, implicitly it is meant to be cast to `T*` and not to `T(*)[]` by the paper author, as appears here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html#16535-table-34-cpp17allocator-requirements-tabcpp17allocator and here https://eel.is/c++draft/allocator.requirements#tab:cpp17.allocator-row-18-column-3-example-1 – Myrddin Krustowski Mar 22 '21 at 13:06