Array placement-new requires unspecified overhead in the buffer?

Question

5.3.4 [expr.new] of the C++11 Feb draft gives the example:

new(2,f) T[5] results in a call of operator new[](sizeof(T)*5+y,2,f).

Here, x and y are non-negative unspecified values representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[]. This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another. —end example ]

Now take the following example code:

void* buffer = malloc(sizeof(std::string) * 10);
std::string* p = ::new (buffer) std::string[10];

According to the above quote, the second line new (buffer) std::string[10] will internally call operator new[](sizeof(std::string) * 10 + y, buffer) (before constructing the individual std::string objects). The problem is that if y > 0, the pre-allocated buffer will be too small!

So how do I know how much memory to pre-allocate when using array placement-new?

void* buffer = malloc(sizeof(std::string) * 10 + how_much_additional_space);
std::string* p = ::new (buffer) std::string[10];

Or does the standard somewhere guarantee that y == 0 in this case? Again, the quote says:

This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions.

Question from http://chat.stackoverflow.com/transcript/message/2270516#2270516 — Mooing Duck, Jan 04 '12 at 00:05
@JaredKrumsie: The C++11 standard doesn't clarify. Apparently it simply represents any arbitrary value of any arbitrary type. For the purpose of this particular question, I suppose it must represent a `char*`. — Mooing Duck, Jan 04 '12 at 00:38
I don't think you can know that at all. I think placement new was always thought of as a tool for using your own memory manager rather than as something that lets you pre-allocate memory. Anyway, why don't you simply loop through the array with regular `new`? I don't think it will influence performance much, because placement new is basically a no-op and the constructors for all objects in the array have to be called separately anyway. — j_kubik, Jan 04 '12 at 01:01
@j_kubik that's not as simple as it looks! If one of the constructors throws midway through the loop you have to clean up the objects you already constructed, something array-new forms do for you. But everything seems to indicate placement-array-new cannot be safely used. — R. Martinho Fernandes, Jan 04 '12 at 01:07
What is the point of the `x` and `y` additional space (had to find the `x` value [here](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3663.html#expr.delete), since it wasn't included)? If it is for when an exception occurs, then it should be stated in the standard. If it is compiler implementation specific, that makes it totally useless from a portability standpoint. — Adrian, Jun 22 '13 at 15:53
@Adrian: The point of the space is _presumably_ so that the implementation can tell how many destructors to call. Without that unspecified space, it would be nearly impossible for `delete[]` to know how many objects there are. — Mooing Duck, Jun 22 '13 at 18:38
Same idea. To be able to determine how many objects to call the destruction on. However, since `delete[]` *requires* this, this should be *defined* in the standard somewhere as to what it contains, or at least that it should be either defined as a class/struct that has particular properties. — Adrian, Jun 23 '13 at 07:20
@Adrian: An implementation could also place other information there, such as alignment, or whatever it needs. If it was defined in the standard, it would be impossible to implement correctly in a standards compliant way. That's _why_ they have implementation defined details... — Mooing Duck, Jun 23 '13 at 17:28
True, but it would make for portable code if all of these things could be stipulated in the document and have a means of determining these values through code. I find that C++ is still somewhat of an experimental language, so I think the designers don't want to paint themselves into a corner. But that's the reason for communication between the different stakeholders and the designers, to attempt to keep that from happening. — Adrian, Jun 24 '13 at 05:26
@Adrian: it's also plausible that it's designed this way so that an implementation could store the number of destructors to call in one place, and have 0 overhead in another place in the same program, if the value is known elsewhere. — Mooing Duck, Jun 24 '13 at 16:28
That is what would make sense and how I thought that it was done. However, if that were the case, it should be an implementation detail of the `operator new[]` and `operator delete[]` in whatever scope they are located in to deal with this extra overhead internally rather than having this overhead passed along with the minimal required space. I think that was the original intent, but if a constructor throws an exception, this can cause a problem if it's not known how many elements have been constructed. What's really missing from C++ is a way to define how to construct an array of elements. — Adrian, Jun 24 '13 at 18:23
_This overhead may be applied in all array new-expressions, including those referencing the library function operator `new[](std::size_t, void*)`_ Ugh, that's horrible (and I'm not sure I believe it - it's nonsensical). — Paul Sanders, Jul 12 '18 at 05:26
Huh, still in the latest draft standard: http://eel.is/c++draft/expr.new#15. I still don't believe it. — Paul Sanders, Jul 12 '18 at 05:30
Wow, ended up here watching https://youtu.be/IAdLwUXRUvg?t=1337. This is extremely scary, and it follows that placement new for arrays is practically unusable. Are there warnings in place in major compilers to avoid such catastrophic failure? — Ad N, Oct 31 '19 at 13:05

Howard Hinnant · Accepted Answer · 2021-03-21T21:38:08.420

51

Update

Nicol Bolas correctly points out in the comments below that this has been fixed such that the overhead is always zero for operator new[](std::size_t, void* p).

This fix was done as a defect report in November 2019, which makes it retroactive to all versions of C++.

Original Answer

Don't use operator new[](std::size_t, void* p) unless you know a priori the answer to this question. The answer is an implementation detail and can change with compiler/platform, though it is typically stable for any given platform. For example, it is something the Itanium ABI specifies.

If you don't know the answer to this question, write your own placement array new that can check this at run time:

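#include <cstddef>
#include <iostream>
#include <new>
#include <string>

// Placement array new that checks the requested size n (which includes any
// array overhead y) against the size of the supplied buffer.
inline
void*
operator new[](std::size_t n, void* p, std::size_t limit)
{
    if (n <= limit)
        std::cout << "life is good\n";
    else
        throw std::bad_alloc();
    return p;
}

int main()
{
    alignas(std::string) char buffer[100];
    std::string* p = new(buffer, sizeof(buffer)) std::string[3];
    // The elements must be destroyed by hand; delete[] p would be wrong here.
    for (std::size_t i = 3; i > 0; --i)
        p[i - 1].~basic_string();
}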

By varying the array size and inspecting n in the example above, you can infer y for your platform. For my platform y is 1 word. The sizeof(word) varies depending on whether I'm compiling for a 32 bit or 64 bit architecture.
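The measurement described above can be wrapped in a small helper. This is only a sketch: `SizeProbe` and `array_overhead` are hypothetical names that do not appear in the answer, and the 256 bytes of slack is an arbitrary guess meant to be larger than any plausible cookie. It uses a user-defined placement form that records n instead of printing.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical probe: carries the buffer in and the requested size out.
struct SizeProbe {
    void*       storage;    // caller-provided buffer
    std::size_t requested;  // filled in by operator new[] below
};

// User-defined placement form; it records n and hands back the buffer.
inline void* operator new[](std::size_t n, SizeProbe& probe)
{
    probe.requested = n;
    return probe.storage;
}

// Matching placement delete, called only if an element constructor throws.
inline void operator delete[](void*, SizeProbe&) noexcept {}

// Returns y = requested bytes - sizeof(T) * Count on this implementation.
template <class T, std::size_t Count>
std::size_t array_overhead()
{
    // Assumed slack of 256 bytes so that any cookie still fits.
    alignas(std::max_align_t) static unsigned char raw[sizeof(T) * Count + 256];
    SizeProbe probe{raw, 0};
    T* p = new (probe) T[Count];
    for (std::size_t i = Count; i > 0; --i)
        p[i - 1].~T();  // destroy by hand; delete[] must not be used here
    return probe.requested - sizeof(T) * Count;
}
```

Calling `array_overhead<std::string, 3>()` and `array_overhead<int, 3>()` side by side is one way to see whether y depends on the destructor being trivial on a given platform.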

edited Mar 21 '21 at 21:38

answered Jan 04 '12 at 04:13

Howard Hinnant


There's another good idea I never considered! I believe the spec says it may vary from call to call, but this handles even that correctly! – Mooing Duck Jan 04 '12 at 06:50
How does this account for alignment, though? Is the offset guaranteed to fit into the required alignment? – Kerrek SB Jan 05 '12 at 00:30
Can't you infer `y` simply by taking the difference of the (bytecast) pointers `buffer` and `p`? – Kerrek SB Jan 05 '12 at 01:44
1

@Kerrek SB: You are correct that I was careless with alignment. I've added `alignas` to the client code to make things right. The placement new expression should take care of alignment with respect to the "cookie" and the "data" respectively. For example here is how the Itanium ABI does it (http://sourcery.mentor.com/public/cxx-abi/abi.html#array-cookies). And yes, you can infer `y` as you suggest. Be aware that `y` may be dependent on the alignment of the new'd type, and on whether or not that type has a trivial destructor (and other platforms may have other details). – Howard Hinnant Jan 05 '12 at 02:11
5

@HowardHinnant: I'm still baffled that the *placement* version requires any cookie at all. What's it for? What's in it? After all, the *only* way you can destroy those array elements is by hand, isn't it? Your link even says that there's no cookie for the placement version `(size_t, void*)`. Do you think the non-zeroness of the cookie should be a defect report? – Kerrek SB Jan 05 '12 at 02:14
2

@Kerrek SB: Well that's a good question and I'm not sure I have a good answer for it. I suppose that some hypothetical user-written placement delete, which is called in case there is an exception thrown during the default construction of each element, might make use of the cookie during clean up. But I don't have a good example of such a case in my back pocket. And even if such a hypothetical user-written placement delete existed, it would necessarily be platform dependent. On the bright side, it is legal for sizeof(y) to be 0. :-) – Howard Hinnant Jan 05 '12 at 03:50
2

If you would like to submit a defect report on this, it should be aimed at the CWG (as opposed to the LWG). Here is the CWG issues list: http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html . And your best strategy for submitting an issue is to email the author of that list. I don't know if an issue demanding `y == 0` always would be successful if for no other reason than backwards compatibility with established ABI's such as the Itanium ABI. Breaking ABI at this low level is very daunting. – Howard Hinnant Jan 05 '12 at 03:54
1

@HowardHinnant: Thanks! I posted a DR to the standard mailing list for now, let's see if it makes it past the moderators! Unfortunately I can't reproduce any case where `y` is ever non-zero, but I don't have access to an Itanium, alas. – Kerrek SB Jan 05 '12 at 04:37
5

It seems that here is already [a defect report](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#476) on this matter! D'oh... – Kerrek SB Jan 05 '12 at 04:56
Wow, after 7 years of no action maybe you're the person to write that requested paper! :-) – Howard Hinnant Jan 05 '12 at 14:37
@HowardHinnant: Who, me? I made a DR post on the mailing list, but no response as of yet. – Kerrek SB Jan 11 '12 at 21:37
@KerrekSB: The cookie is needed when the compiler decides it needs the length of the allocated array. Note well: when it _so decides_, and thus it is _not_ always present. Notably, when objects with trivial destructors are involved, the cookie might just be omitted. This happens with _any_ placement overloads, which means that _all of them_ should be avoided for arrays, not only the "buffer" overload for `void*`. – alecov Aug 24 '16 at 19:57
@alecov _Array placement-new_ (`void* operator new[]( std::size_t count, void* ptr );`) does **not** allocate memory. It's a no-op. – bit2shift Nov 08 '18 at 20:46
@alecov that there's a defect in the standard that has been ignored for far too long. – bit2shift Nov 09 '18 at 23:25
@HowardHinnant: FWIW, this has been changed in C++20. The non-allocating placement `operator new[]` doesn't deal with these offsets anymore. – Nicol Bolas Mar 21 '21 at 19:32
Thanks! I've updated the answer with this information. – Howard Hinnant Mar 21 '21 at 21:38

Kerrek SB · Answer 2 · 2013-06-02T13:08:52.897

Update: After some discussion, I understand that my answer no longer applies to the question. I'll leave it here, but a real answer is definitely still called for.

I'll be happy to support this question with some bounty if a good answer isn't found soon.

I'll restate the question here as far as I understand it, hoping that a shorter version might help others understand what's being asked. The question is:

Is the following construction always correct? Is arr == addr at the end?

void * addr = std::malloc(N * sizeof(T));
T * arr = ::new (addr) T[N];                // #1

We know from the standard that #1 causes the call ::operator new[](???, addr), where ??? is an unspecified number no smaller than N * sizeof(T), and we also know that that call only returns addr and has no other effects. We also know that arr is offset from addr correspondingly. What we do not know is whether the memory pointed to by addr is sufficiently large, or how we would know how much memory to allocate.

You seem to confuse a few things:

Your example calls operator new[](), not ~~operator new()~~.
The allocation functions do not construct anything. They allocate.

What happens is that the expression T * p = new T[10]; causes:

a call to operator new[]() with size argument 10 * sizeof(T) + x,
ten calls to the default constructor of T, effectively ::new (p + i) T().

The only peculiarity is that the array-new expression asks for more memory than what is used by the array data itself. You don't see any of this and cannot make use of this information in any way other than by silent acceptance.

If you are curious how much memory was actually allocated, you can simply replace the array allocation functions operator new[] and operator delete[] and make it print out the actual size.
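A minimal sketch of that replacement follows. The names `g_last_array_request`, `g_sink` and `observed_overhead` are hypothetical, and the allocation is simply forwarded to `std::malloc`. The pointer is written to a volatile variable so that an optimizer is less likely to elide the allocation altogether.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>
#include <string>

static std::size_t g_last_array_request = 0;  // hypothetical bookkeeping
static void* volatile g_sink = nullptr;       // lets the pointer escape

// Replacement for the global array allocation function.
void* operator new[](std::size_t n)
{
    g_last_array_request = n;
    std::printf("operator new[] asked for %zu bytes\n", n);
    if (void* p = std::malloc(n ? n : 1))
        return p;
    throw std::bad_alloc();
}

// Matching replacements for the array deallocation functions.
void operator delete[](void* p) noexcept { std::free(p); }
void operator delete[](void* p, std::size_t) noexcept { std::free(p); }

// Returns how many bytes beyond sizeof(T) * count `new T[count]` requested.
template <class T>
std::size_t observed_overhead(std::size_t count)
{
    T* p = new T[count];
    g_sink = p;
    const std::size_t requested = g_last_array_request;
    delete[] p;
    return requested - sizeof(T) * count;
}
```

For example, `observed_overhead<std::string>(10)` reports the x chosen by the implementation for that particular new-expression; the value is not required to be the same for other types or other calls.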

Update: As a random piece of information, you should note that the global placement-new functions are required to be no-ops. That is, when you construct an object or array in-place like so:

T * p = ::new (buf1) T;
T * arr = ::new (buf10) T[10];

Then the corresponding calls to ::operator new(std::size_t, void*) and ::operator new[](std::size_t, void*) do nothing but return their second argument. However, you do not know what buf10 is supposed to point to: It needs to point to 10 * sizeof(T) + y bytes of memory, but you cannot know y.

You should expand on the difference between what `new` does and the `operator new` function. Until the linked convo, I thought `new` was simply syntactic sugar. Also, the calls to `operator new` instead of `operator new[]` were typos. I did it AGAIN in this comment :( — Mooing Duck, Jan 04 '12 at 00:08
@MooingDuck: I, and others, have done so countless times before on SO. I recommend a Good Book, or searching SO. — Kerrek SB, Jan 04 '12 at 00:09
But what about `new(buf) T[10]`? How do you make `buf` big enough? (Coming from the chat discussion I know this is *the actual intended question*, but it was not made clear :( ) — R. Martinho Fernandes, Jan 04 '12 at 00:11
@R.MartinhoFernandes: You're absolutely right; I've amended the answer, and basically I don't have an answer to the question now. I won't delete this unless someone takes exception to it, but we definitely need a proper answer still. — Kerrek SB, Jan 04 '12 at 01:03
Just to clarify, we agree that `::new(buf) T[n]` requires exactly `sizeof(T[n])` bytes, right? And that it's the unqualified call, `new(buf) T[n]`, that is unspecified? — GManNickG, Jan 04 '12 at 01:28
@GMan: No! On the contrary: We have no idea how much memory is required by `::new (buf) T[n]`! That's what the initial quote from 5.3.4 says: We call `::operator new[](sizeof(T) * n + y, buf)`, with no knowledge about `y`. — Kerrek SB, Jan 04 '12 at 01:32
@KerrekSB: I think there's a contradiction in your answer. First you say `We also know that arr is offset from addr correspondingly` for `T * arr = ::new (addr) T[N];` then you say `arr == buf10` for `T * arr = ::new (buf10) T[10];` ... which is it? — etherice, Jun 02 '13 at 12:20
@etherice: You're right, I'm not really sure why I wrote that. The global placement allocation function is a no-op, but you can't control the amount of space that's required. So `buf10` needs to point to `10 * sizeof(T) + y` bytes of memory, but you cannot know `y`. I'll edit this. — Kerrek SB, Jun 02 '13 at 13:07

M.M · Answer 3 · 2016-04-14T22:39:42.257

As mentioned by Kerrek SB in comments, this defect was first reported in 2004, and it was resolved in 2012 as:

The CWG agreed that EWG is the appropriate venue for dealing with this issue.

Then the defect was reported to EWG in 2013, but closed as NAD (presumably means "Not A Defect") with the comment:

The problem is in trying to use array new to put an array into pre-existing storage. We don't need to use array new for that; just construct them.

which presumably means that the suggested workaround is to use a loop with a call to non-array placement new once for each object being constructed.
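Such a loop might look like the sketch below. `construct_n` and `destroy_n` are hypothetical helper names, and the cleanup on a throwing constructor is the part that array new would otherwise do for you. The sketch assumes the storage is suitably aligned for T and at least `count * sizeof(T)` bytes long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Default-constructs count objects of type T in raw storage, one at a time,
// using the non-array placement new. If a constructor throws, the elements
// built so far are destroyed in reverse order and the exception is rethrown.
template <class T>
T* construct_n(void* storage, std::size_t count)
{
    T* first = static_cast<T*>(storage);
    std::size_t built = 0;
    try {
        for (; built < count; ++built)
            ::new (static_cast<void*>(first + built)) T();
    } catch (...) {
        while (built > 0)
            first[--built].~T();
        throw;
    }
    return first;
}

// Destroys count objects in reverse order; the storage itself is untouched.
template <class T>
void destroy_n(T* first, std::size_t count) noexcept
{
    while (count > 0)
        first[--count].~T();
}
```

Because no array new-expression is involved, exactly `count * sizeof(T)` bytes are needed and the first element sits at the start of the buffer.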

A corollary not mentioned elsewhere on the thread is that this code causes undefined behaviour for all T:

T *ptr = new T[N];
::operator delete[](ptr);

Even if we comply with the lifetime rules (i.e. T either has trivial destruction, or the program does not depend on the destructor's side-effects), the problem is that ptr has been adjusted for this unspecified cookie, so it is the wrong value to pass to operator delete[].

score 7 · Answer 4 · answered Jan 04 '12 at 05:58

7

Calling any version of operator new[] () won't work too well with a fixed size memory area. Essentially, it is assumed that it delegates to some real memory allocation function rather than just returning a pointer to the allocated memory. If you already have a memory arena where you want to construct an array of objects, you want to use std::uninitialized_fill() or std::uninitialized_copy() to construct the objects (or some other form of individually constructing the objects).

You might argue that this means that you have to destroy the objects in your memory arena manually as well. However, calling delete[] array on the pointer returned from the placement new won't work: it would use the non-placement version of operator delete[] ()! That is, when using placement new you need to manually destroy the object(s) and release the memory.
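As a rough illustration of the `std::uninitialized_fill` route and the manual cleanup it implies, consider the sketch below. `make_filled` and `release` are hypothetical names, the storage comes from `std::malloc`, and types with extended alignment are assumed not to be used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <string>

// Allocates raw storage and constructs count copies of value in it.
// std::uninitialized_fill_n destroys any elements it already built if a
// copy constructor throws, so only the storage has to be freed here.
template <class T>
T* make_filled(std::size_t count, const T& value)
{
    void* raw = std::malloc(sizeof(T) * (count ? count : 1));
    if (!raw)
        throw std::bad_alloc();
    T* first = static_cast<T*>(raw);
    try {
        std::uninitialized_fill_n(first, count, value);
    } catch (...) {
        std::free(raw);
        throw;
    }
    return first;
}

// Manual counterpart: run the destructors, then release the storage.
// delete[] must not be used on a pointer obtained this way.
template <class T>
void release(T* first, std::size_t count) noexcept
{
    for (std::size_t i = count; i > 0; --i)
        first[i - 1].~T();
    std::free(first);
}
```

A call such as `make_filled<std::string>(4, "abc")` has to be paired with `release(p, 4)` using the same count, since nothing in the buffer records the array length.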

answered Jan 04 '12 at 05:58

Dietmar Kühl


1

Good point about placement operator delete[](). @Mooing Duck: pay attention to it. – Sergey Podobry Jan 04 '12 at 06:17
1

I'm aware that placement-newed objects have to be deleted manually. uninitialized_fill is a good idea, but you seem to be saying that the overloaded operator new for arrays that takes a buffer in the C++ spec won't work for what it's designed for. Is that what you're saying? (That _is_ what chat determined.) – Mooing Duck Jan 04 '12 at 06:49
2

placement operator new[]() works for what it is intended for: allocating memory in a way that uses additional arguments and constructing objects in this memory. What doesn't seem to work portably is the version which only takes a void* to already allocated memory. Given that you wouldn't know where the objects end up, it seems questionable anyway. – Dietmar Kühl Jan 04 '12 at 07:01
2

The entire point is that only the standard `delete[]` operator requires the information that is stored in the extra bytes (both for going through the array, invoking each element's destructor, and for passing the size of the array to the deallocation function, if it needs it). The interesting question for me is now whether the standard actually says so, or if we've found a defect. – Simon Richter Jan 04 '12 at 08:08
I don't think this qualifies as a defect. However, I agree that the standard may be enhanced to remove the possibility of using more memory than the objects need. – Dietmar Kühl Jan 04 '12 at 08:22

score 4 · Answer 5 · answered Mar 21 '21 at 19:21

Note that C++20 changes this answer.

C++17's (and before) [expr.new]/11 clearly says that this function may get an implementation defined offset to its size:

When a new-expression calls an allocation function and that allocation has not been extended, the new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array.

This permits, but does not require, that the size given to the array allocation function could be increased from sizeof(T) * size.

C++20 explicitly disallows this. From [expr.new]/15:

When a new-expression calls an allocation function and that allocation has not been extended, the new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array and the allocation function is not a non-allocating form ([new.delete.placement]).

Emphasis added. Even the non-normative note you quoted was changed:

This overhead may be applied in all array new-expressions, including those referencing a placement allocation function, except when referencing the library function operator new[](std::size_t, void*).
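On an implementation that follows the revised wording, the result of a non-allocating array placement new should be the buffer address itself. The sketch below checks this; `placement_array_starts_at_buffer` is a hypothetical name, and the 64 bytes of slack are an arbitrary safety margin so that the check stays within the buffer even on an implementation that still inserts a cookie.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Returns true if ::new (buffer) T[N] yields a pointer equal to buffer,
// i.e. if no array overhead was placed in front of the elements.
template <class T, std::size_t N>
bool placement_array_starts_at_buffer()
{
    // Assumed slack of 64 bytes; not needed when the overhead is zero.
    alignas(T) unsigned char buffer[sizeof(T) * N + 64];
    T* p = ::new (static_cast<void*>(buffer)) T[N];
    const bool same = static_cast<void*>(p) == static_cast<void*>(buffer);
    for (std::size_t i = N; i > 0; --i)
        p[i - 1].~T();  // destroy by hand; delete[] does not apply here
    return same;
}
```

A result of `false` for some T would indicate that the compiler in use still applies an offset for the library form `operator new[](std::size_t, void*)`.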

But other forms of placement new (i.e. not the specified non-allocating form) may still incur an overhead? — ph3rin, Mar 23 '21 at 00:11

bit2shift · Answer 6 · 2019-10-12T20:50:08.110

This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions.

This is a defect in the standard. Rumor has it they couldn't find a volunteer to write an exception to it (Message #1173).

The non-replaceable array placement-new cannot be used with delete[] expressions, so you need to loop through the array and call each destructor.

The overhead is targeted at the user-defined array placement-new functions, which allocate memory just like the regular T* tp = new T[length]. Those are compatible with delete[], hence the overhead that carries the array length.

score 1 · Answer 7 · answered Jan 04 '12 at 22:54

After reading the corresponding sections of the standard, I am starting to think that placement new for array types is simply a useless idea, and that the only reason the standard allows it is the generic way in which the new-expression is described:

The new-expression attempts to create an object of the type-id (8.1) or new-type-id to which it is applied. The type of that object is the allocated type. This type shall be a complete object type, but not an abstract class type or array thereof (1.8, 3.9, 10.4). [Note: because references are not objects, references cannot be created by new-expressions. ] [Note: the type-id may be a cv-qualified type, in which case the object created by the new-expression has a cv-qualified type. ]

new-expression: 
    ::(opt) new new-placement(opt) new-type-id new-initializer(opt)
    ::(opt) new new-placement(opt) ( type-id ) new-initializer(opt)

new-placement: ( expression-list )

new-type-id:
    type-specifier-seq new-declarator(opt)

new-declarator:
    ptr-operator new-declarator(opt)
    direct-new-declarator

direct-new-declarator:
    [ expression ]
    direct-new-declarator [ constant-expression ]

new-initializer: ( expression-list(opt) )

To me it seems that array placement new simply stems from compactness of the definition (all possible uses as one scheme), and it seems there is no good reason for it to be forbidden.

This leaves us in a situation where we have a useless operator, which needs memory allocated before it is known how much of it will be needed. The only solutions I see would be to either overallocate memory and hope that the compiler will not want more than supplied, or to re-allocate memory in an overridden array placement new function/method (which rather defeats the purpose of using array placement new in the first place).

To answer question pointed out by Kerrek SB: Your example:

void * addr = std::malloc(N * sizeof(T));
T * arr = ::new (addr) T[N];                // #1

is not always correct. In most implementations arr!=addr (and there are good reasons for it) so your code is not valid, and your buffer will be overrun.

About those "good reasons": note that the standard's authors release you from some housekeeping when you use the array new operator, and array placement new is no different in this respect. You do not need to inform delete[] about the length of the array, so this information must be kept in the array itself. Where? Exactly in this extra memory. Without it, delete[] would require the array length to be kept separately (as the STL does, using loops and non-array placement new).

There is no placement-delete, though, so that last argument doesn't really work... — Kerrek SB, Jan 05 '12 at 00:28
This is true, but i guess placement or not it should still produce binary-identical structure in memory. — j_kubik, Jan 05 '12 at 21:23
Not in the least! The binary structure isn't mandated anywhere, and it isn't even the same for all standard array-news -- rather, it depends on the type. — Kerrek SB, Jan 05 '12 at 21:39
