
The current draft standard explicitly states that placement new[] can have a space overhead:

This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another.

So presumably they had something in mind as to why a compiler might need this overhead. What is it? Can a compiler use this overhead for anything useful?

In my understanding, to destroy this array, the only solution is to call the destructors in a loop (am I right about this?), as there is no placement delete[] (by the way, shouldn't we have a placement delete[] to properly destroy the array, not just its elements?). So the compiler doesn't have to know the array length.
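To make that concrete, this is what I mean by calling the destructors in a loop (a minimal sketch; whether the buffer is really big enough to absorb any overhead is exactly the open question here):

#include <new>

struct Foo {
    ~Foo() { }
};

int main() {
    char buffer[1024];              // deliberately larger than 3 * sizeof(Foo),
                                    // because the array new-expression may add an overhead

    Foo *foo = new(buffer) Foo[3];

    // there is no placement delete[], so destroy the elements manually,
    // in reverse order of construction
    for (int i = 3; i-- > 0; )
        foo[i].~Foo();
}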

I thought that, as this overhead cannot be used for anything useful, compilers wouldn't use it (so this wouldn't be an issue in practice). I've checked compilers with this simple code:

#include <stdio.h>
#include <new>

struct Foo {
    ~Foo() { }
};

int main() {
    char buffer1[1024];
    char buffer2[1024];

    float *fl = new(buffer1) float[3]; // element type with a trivial destructor
    Foo *foo = new(buffer2) Foo[3];    // element type with a non-trivial destructor

    printf("overhead for float[]: %d\n", (int)(reinterpret_cast<char*>(fl) - buffer1));
    printf("overhead for Foo[]  : %d\n", (int)(reinterpret_cast<char*>(foo) - buffer2));
}

GCC and clang don't use any overhead at all. But MSVC uses 8 bytes for the Foo case. For what purpose could MSVC use this overhead?


Here's some background, why I put this question.

There were previous questions about this subject:

As far as I can see, the moral of those questions is to avoid placement new[] and use placement new in a loop instead. But that solution doesn't create an array; it creates elements sitting next to each other, which is not an array, so using operator[] on them is undefined behavior. Those questions are more about how to avoid placement new[], while this question is more about the "why?".

geza
  • Be careful of interpreting a "difference in behaviour" from this simple test. I pasted your code into godbolt and found that gcc has realised that the call to `placement new[]` is totally redundant, and has removed it! https://godbolt.org/g/94Deyp – Richard Hodges Jul 16 '18 at 07:50
  • @RichardHodges: Hmm, why is it relevant here? – geza Jul 16 '18 at 07:57
  • Note also that gcc is able to "see" that Foo's destructor is a no-op. I would certainly expect it to take advantage of the rules in favour of efficient code. – Richard Hodges Jul 16 '18 at 07:57
  • The point is that the compiler in this case sees that there is no need to store a magic number to tell it how long the array is - as it does not need to call N destructors when you call delete[]. gcc is taking full advantage of its knowledge of the float and Foo types in order to optimise memory storage and behaviour of these 2 allocated arrays. – Richard Hodges Jul 16 '18 at 07:59
  • @RichardHodges: Ah, I see. But my example's output is the same with optimizations turned off, and also if I move `new(buffer2) Foo[3];` into a separate function (with `buffer2` as an input parameter). So it is not the optimization. – geza Jul 16 '18 at 08:04
  • now remove the *definition* of Foo's destructor. See how the code changes. – Richard Hodges Jul 16 '18 at 08:22
  • @RichardHodges: it's the same, 0 overhead. – geza Jul 16 '18 at 08:26
  • @RichardHodges: _The point is that the compiler in this case (can) see that there is no need to store a magic number_... In the case of 'regular' `new`, I would expect (but have not tested) that the compiler would include the overhead anyway, regardless of whether the class actually has a destructor or not. I mean nobody really cares and it's just the line of least resistance. But I could be wrong. – Paul Sanders Jul 16 '18 at 13:02
  • @geza Riiiiiiiiiiiight. And why wouldn't it be? - please see my answer. – Paul Sanders Jul 16 '18 at 13:05

4 Answers

4

The current draft standard explicitly states ...

To clarify, this rule has (probably) existed since the first version of the standard (the earliest version I have access to is C++03, which does contain this rule, and I found no defect report about needing to add it).

So presumably they had something in mind as to why a compiler might need this overhead

My suspicion is that the standard committee didn't have any particular use case in mind, but added the rule in order to keep the existing compiler(s?) with this behaviour compliant.

For what purpose could MSVC use this overhead? "why?"

These questions could confidently be answered only by the MS compiler team, but I can propose a few conjectures:

The space could be used by a debugger, which would allow it to show all of the elements of the array. It could be used by an address sanitiser to verify that the array isn't overflowed. That said, I believe both of these tools could store the data in an external structure.

Considering the overhead is only reserved in the case of a non-trivial destructor, it might be that it is used to store the number of elements constructed so far, so that the compiler knows which elements to destroy in the event of an exception in one of the constructors. Again, as far as I know, this could just as well be stored in a separate temporary object on the stack; a sketch of that alternative follows.
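Something along these lines (purely illustrative; the names are mine, and this is not how any particular compiler actually does it):

#include <cstddef>
#include <new>

struct Gadget {
    Gadget() { /* may throw in real code */ }
    ~Gadget() { }
};

// Roughly what new(buf) Gadget[n] has to guarantee: if a constructor
// throws, exactly the already-constructed elements are destroyed.
// The count of constructed elements lives on the stack; no per-array
// cookie is required for this.
Gadget *construct_array(void *buf, std::size_t n) {
    Gadget *first = static_cast<Gadget*>(buf);
    std::size_t constructed = 0;
    try {
        for (; constructed < n; ++constructed)
            new(first + constructed) Gadget();
    } catch (...) {
        while (constructed-- > 0)
            first[constructed].~Gadget();
        throw;
    }
    return first;
}

int main() {
    alignas(Gadget) char buf[8 * sizeof(Gadget)];
    Gadget *g = construct_array(buf, 8);
    for (std::size_t i = 8; i-- > 0; )
        g[i].~Gadget();
}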


For what it's worth, the Itanium C++ ABI agrees that the overhead isn't needed:

No cookie is required if the new operator being used is ::operator new[](size_t, void*).

Where cookie refers to the array length overhead.
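As a side note, the cookie can be observed with ordinary (non-placement) new[] by giving a class its own allocation function and printing the size the compiler asks for; how much is requested beyond N * sizeof(T), if anything, is implementation-specific (the class below is mine, purely for illustration):

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

struct WithDtor {
    ~WithDtor() { }

    // class-specific allocation function: the size the compiler requests
    // includes whatever cookie it wants to store in front of the elements
    // (error handling omitted for brevity)
    static void *operator new[](std::size_t size) {
        std::printf("requested %zu bytes, 5 * sizeof(WithDtor) = %zu\n",
                    size, 5 * sizeof(WithDtor));
        return std::malloc(size);
    }
    static void operator delete[](void *p) { std::free(p); }
};

int main() {
    // with a non-trivial destructor, implementations following the Itanium C++ ABI
    // request 5 * sizeof(WithDtor) + sizeof(std::size_t) bytes here
    WithDtor *p = new WithDtor[5];
    delete[] p;
}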

eerorika
  • _but added the rule in order to keep the existing compiler(s?) with this behaviour compliant._ Compliant with what? And: _The space could be used by a debugger, which would allow it to show all of the elements of the array._ interesting idea, although maybe a bit of a stretch. And: _Itanium C++ ABI agrees that the overhead isn't needed:_ Also interesting, seems they're wising up. Updated the issue I posted at GitHub. – Paul Sanders Jul 16 '18 at 13:14
  • Maybe MSVC just always stores the number of elements before the array, even when the type has a trivial destructor and so the number of elements isn't actually needed. – Jonathan Wakely Jul 16 '18 at 13:48
  • @PaulSanders _"seems they're wising up"_ that decision was made nearly 20 years ago. – Jonathan Wakely Jul 16 '18 at 13:48
  • @Jonathan Oh yes! Good point, too much wacky baccy for me this morning. What happened with MSVC then? Maybe [this](https://bigthink.com/the-proverbial-skeptic/those-who-do-not-learn-history-doomed-to-repeat-it-really). – Paul Sanders Jul 16 '18 at 13:55
  • They may be unable or unwilling to change their ABI now to remove the overhead for types with trivial destructors. – Jonathan Wakely Jul 16 '18 at 13:57
  • @Jonathan It's not, as far as I can see, an ABI issue (we're talking about _placement_ `new`, remember, so destructors don't come into it). – Paul Sanders Jul 16 '18 at 13:58
  • @PaulSanders `Compliant with what?` compliant with the C++ standard. To clarify: MSVC pre-exists the standard, and presumably had the behaviour (of adding overhead to placement-array-new), and my conjecture is that the standard rule exists specifically to allow the behaviour so that MSVC (and other compilers having such behaviour if any other ever existed) could become compliant without breaking backwards compatibility. – eerorika Jul 16 '18 at 14:32
  • @JonathanWakely `Maybe MSVC just always stores the number of elements before the array, even when the type has a trivial destructor and so the number of elements isn't actually needed.` That's not the behaviour of MSVC according to geza. In their test, MSVC does *not* store the number of elements when the destructor is trivial (the `float` case). The overhead isn't actually needed by other compilers even when the destructor isn't trivial. – eerorika Jul 16 '18 at 15:05
  • @user2079303 Oh I see, thank you. And: oh my, I hope not. No, I just think it got forgotten about. In fact I know so because I chased it down, see my updated post at GitHub (there was some further discussion there). And now you have said that I see why it got shunted back to the EWG, so thanks for mentioning it. – Paul Sanders Jul 16 '18 at 17:49
  • @PaulSanders Adding a rule to allow overhead doesn't seem to me like something that would come into existence by "forgetting" it. I don't see any discussion in the core issue about the reasons why the rule was specified. – eerorika Jul 16 '18 at 19:23
  • @user2079303 No, I mean they forgot to get _rid_ of it, check out my post at GitHub. Like you, I don't know why it was ever allowed in the first place. Oversight, probably, everyone makes mistakes (and some of mine are lulus :) Or maybe for the reason you first suggested (which, IMO, would be the worst of all possible reasons). – Paul Sanders Jul 16 '18 at 21:06
1

Dynamic array allocation is implementation-specific. But one common practice when implementing dynamic array allocation is to store the array's size before its beginning (that is, before the first element). This fits perfectly with:

representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[].
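Purely as an illustration of that idea (the names here are made up), a hand-rolled new[]/delete[] pair that keeps the count in front of the elements could look like this:

#include <cstddef>
#include <new>

struct Elem {
    ~Elem() { }
};

// Hypothetical sketch of what a generated new[]/delete[] pair might do
// when it stores the element count (the "cookie") in front of the array.
Elem *fake_new_array(void *raw, std::size_t n) {
    *static_cast<std::size_t*>(raw) = n;                      // write the cookie
    Elem *first = reinterpret_cast<Elem*>(
        static_cast<char*>(raw) + sizeof(std::size_t));       // elements follow it
    for (std::size_t i = 0; i < n; ++i)
        new(first + i) Elem();
    return first;
}

void fake_delete_array(Elem *first) {
    char *raw = reinterpret_cast<char*>(first) - sizeof(std::size_t);
    std::size_t n = *reinterpret_cast<std::size_t*>(raw);     // read the cookie back
    while (n-- > 0)
        first[n].~Elem();
}

int main() {
    alignas(std::size_t) char buffer[sizeof(std::size_t) + 8 * sizeof(Elem)];
    Elem *e = fake_new_array(buffer, 8);
    fake_delete_array(e);
}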

"Placement delete" would not make much sense. What delete does is call destructor and free memory. delete calls destructor on all of the array elements and frees it. Calling destructor explicitly is in some sense "placement delete".

bartop
0

The current draft standard explicitly states that placement new[] can have a space overhead ...

Yes, beats the hell out of me too. I posted it (rightly or wrongly) as an issue on GitHub, see:

https://github.com/cplusplus/draft/issues/2264

So presumably they had something in mind as to why a compiler might need this overhead. What is it? Can a compiler use this overhead for anything useful?

Not so far as I can see, no.

In my understanding, to destroy this array, the only solution is to call the destructors in a loop (am I right about this?), as there is no placement delete[] (by the way, shouldn't we have a placement delete[] to properly destroy the array, not just its elements?). So the compiler doesn't have to know the array length.

For the first part of what you say there, absolutely. But we don't need a placement delete [] (we can just call the destructors in a loop, because we know how many elements there are).

I thought that, as this overhead cannot be used for anything useful, compilers wouldn't use it (so this wouldn't be an issue in practice). I've checked compilers with this simple code:

...

GCC and clang don't use any overhead at all. But MSVC uses 8 bytes for the Foo case. For what purpose could MSVC use this overhead?

That's depressing. I really thought that no compiler would do this because there's no point. It's only used by delete[] and you can't use that with placement new anyway, so...

So, to summarise, the purpose of placement new[] should be to let the compiler know how many elements there are in the array so that it knows how many constructors to call. And that's all it should do. Period.

Paul Sanders
-1

(Edit: added more detail)

But that solution doesn't create an array; it creates elements sitting next to each other, which is not an array, so using operator[] on them is undefined behavior.

As far as I understand, this is not quite true.

[basic.life]
The lifetime of an object of type T begins when:
(1.1) — storage with the proper alignment and size for type T is obtained, and
(1.2) — if the object has non-vacuous initialization, its initialization is complete

Initialisation of an array consists of initialisation of its elements. (Important: this statement may not be directly supported by the standard. If it is indeed not supported, then this is a defect in the standard which makes creation of variable length arrays other than by new[] undefined. In particular, users cannot write their own replacement for std::vector. I don't believe this to be the intent of the standard).

So whenever there is a char array suitably sized and aligned for an array of N objects of type T, the first condition is satisfied.

In order to satisfy the second condition, one needs to initialise N individual objects of type T. This initialisation may be portably achieved by incrementing the original char array address by sizeof(T) at a time, and calling placement new on the resulting pointers.
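A sketch of that procedure (the names are mine, purely illustrative):

#include <cstddef>
#include <new>

struct T {
    T() { }
    ~T() { }
};

int main() {
    const std::size_t N = 4;

    // storage with the proper alignment and size for N objects of type T
    alignas(T) unsigned char storage[N * sizeof(T)];

    // initialise N individual objects, advancing by sizeof(T) each time
    T *objs[N];
    unsigned char *p = storage;
    for (std::size_t i = 0; i < N; ++i, p += sizeof(T))
        objs[i] = new(p) T();

    // ... use the objects through the pointers placement new returned;
    // whether objs[0] may be treated as an array of N T's is exactly
    // the point being debated here ...

    // destroy them in reverse order of construction
    for (std::size_t i = N; i-- > 0; )
        objs[i]->~T();
}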

n. m. could be an AI
  • "Initialisation of an array consists of initialisation of its elements." I think this may not be true. Like, if you have a class (object), and this class has members (subobjects), you need to `new` the class; `new`ing its subobjects (members) is not enough. Likewise, if one wants to create an array properly, one needs to `new` the array (object); `new`ing its elements (subobjects) is not enough. – geza Jul 16 '18 at 10:54
  • *"Initialisation of an array consists of initialisation of its elements."* but not exclusively so. OTOH this isn't an answer to the posted question I'm afraid. – Passer By Jul 16 '18 at 12:57
  • _but not exclusively so_ How so, if it's just a concatenation of objects? – Paul Sanders Jul 16 '18 at 13:08
  • An array is a special type of object, it's not just a bunch of objects of the same type adjacent to each other. You need to actually create an array, not just create a bunch of objects adjacent to each other. This actually makes `std::vector` undefined (see https://wg21.link/cwg2182 which doesn't do a very good job of describing the issue). – Jonathan Wakely Jul 16 '18 at 13:53
  • @JonathanWakely Please take a look at the quote from the standard I have provided and subsequent reasoning. In my opinion each sentence logically follows from the previous one. Which of the sentences is at fault? – n. m. could be an AI Jul 16 '18 at 16:00
  • @geza when the element type has vacuous initialisation, so does the array itself. So an array has nothing to add to its subobject initialisation. – n. m. could be an AI Jul 16 '18 at 16:04
  • @JonathanWakely oh and I'm totally OK with being in the same boat with vector. That is, if some provision of the standard doesn't allow me to write my own replacement for std::vector, the only intellectually honest action is to reject that provision as broken. – n. m. could be an AI Jul 16 '18 at 16:10
  • @n.m. Broken? Certainly. But being broken doesn't change its meaning. The error with your reasoning is the same one as with casting the result from `malloc` to a type with vacuous initialization and claiming there is an object there: there isn't. Otherwise there would be Schrodinger's objects there. [This](https://stackoverflow.com/questions/49038246/at-what-point-does-the-lifetime-of-a-trivial-type-created-by-placement-new-start) might be a relevant read. – Passer By Jul 16 '18 at 16:55
  • @PasserBy *Otherwise there would be Schrodinger's objects there* There *are* Schrodinger's objects there, per the standard quote. I'm totally fine with them. Why are they a problem? There's actually a paper somewhere (don't remember for C or for C++) that advocates explicit recognition of such objects by the relevant standard, and I'm all for it. – n. m. could be an AI Jul 16 '18 at 16:59