
I've read that the delete[] operator is needed because the runtime environment does not keep track of whether the allocated block is an array of objects that require destructor calls or not, yet it does keep track of where in memory the allocated block is stored and, of course, the size of the block.
It would take just one more bit of metadata to remember whether destructors need to be called on delete or not, so why not just do that?

I'm pretty sure there's a good explanation, I'm not questioning it, I just wish to know it.
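
To make the premise concrete, here is the pairing the question is about; this is standard C++, and mismatching the two forms is undefined behavior:

    #include <string>

    int main() {
        std::string* one  = new std::string("x");   // single object
        std::string* many = new std::string[3];     // array of 3 objects

        delete one;      // runs exactly one destructor, then deallocates
        delete[] many;   // runs all 3 destructors, then deallocates
        // delete many;  // undefined behavior: wrong form of delete
    }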

Petruza
  • Are you asking why there is `delete[]` instead of `delete` handling the array deallocation and destruction? – Gene Bushuyev Dec 06 '11 at 18:56
  • It is a small cost, but then, the gain for paying it is also tiny. – Don Reba Dec 06 '11 at 18:58
  • 2
    Unless the standard explicitly says that the size of the allocation is stored somewhere (I doubt it), it is not always necessary to store the allocated size. Take the following example: `int * p = new int;` The only way to delete this in a defined way is to call delete on an `int *`. Thus the size is implicitly known from the fact you know it is an `int`. (Note, calling delete on a `void *` is UB: http://stackoverflow.com/a/941959/239916). – Thomas Eding Dec 06 '11 at 18:59
  • 3
  • @trinithis -- both operators `delete` and `delete[]` take a `void*`, they have no knowledge of type. – Gene Bushuyev Dec 06 '11 at 19:11
  • 2
    @GeneBushuyev, you are confusing the signature of the (possibly user-provided) implementation of the delete operator and the invocation of the operator. The standard allows for the implementation to take an optional size argument, which is set to the size of the object deleted, which is either inferred from the static type information (hence the *UB* for `void *`) or passed by the dtor in the vtable for polymorphic objects. – Simon Richter Dec 06 '11 at 19:19
  • For an example of an allocator requiring the size of the freed block: [AmigaOS FreeMem()](http://www.ummon.eu/En/Amiga/API/AmigaOS/Libraries/exec.library/FreeMem.html). – Simon Richter Dec 06 '11 at 19:25
  • @Simon Richter -- there is no optional size argument, see 18.6 of the standard for the signatures. UB has nothing to do with `void*` argument, the standard explains that compiler and run-time are not required to validate or do anything special if deleted pointer is not a valid object, created by appropriate `new` or `new[]` – Gene Bushuyev Dec 06 '11 at 19:27
  • Sorry, these exist for member operator delete only. IMO, this is a bug in the specification. – Simon Richter Dec 06 '11 at 19:54
  • This whole thread is messed up by unclear questions and imprecise answers. People are confusing delete-expression and operator delete. And I still don't know what the OP wants to know. – Gene Bushuyev Dec 06 '11 at 20:01
  • @trinithis: I get your example, but what about `int *p = new int[34]`? How does `delete[]` know that it has to free (at least) `34 * sizeof(int)`? – Petruza Dec 07 '11 at 15:41
  • @Petruza: It does store the size. http://www.parashift.com/c++-faq-lite/freestore-mgmt.html#faq-16.14 – Thomas Eding Dec 07 '11 at 18:29
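
To make the sized-deallocation point from the Simon Richter / Gene Bushuyev exchange concrete: a class-scope `operator delete` may take a second `std::size_t` parameter, and the implementation passes it the size of the object being deleted. A minimal sketch (the class and the logging are illustrative, not from the thread):

    #include <cstddef>
    #include <cstdio>
    #include <cstdlib>
    #include <new>

    struct Widget {
        // Member sized deallocation: the runtime supplies the object's size,
        // so an allocator that needs the size does no lookup of its own.
        static void* operator new(std::size_t n) {
            void* p = std::malloc(n);
            if (!p) throw std::bad_alloc();
            std::printf("allocating %zu bytes\n", n);
            return p;
        }
        static void operator delete(void* p, std::size_t n) {
            std::printf("freeing %zu bytes\n", n);
            std::free(p);
        }
        double payload[4];
    };

    int main() {
        Widget* w = new Widget;
        delete w;   // invokes Widget::operator delete(p, sizeof(Widget))
    }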

4 Answers


I think the reason is that C++ doesn't force you into anything you don't want. Storing that flag would add extra metadata, and anyone who didn't use it would still pay for it, contrary to the design goals of the C++ language.

When you want the capability you described, C++ does provide a way. It's called `std::vector`, and you should nearly always prefer it, another container, or a smart pointer over raw `new` and `delete`.
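
A sketch of that recommendation; `std::vector` tracks its own element count, so the "extra metadata" exists only where you opt into it:

    #include <string>
    #include <vector>

    int main() {
        std::vector<std::string> names(3, "placeholder");
        names.push_back("one more");
    }   // all four destructors run here; no delete[] in sight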

Mark B
  • Probably... although one could argue that array semantics in C were already broken so it's just going on from there. I am afraid that leaving the size out was a premature micro-optimization. – Matthieu M. Dec 06 '11 at 18:53
  • I don't think this was the question. – Gene Bushuyev Dec 06 '11 at 18:56
  • About `std::vector`... not quite because it can dynamically grow. A simpler container would work. Nonetheless, close enough. – Thomas Eding Dec 06 '11 at 19:05

C++ lets you be as efficient as possible, so if the runtime had to track the number of elements in every block, that would mean an extra 4 bytes used per block.

This could be useful to a lot of people, but it would also prevent full efficiency for people who don't mind writing `[]`.

It's similar to the difference between C++ and Java. Java can be much faster to program in because garbage collection takes memory cleanup off your hands, but C++, if programmed correctly, can be more efficient and use less memory, because it doesn't have to store that bookkeeping and you decide when to free memory blocks.

Austin Heerwagen
  • 2
  • Now go and check any general purpose heap allocator, and see how much overhead it has for each allocation. I wouldn't be surprised by numbers like 16-64 bytes per allocation, so nobody will notice 4-8 bytes for a size. And again, most allocators probably keep track of allocation sizes anyway. – Gene Bushuyev Dec 06 '11 at 19:06

It basically comes down to the language design not wanting to put too many restrictions on implementors. Many C++ runtimes use malloc() for ::operator new() and free() (more or less) for ::operator delete(). Standard malloc/free don't provide the bookkeeping necessary for recording a number of elements, and provide no way of determining the malloc'd size at free time. Adding another level of memory manipulation between new Foo and malloc for every single object is, from the C/C++ point of view, a pretty big jump in complexity/abstraction. Among other things, adding this overhead to every object would break memory management approaches that are designed around knowing exactly what size objects are.
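
One way to see that bookkeeping: on implementations following the Itanium C++ ABI, `new[]` over-allocates by a "cookie" holding the element count, but only when destructors actually need to run. A sketch that makes the over-allocation visible (the exact padding is implementation-specific, so the comments describe typical behavior, not a guarantee):

    #include <cstdio>
    #include <cstdlib>
    #include <new>

    // Instrumented array allocation, to observe what new[] really requests.
    void* operator new[](std::size_t n) {
        std::printf("operator new[] requested %zu bytes\n", n);
        void* p = std::malloc(n);
        if (!p) throw std::bad_alloc();
        return p;
    }
    void operator delete[](void* p) noexcept { std::free(p); }

    struct Trivial  { char c; };
    struct WithDtor { char c; ~WithDtor() {} };

    int main() {
        delete[] new Trivial[8];   // typically requests exactly 8 bytes
        delete[] new WithDtor[8];  // typically requests 8 + sizeof(std::size_t):
                                   // the extra "cookie" stores the element count
    }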

smparkes

There are two things that need to be cleared up here.

First: the assumption that malloc keeps the precise size you asked for.

Not really. malloc only cares about providing a block that is large enough. Although for efficiency reasons it probably won't overallocate much, it will still probably give you a block of a "standard" size, for example a 2^n-byte block. Therefore the real size (as in, the number of objects actually allocated) is effectively unknown.
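
This rounding is observable where the allocator exposes it; glibc, for example, provides `malloc_usable_size()` (non-standard, assumed here):

    #include <cstdio>
    #include <cstdlib>
    #include <malloc.h>   // glibc-specific header for malloc_usable_size

    int main() {
        void* p = std::malloc(100);
        // Often prints a value larger than 100: the allocator rounded the
        // request up to one of its internal size classes.
        std::printf("requested 100, usable %zu\n", malloc_usable_size(p));
        std::free(p);
        return 0;
    }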

Second: the "extra bit" required

Indeed, the information required for a given object to know whether it is part of an array or not would simply be an extra bit. Logically.

As far as implementation is concerned though: where would you actually put that bit?

The memory allocated for the object itself should probably not be touched; the object is using it, after all. So?

  • on some platforms, the bit could be kept in the pointer itself (some platforms ignore a portion of the address bits), but this is not portable (sketched below)
  • otherwise it would require extra storage: at least a byte, except that with alignment constraints it could well amount to 8 bytes.
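
A non-portable sketch of the first bullet's trick: alignment guarantees the low pointer bit is zero, so it can carry the flag. This is illustrative only, per the answer's own caveat; the `TaggedPtr` type is invented for the example:

    #include <cassert>
    #include <cstdint>

    // Stash the "needs array destruction" flag in the low bit of a pointer,
    // relying on alignof(T) >= 2 so that bit is otherwise always zero.
    template <typename T>
    struct TaggedPtr {
        static_assert(alignof(T) >= 2, "need a spare low bit");
        std::uintptr_t bits;

        TaggedPtr(T* p, bool is_array)
            : bits(reinterpret_cast<std::uintptr_t>(p) | (is_array ? 1 : 0)) {}

        T*   get()     const { return reinterpret_cast<T*>(bits & ~std::uintptr_t(1)); }
        bool isArray() const { return bits & 1; }
    };

    int main() {
        double d = 3.14;
        TaggedPtr<double> tp(&d, /*is_array=*/false);
        assert(tp.get() == &d && !tp.isArray());
        return 0;
    }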

Demonstration of the second point: (not convincing, as noted by sth; see below)

// A plain array of doubles:
+-------+-------+-------
|   0   |   1   |   2   
+-------+-------+-------

// A tentative to stash our extra bit
+-------++-------++-------++
|   0   ||   1   ||   2   ||
+-------++-------++-------++

// A correction since we introduced alignment issues
// Note: double's alignment is most probably its own size
+-------+-------+-------+-------+-------+-------
|   0   |  bit  |   1   |  bit  |   2   |  bit
+-------+-------+-------+-------+-------+-------

Humpf!

EDIT

Therefore, on most platforms (where addresses do matter), you would need to "extend" each pointer, and with alignment in play that actually doubles its size.

Is it acceptable for all pointers to be twice as large only so that you can stash that extra bit? For most people, I guess it would be. But C++ is not designed for most people; it is primarily designed for people who care about performance, whether speed or memory, and for them this is not acceptable.

END OF EDIT

So what is the correct answer? The correct answer is that recovering information that the type system lost is costly. Unfortunately.

Matthieu M.
  • 1
  • You wouldn't really need to store a bit for every array element, since it's not valid to `delete` a pointer to an element "inside" the array. Only for the first element do you need to be able to determine whether a whole array follows it or not. – sth Dec 06 '11 at 19:26
  • @sth: true, the demonstration is flawed... and I gave so much love to my ASCII drawing :( I corrected, thanks for the notice. – Matthieu M. Dec 06 '11 at 19:36
  • Again, have you checked the overhead of general purpose heap allocators? – Gene Bushuyev Dec 06 '11 at 19:43
  • @GeneBushuyev: what kind of allocator? If we look at [jemalloc](http://people.freebsd.org/~jasone/jemalloc/bsdcan2006/jemalloc.pdf) for example (figure 4 and accompanying text), the wasted space is measured in terms of bits. It is relatively easy to special-case small allocations to be efficient. – Matthieu M. Dec 07 '11 at 07:20