63

I know that there is no way in C++ to obtain the size of a dynamically created array, such as:

int* a;
a = new int[n];

What I would like to know is: Why? Did people just forget this in the specification of C++, or is there a technical reason for this?

Isn't the information stored somewhere? After all, the command

delete[] a;

seems to know how much memory it has to release, so it seems to me that delete[] has some way of knowing the size of a.

jarauh
  • 24
    Dynamic arrays are a misdesign of C++. Such facilities are much better provided by a library solution (e.g. `std::vector`). (Of course templates were only added to C++ later than `new`, so that's speaking with hindsight only.) – Kerrek SB Apr 07 '16 at 08:49
  • 12
    `delete[] a` doesn't necessarily need to know the size any more than `free(p)` does. The only reason you might need to know the size is if you need to call destructors, but for `int` there's no such need. – Kerrek SB Apr 07 '16 at 08:52
  • When C++ was designed, compilers had to work with very limited resources. In that context, the C++ compiler was a neat piece of artwork. Nowadays there is no reason for that constraint, and yet the C++ compiler has not changed at all. – Baltasarq Apr 07 '16 at 08:54
  • @Kerrek Ok, so why does `free(p)` not need to know how much memory (starting at p) needs to be freed? – jarauh Apr 07 '16 at 08:54
  • 6
    @jarauh: The practical reason is that many allocators round up the allocated sizes. – MSalters Apr 07 '16 at 08:58
  • 9
    @jarauh: It needs to know how much memory needs to be freed, but not how many "objects" that memory represents, because C does not have destructors to invoke. – Lightness Races in Orbit Apr 07 '16 at 08:58
  • 2
    `C++` allows you to build software at any level you want from the ground up. Most people should not be using raw arrays. Most people should be using `std::vector`. Compiler and Library designers etc... still need the flexibility to decide how they will store size information without the language itself imposing a specific way. – Galik Apr 07 '16 at 09:00
  • 5
    @jarauh 1. There is a discussion about this in comp.std.c at the moment. malloc/free may just call into system allocators that don't enable the user code to find out the allocated block size. 2. free() may know that malloc() gave you 64 bytes, but it has no clue whether that's because you asked for 1 int, or 16. – Martin Bonner supports Monica Apr 07 '16 at 09:03
  • 1
    I'm being pedantic here, but it's actually simple to obtain the size of the dynamic array: `n` or `n * sizeof(*a)` if you want it in bytes. I think you mean to say that there is no way to obtain the size of the dynamically allocated array using just a pointer to it. – eerorika Apr 07 '16 at 09:07
  • sizeof(*a) will be 4, as typeof(*a) is int... – Aconcagua Apr 07 '16 at 09:10
  • @MartinBonner and MSalters Thank you, that seems to be the answer I was looking for. Please write this as an answer, so I can accept it. – jarauh Apr 07 '16 at 09:17
  • @LightnessRacesinOrbit Ok, but if the memory would always be `numberOfObjects*sizeOfObjects`, it would suffice to know the total memory. But apparently, the memory may be larger. – jarauh Apr 07 '16 at 09:18
  • 1
    @Aconcagua: `sizeof(int)` is not always 4. – Lightness Races in Orbit Apr 07 '16 at 09:51
  • [Why do C arrays not keep track of their length?](http://programmers.stackexchange.com/questions/237286) – fredoverflow Apr 07 '16 at 10:46
  • @LightnessRacesinOrbit: True, and I know. 4 is just the most common case. I have actually seen sizeof(int) == 2, and once even == 1. What I really wanted to make clear is that sizeof(*a) is not the size of the array but the size of its first element. It would have been better to write == sizeof(int), though. – Aconcagua Apr 07 '16 at 10:58
  • @jarauh: It's up to the system allocator to keep track of that information. The user doesn't have to. – Kerrek SB Apr 07 '16 at 11:07
  • Why would the language require a facility to either (1) tell you what you already know: "it's `n`" or (2) correct a design failure in your objects: "if you wanted to associate this number to this object, why didn't you store it where you could get it?" – Eric Towers Apr 07 '16 at 13:26
  • 1
    @EricTowers (2) because it forces you to save redundant data, which C++ doesn't like to do. It also requires you to write boilerplate code to pass around a `struct T_ptr{ T*t; size_t size;}` instead of just a `T*t`, which produces cache locality issues. I can see how supporting this could be a legitimate design choice. – nwp Apr 07 '16 at 13:43
  • 1
    @fredoverflow Thanks, but the accepted answer to that question focuses on static arrays. – jarauh Apr 07 '16 at 18:10
  • [How does `delete[]` “know” the size of the operand array?](http://stackoverflow.com/q/197675/995714) – phuclv Apr 08 '16 at 04:16
  • C++ inherited its array semantics directly from C. C doesn't store any array metadata (such as length) in either the array object or any pointer to a dynamically-allocated region of memory, for which [Ritchie had his reasons](https://www.bell-labs.com/usr/dmr/www/chist.pdf)(see the section titled "Embryonic C" starting on page 6). It's up to the dynamic memory manager implementation to keep track of how much memory is associated with a particular pointer. In your code, you know how much memory you asked for, and you're expected to keep track of that data. – John Bode Apr 08 '16 at 13:37
  • @LưuVĩnhPhúc How is this question a duplicate of that one? They ask different things. This question is about the technical reasons why the heap allocator does not *expose* what it knows about an array's size. It is not about how the allocator knows the size: the question already presumes that the allocator knows the size. – Jordan Melo Apr 13 '16 at 21:08
  • 1
    This is **not** a duplicate of the other question. C++ is not C, and it is legitimate to ask why C++ does not let you query the allocation size when `delete[]` invokes the destructor on the correct number of elements (so obviously that size is known). Which is just not the same as `malloc`/ `free` knowing a raw block's size (though admittedly it's a similar thing). – Damon Nov 18 '17 at 19:59

7 Answers

41

It follows from the fundamental rule of "don't pay for what you don't need". In your example, delete[] a; doesn't need to know the number of elements in the array, because int doesn't have a destructor. If you had written:

std::string* a;
a = new std::string[n];
...
delete [] a;

Then the delete has to call destructors (and needs to know how many to call) - in which case the new has to save that count. However, given it doesn't need to be saved on all occasions, Bjarne decided not to give access to it.
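A minimal sketch of how one might observe this, assuming the implementation stores the element count in extra bytes requested from the allocation function (a common technique, but not something the language guarantees). The names g_last_array_request and bytes_requested_for_array are hypothetical helpers introduced only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical instrumentation: the byte count of the most recent
// array allocation request seen by operator new[].
static std::size_t g_last_array_request = 0;

// Replacement global array allocation function that records each request.
void* operator new[](std::size_t bytes) {
    g_last_array_request = bytes;
    if (void* p = std::malloc(bytes != 0 ? bytes : 1)) return p;
    throw std::bad_alloc();
}

// Matching deallocation functions (unsized and sized forms).
void operator delete[](void* p) noexcept { std::free(p); }
void operator delete[](void* p, std::size_t) noexcept { std::free(p); }

// Returns how many bytes the new-expression `new T[n]` asked for.
template <typename T>
std::size_t bytes_requested_for_array(std::size_t n) {
    g_last_array_request = 0;
    T* volatile keep = new T[n];  // volatile: discourage eliding the allocation
    const std::size_t requested = g_last_array_request;
    delete[] keep;
    return requested;
}
```

Comparing bytes_requested_for_array<int>(10) with 10 * sizeof(int), and bytes_requested_for_array<std::string>(10) with 10 * sizeof(std::string), shows whether a given implementation asks for extra room in either case. Many implementations add bookkeeping bytes only for the std::string array, but that is an implementation detail and should be checked rather than assumed.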

(In hindsight, I think this was a mistake ...)

Even with int of course, something has to know about the size of the allocated memory, but:

  • Many allocators round the size up to some convenient multiple (say 64 bytes), for alignment or similar reasons. The allocator knows that a block is 64 bytes long - but it doesn't know whether that is because n was 1 ... or 16.

  • The C++ run-time library may not have access to the size of the allocated block. If for example, new and delete are using malloc and free under the hood, then the C++ library has no way to know the size of a block returned by malloc. (Usually of course, new and malloc are both part of the same library - but not always.)
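The rounding argument can be sketched with a toy model. This is not how any particular allocator works; kBlockSize, rounded_block_size and max_elements_that_fit are hypothetical names, and the 64-byte block size is just the figure used above:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical allocator policy: every request is rounded up to a whole
// number of fixed-size blocks.
constexpr std::size_t kBlockSize = 64;

// Size of the block the toy allocator would hand out for a request.
constexpr std::size_t rounded_block_size(std::size_t requested_bytes) {
    return ((requested_bytes + kBlockSize - 1) / kBlockSize) * kBlockSize;
}

// All that can be deduced from the block size alone: an upper bound on
// the number of elements, not the count that was actually requested.
constexpr std::size_t max_elements_that_fit(std::size_t block_bytes,
                                            std::size_t element_size) {
    return block_bytes / element_size;
}
```

In this model a 4-byte request and a 64-byte request both end up in a 64-byte block, so the block size cannot distinguish one 4-byte int from sixteen of them.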

  • 4
    even with int, doesn't the size have to be known? To mark the memory free in whatever structures are used to keep evidence of the memory allocated / free? – bolov Apr 07 '16 at 09:00
  • 4
    If it's a mistake, at least it's a forward compatible mistake. If the committee ever decides they want to correct it, they can do so without breaking anything. The reverse would not be true. – Benjamin Lindley Apr 07 '16 at 09:03
  • 18
    @bolov Not necessarily. The size of the allocated block is probably known to the allocator, but that may not be the size which you asked for; a larger block could be allocated for reasons like avoiding fragmentation, or alignment concerns on some platforms. – TartanLlama Apr 07 '16 at 09:13
  • 4
    C++14 requires array-delete to call the sized deallocation function when the array element type is not trivially destructible. [Demo](http://melpon.org/wandbox/permlink/EihMitkUJsyBDt7O) – Kerrek SB Apr 07 '16 at 11:08
  • @KerrekSB Just noticed your comment. 1. The requirement to use array-delete to delete an array is present in all versions of C++. 2. The requirement is absolute - even if the element type *is* trivially destructible (failing to use array delete is undefined behaviour). – Martin Bonner supports Monica Feb 18 '19 at 07:41
  • 1
    @MartinBonner: That wasn't the detail I was referring to. I was specifically referring to the array-deallocation function that is called as a result of the delete expression, and as of C++14, there are multiple candidates (sized vs unsized). As of C++17, there are even more (namely alignment-extended). – Kerrek SB Feb 18 '19 at 10:48
13

One fundamental reason is that there is no difference between a pointer to the first element of a dynamically allocated array of T and a pointer to any other T.

Consider a fictitious function that returns the number of elements a pointer points to.
Let's call it "size".

Sounds really nice, right?

If it weren't for the fact that all pointers are created equal:

char* p = new char[10];
size_t ps = size(p+1);  // What?

char a[10] = {0};
size_t as = size(a);     // Hmm...
size_t bs = size(a + 1); // Wut?

char i = 0;
size_t is = size(&i);  // OK?

You could argue that the first should be 9, the second 10, the third 9, and the last 1, but to accomplish this you need to add a "size tag" on every single object.
A char will require 128 bits of storage (because of alignment) on a 64-bit machine. This is sixteen times more than what is necessary.
(Above, the ten-character array a would require at least 168 bytes.)
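The cost of such a tag can be sketched as follows, assuming the tag is a std::size_t stored next to each object. TaggedChar and tag_overhead_factor are hypothetical names for this illustration, and the exact numbers depend on the platform:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "size-tagged" char: the element count travels with the object.
struct TaggedChar {
    std::size_t count;  // the size tag
    char value;         // the one byte of actual payload
};

// How many times larger the tagged object is than a plain char.
constexpr std::size_t tag_overhead_factor() {
    return sizeof(TaggedChar) / sizeof(char);
}
```

On a typical 64-bit platform std::size_t is 8 bytes with 8-byte alignment, so padding brings sizeof(TaggedChar) to 16, which matches the 128 bits mentioned above.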

This may be convenient, but it's also unacceptably expensive.

You could of course envision a version that is only well-defined if the argument really is a pointer to the first element of a dynamic allocation by the default operator new, but this isn't nearly as useful as one might think.

molbdnilo
  • 3
    I understand this. What I was wondering about was: When the array gets deleted, suddenly the compiler seems to remember something about the size of the array. What I learned from the above comments is that the compiler only remembers how much memory was reserved, which only gives a lower bound on the size of the array. – jarauh Apr 07 '16 at 10:11
  • 3
    The simpler solution is for all of these uses to be undefined behavior. `size()` could be defined to work only on `new[]` arrays. – John Kugelman Apr 07 '16 at 12:27
  • 2
    @jarauh Try deleting `a[1]`. What would you expect to happen? What actually happens? :) Also, why do you think the *compiler* knows how big the array is? Did the C++ compiler solve the halting problem while I wasn't looking? :D – Luaan Apr 07 '16 at 13:31
  • 2
    You might also want to bring up subarrays. I might use `new` to generate an array of 10 `char`s, but then use it as two separate arrays of 5 `char`s. The compiler has no way of knowing that I have given different semantic meaning to different parts of the array, and so the `size()` function would give the incorrect value. – Eldritch Cheese Apr 07 '16 at 14:26
  • 2
    @Luaan Deleting `&a[1]` can work with trivial objects iff the backing memory allocator accepts free using pointers from anywhere within its allocation range. C++ does not demand this, and requires that you delete from the address you got from `new` – Yakk - Adam Nevraumont Apr 07 '16 at 15:32
  • It would be UB to call `delete[]` on anything other than `p` in that example. Why should we expect `size` to behave any differently? Just say it is only defined when called for something that could legally be `delete[]`ed. Even if it only returns the actual size in memory instead of the requested size, I can see situations where that would be useful. For example, if you have a vector class that would like to know how much storage space it actually has available before it has to re-allocate. – Joel Croteau Aug 25 '19 at 22:20
4

You are right that some part of the system has to know something about the size. But getting at that information is probably not covered by the API of the memory management system (think malloc/free), and the exact size that you requested may not be known, because it may have been rounded up.

Carsten S
2

There is a curious case of overloading the operator delete that I found in the form of:

void operator delete[](void *p, size_t size);

The parameter size seems to default to the size (in bytes) of the block of memory to which void *p points. If this is true, it is reasonable to at least hope that it has a value passed by the invocation of operator new and, therefore, would merely need to be divided by sizeof(type) to deliver the number of elements stored in the array.
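A small sketch of this overload in class scope, assuming a class with a non-trivial destructor so that the sized form receives a meaningful value. Tracked and its static counters are hypothetical names used only to record what the allocation and deallocation functions are given:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct Tracked {
    int value = 0;
    ~Tracked() {}  // non-trivial destructor, so delete[] must know the count

    static std::size_t last_new_bytes;     // bytes requested by new[]
    static std::size_t last_delete_bytes;  // bytes reported to delete[]

    // Class-specific array allocation function: records the request.
    static void* operator new[](std::size_t bytes) {
        last_new_bytes = bytes;
        if (void* p = std::malloc(bytes != 0 ? bytes : 1)) return p;
        throw std::bad_alloc();
    }

    // Sized class-specific array deallocation function.
    static void operator delete[](void* p, std::size_t bytes) noexcept {
        last_delete_bytes = bytes;
        std::free(p);
    }
};

std::size_t Tracked::last_new_bytes = 0;
std::size_t Tracked::last_delete_bytes = 0;
```

After `Tracked* a = new Tracked[5]; delete[] a;` the two counters should agree. Note that the value is the size of the whole allocation in bytes, which may include any bookkeeping the implementation added, so dividing by sizeof(Tracked) gives at best an upper bound on the element count.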

As for the "why" part of your question, Martin's rule of "don't pay for what you don't need" seems the most logical.

Community
Jean Louw
  • 2
    You're assuming that the size of the allocated block is equal to the size of the array, which I don't think is necessarily true. Of course, it cannot be smaller, but I don't believe there's anything that precludes it being larger (for example, if the allocation can only be done in fixed block sizes). – Iker Apr 07 '16 at 10:58
  • 2
    C++14 requires that this deallocation function be called by an array-delete expression when the array element type is not trivially destructible. – Kerrek SB Apr 07 '16 at 11:09
  • @KerrekSB I (weakly, without anything but vague memory) thought it was only called if it failed during construction? Can you pull the quote? – Yakk - Adam Nevraumont Apr 07 '16 at 15:33
  • @Yakk: You might be thinking of the placement form. The ordinary `delete` also calls a deallocation function :-) – Kerrek SB Apr 07 '16 at 15:39
2

You will often find that memory managers will only allocate space in a certain multiple, 64 bytes for example.

So, you may ask for new int[4], i.e. 16 bytes, but the memory manager will allocate 64 bytes for your request. To free this memory it doesn't need to know how much memory you asked for, only that it has allocated you one block of 64 bytes.

The next question may be: can it not store the requested size? That is an added overhead which not everybody is prepared to pay for. An Arduino Uno, for example, has only 2 KB of RAM, and in that context 4 bytes for each allocation suddenly becomes significant.
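To make that overhead concrete, here is a minimal sketch of an allocation wrapper that does store the requested size in a header placed in front of the user's block. SizeHeader, allocate_with_size, requested_size and free_with_size are hypothetical names, not part of any standard API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Header placed in front of every block. It is over-aligned so that the
// memory following it stays suitably aligned for any ordinary object.
struct alignas(std::max_align_t) SizeHeader {
    std::size_t requested_bytes;
};

// Allocates `bytes` plus room for the header; returns the user pointer.
void* allocate_with_size(std::size_t bytes) {
    void* raw = std::malloc(sizeof(SizeHeader) + bytes);
    if (raw == nullptr) return nullptr;
    SizeHeader* header = new (raw) SizeHeader{bytes};
    return header + 1;  // user memory starts right after the header
}

// Reads back the size that was originally requested.
std::size_t requested_size(const void* user) {
    return (static_cast<const SizeHeader*>(user) - 1)->requested_bytes;
}

// Releases a block obtained from allocate_with_size.
void free_with_size(void* user) {
    if (user != nullptr) std::free(static_cast<SizeHeader*>(user) - 1);
}
```

Every allocation made this way costs sizeof(SizeHeader) extra bytes. Because of the alignment padding in this sketch that is usually more than the 4 bytes mentioned above, which illustrates why a small system may not want to pay for it on every allocation.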

If you need that functionality then you have std::vector (or equivalent), or you have higher-level languages. C/C++ was designed to enable you to work with as little overhead as you choose to make use of, this being one example.

DewiW
1

There's no way to know how you are going to use that array. The allocation size does not necessarily match the element number so you cannot just use the allocation size (even if it was available).

This is a deep flaw in other languages, not in C++. You get the functionality you want with std::vector and still retain raw access to arrays. Retaining that raw access is critical for any code that actually has to do some work.

Many times you will perform operations on subsets of the array. When extra book-keeping is built into the language, you have to reallocate the sub-arrays and copy the data out in order to manipulate them with an API that expects a managed array.

Just consider the trite case of sorting the data elements. If you have managed arrays then you can't use recursion without copying data to create new sub-arrays to pass recursively.
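As an illustration of the raw-pointer style described here, a recursive quicksort can operate on sub-ranges of a single array without copying anything. This is a generic sketch of the technique, not code from the answer; quicksort is simply the name chosen for it:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Sorts the half-open range [first, last) in place. Each recursive call
// receives a sub-range of the same array; no sub-array is ever copied.
void quicksort(int* first, int* last) {
    if (last - first < 2) return;  // zero or one element: already sorted
    const int pivot = first[(last - first) / 2];
    // Elements smaller than the pivot go to the front.
    int* middle1 = std::partition(first, last,
                                  [pivot](int x) { return x < pivot; });
    // Elements equal to the pivot follow them.
    int* middle2 = std::partition(middle1, last,
                                  [pivot](int x) { return !(pivot < x); });
    quicksort(first, middle1);  // sub-range of smaller elements
    quicksort(middle2, last);   // sub-range of larger elements
}
```

A call such as quicksort(a, a + n) works equally well for a = new int[n], for a built-in array, or for the storage of a std::vector, because a pair of pointers is all the function needs.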

Another example is an FFT which recursively manipulates the data starting with 2x2 "butterflies" and works its way back to the whole array.

To fix the managed array you now need "something else" to patch over this defect, and that "something else" is called iterators. (You now have managed arrays but almost never pass them to any functions, because you need iterators more than 90% of the time.)

Quazil
-4

The size of an array allocated with new[] is not visibly stored anywhere, so you can't access it. Also, the new[] operator doesn't return an array, just a pointer to the array's first element. If you want to know the size of a dynamic array, you must store it manually or use classes from libraries, such as std::vector.
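Both options can be sketched briefly. SizedIntArray, make_sized_int_array and make_int_vector are hypothetical names for this illustration; std::unique_ptr<int[]> is used only so that the example does not leak:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Option 1: store the size manually, next to the pointer.
struct SizedIntArray {
    std::unique_ptr<int[]> data;  // owns the array created by new[]
    std::size_t size;             // element count, remembered by hand
};

SizedIntArray make_sized_int_array(std::size_t n) {
    // new int[n]() value-initializes, so every element starts at zero.
    return SizedIntArray{std::unique_ptr<int[]>(new int[n]()), n};
}

// Option 2: let std::vector keep track of the size.
std::vector<int> make_int_vector(std::size_t n) {
    return std::vector<int>(n);  // n zero-initialized elements
}
```

With the first option the count is available as the size member; with the second, size() reports it and the vector releases the memory automatically.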

Aconcagua
Gor