Related to a recent question, I wrote the following code:

int main()
{
   char* x = new char[33];
   int* sz = (int*)x;
   sz--;
   sz--;
   sz--;
   sz--;
   int szn = *sz; //szn is 33 :)
}

I do know it's not safe and would never use it, but it brings to mind a question:

Is the following safe? Is it a memory leak?

char* allocate()
{
    return new char[20];
}

int main()
{
    char* x = allocate();
    delete[] x;
}

If it's safe, doesn't that mean we can actually find the size of the array? Granted, not in a standard way, but is the compiler required to store information about the size of the array?

I am not using this code, nor do I plan to. I know it is undefined behavior. I know it isn't guaranteed by anything. It's just a theoretical question!

Luchian Grigore
  • How does the first code snippet relate to the second? Why do you think the second snippet is unsafe? – James McNellis Jan 18 '12 at 19:42
  • @JamesMcNellis the first snippet shows that that specific compiler stores information about the size of the array somewhere. Which brought to mind the second snippet. – Luchian Grigore Jan 18 '12 at 19:48
  • @JamesMcNellis, the first snippet illustrates how *in some implementations, at least* the size of the allocated memory is stored ahead of the chunk of memory. That goes to his point about the runtime always knowing the size of the chunk of memory when you delete it. – Paul Tomblin Jan 18 '12 at 19:49
  • ubuntu 11.10, on x64 outputs 0 when printing szn. Makes sense since it's UB – BЈовић Jan 18 '12 at 19:50
  • @VJovic I'd expect that, this works for Win7x64 with MSVS 2008, I doubt it works for many other platforms. – Luchian Grigore Jan 18 '12 at 19:51
  • @BЈовић It outputs 0 because on 64-bit, the size type is 8 bytes, and the sz-- statements only jump backwards 4 bytes, so you're not seeing it. On Ubuntu 64 for me however, jumping back 8 bytes causes a segfault. Using MSVC, jumping back 8 bytes returns 33 for me. – nevelis Dec 01 '13 at 08:00

6 Answers

Is the following safe?

Yes, of course that's safe. The first snippet has undefined behavior, however.

If it's safe, doesn't that mean we can actually find the size of the array? Granted, not in a standard way, but is the compiler required to store information about the size of the array?

Yes, generally extra data is stored before the first element. This is used to call the correct number of destructors. It's UB to access this.

required to store information about the size of the array?

No. The standard only requires that delete[] work as expected. new int[10] could simply be a plain malloc call, which would not necessarily store the requested size 10.
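To illustrate the point about destructors, here is a minimal sketch that stays within defined behavior: it only counts how many destructors delete[] runs, without peeking at where the count is kept. The type Tracked and the function destroyed_by_delete are made-up names for this example.

```cpp
#include <cstddef>

// Hypothetical element type that counts how many times its destructor runs.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Allocates n elements, deletes them, and reports how many destructors ran.
int destroyed_by_delete(std::size_t n)
{
    Tracked::destroyed = 0;
    Tracked* p = new Tracked[n];
    delete[] p;   // n is not passed here; the implementation recovers it somehow
    return Tracked::destroyed;
}
```

Calling destroyed_by_delete(10) should report 10 on any conforming implementation, whichever bookkeeping scheme it uses internally.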

Pubby

This is safe, and is not a memory leak. The standard requires that delete[] free the memory obtained from any array allocation made with new[].

If it's safe, doesn't that mean we can actually find the size of the array?

The standards don't put specific requirements on where and how the allocated size is stored. This could be discoverable as shown above, but different compilers/platforms could also use a completely different methodology. As such, it's not safe to rely on this technique to discover the size.

Reed Copsey

In C, many malloc implementations store the size of a heap block just before the pointer they return, and the code for free relies on this. The example allocator in K&R works this way.

But you should not rely on this always being there or always being in the same position.

If you want to know the array length, then I would suggest you create a class or struct that records the capacity alongside the actual array, and pass that around your program where you would previously just pass a char*.
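A minimal sketch of that suggestion might look like the following. CharBuffer and fill are hypothetical names chosen for illustration, and using std::unique_ptr for ownership is an assumption of this sketch rather than part of the advice above.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical wrapper: keeps the element count next to the allocation.
struct CharBuffer {
    std::unique_ptr<char[]> data;
    std::size_t capacity;

    explicit CharBuffer(std::size_t n)
        : data(new char[n]()), capacity(n) {}   // () zero-initializes the chars
};

// Callers take the wrapper instead of a bare char*, so the size travels with it.
std::size_t fill(CharBuffer& buf, char value)
{
    for (std::size_t i = 0; i < buf.capacity; ++i)
        buf.data[i] = value;
    return buf.capacity;
}
```

In practice std::vector<char> or std::string already bundle a size with the storage, so a hand-rolled wrapper like this is mostly useful for seeing the idea.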

weston

int main()
{
   char* x = new char[33];
   int* sz = (int*)x;
   sz--;
   sz--;
   sz--;
   sz--;
   int szn = *sz; //szn is 33 :)
}

This is undefined behavior, because you access a memory location that you didn't allocate.

is the compiler required to store information about the size of the array?

No.


If it's safe, doesn't that mean we can actually find the size of the array?

You do not do anything special in the 2nd code snippet, therefore it's safe. But there is no standard way to get the size of the array.

BЈовић
  • Not really helpful. I know it's not standard behavior, as I already noted. And if the compiler isn't required to store information, how can it free the whole array in the second snippet of code? – Luchian Grigore Jan 18 '12 at 19:44
  • It's not required to store information at that position, though some do. It's undefined because you can't rely on that information being at that position, and you could be accessing completely unrelated data. – matthias Jan 18 '12 at 19:48
  • @Luchian: the same way it can free a single char allocated with `new char`. The need to store the size is because it needs to call the right number of destructors. However, calling the destructor of `char` is a noop, so, following the as-if rule, the compiler can just not call anything, and thus nothing needs to be stored. – R. Martinho Fernandes Jan 18 '12 at 19:53
  • @LuchianGrigore You say "the compiler", but there are many compilers! And they all work differently! Even different versions of the same compiler could handle this differently. Don't try to make this work. Please. – Mr Lister Jan 18 '12 at 19:54
  • @LuchianGrigore So, your question is OS specific. The compiler stores no information about the array size – BЈовић Jan 18 '12 at 19:55
  • @MrLister **I am not trying to make this work!** What is wrong with having a purely theoretical conversation? I'm just interested in the concept. – Luchian Grigore Jan 18 '12 at 19:55
  • @LuchianGrigore Oh, well, in that case, purely theoretical, you say that it's a 64 bit compiler, right? But do you expect the allocation size of a memory chunk to fit in an int? Is the int 64 bits in size? It should be a size_t, not an int. Oh, and have you tried with other types? wchar_t for instance, does the memory location still say 33, or is it 66? – Mr Lister Jan 18 '12 at 20:01
  • @LuchianGrigore What concept? Of getting an array size through UB? :) – BЈовић Jan 18 '12 at 20:02
  • @VJovic exactly! Excuse me for trying to have some fun with C++! There are a lot of cool hacks you can achieve through UB. Do you think hackers care what is or isn't standard compliant when finding a vulnerability? – Luchian Grigore Jan 18 '12 at 20:03
  • Oh by the way, I don't want to be a spoilsport, but I believe that TPTB frown upon purely theoretical conversations on this site. It should be questions about problems only. – Mr Lister Jan 18 '12 at 20:05
  • @MrLister you of course have the option to flag the question, might as well use it. – Luchian Grigore Jan 18 '12 at 20:08
  • @LuchianGrigore Ah no. Nothing against a good fun. However, [you didn't get my point](http://catb.org/jargon/html/N/nasal-demons.html). – BЈовић Jan 18 '12 at 20:09

I am not sure that delete[] must know the size of the array when the array holds basic types (which don't require a destructor call). In Visual Studio compilers, the value is stored only for user-defined objects (in this case, delete[] must know the size of the array, as it must call their destructors).

Where in memory the size is stored is unspecified (in Visual Studio it is in the same place as in gcc).

http://www.parashift.com/c++-faq-lite/freestore-mgmt.html#faq-16.14
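One way to observe this bookkeeping without undefined behavior is to give a class its own operator new[] and compare the byte count the compiler requests with n * sizeof(T). The sketch below does that. Probe, last_request and array_overhead are hypothetical names, and the amount of extra space (if any) is implementation-specific, so no particular value should be expected.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type with a non-trivial destructor and class-level
// allocation functions that record the size the compiler asked for.
struct Probe {
    static std::size_t last_request;
    int value;
    ~Probe() {}

    static void* operator new[](std::size_t bytes)
    {
        last_request = bytes;
        if (void* p = std::malloc(bytes)) return p;
        throw std::bad_alloc();
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
};
std::size_t Probe::last_request = 0;

// Returns how many bytes beyond n * sizeof(Probe) were requested
// for new Probe[n]. Intended for n >= 1.
std::size_t array_overhead(std::size_t n)
{
    Probe* p = new Probe[n];
    std::size_t extra = Probe::last_request - n * sizeof(Probe);
    delete[] p;
    return extra;
}
```

On implementations that keep an element count in front of the array, array_overhead is typically non-zero for a type like Probe, while a type with a trivial destructor may show no overhead at all.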

Renan Greinert

There are two ways to destroy an array, depending on how it was created. In both cases the compiler is required to call a destructor for each element of the array, so the number of elements in the array must be known.

If the array is an automatic variable on the stack, the number of elements is known at compile time. The compiler can hard-code the number of elements in the code it emits for destroying the array.

If the array is dynamically allocated on the heap, there must be another mechanism for knowing the element count. That mechanism is not specified by the standard, nor is it exposed in any other fashion. I think that putting the count at an offset from the front of the array is a common implementation, but it's certainly not the only way, and the actual offset is just a private implementation detail.

Since the compiler must know how many elements are in the array, you'd think it would be possible for the standard to mandate a way of making that count available to programs. Unfortunately this is not possible because the count is only known at destruction time. Imagine that the standard included a function count_of that could access that hidden information:

MyClass array1[33];
MyClass * array2 = new MyClass[33];
cout << count_of(array1) << count_of(array2); // outputs 33 33
Foo(array1);
Foo(array2);
MyClass * not_array = new MyClass;
Foo(not_array);

void Foo(MyClass * ptr)
{
    for (int i = 0; i < count_of(ptr); ++i) // how can count_of work here?
    ...
}

Since the pointer passed to Foo has lost all its context, there's no consistent way for the compiler to know how many elements are in the array, or even if it's an array at all.
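For the automatic-array case the count really is available at compile time, and a restricted count_of can be sketched as a template that only accepts arrays by reference. This is an illustrative version, not the hypothetical standard function discussed above; count_automatic is a made-up helper for the example.

```cpp
#include <cstddef>

// Hypothetical count_of limited to real arrays: N is deduced from the type.
template <typename T, std::size_t N>
constexpr std::size_t count_of(T (&)[N])
{
    return N;
}

struct MyClass {};

std::size_t count_automatic()
{
    MyClass array1[33];
    return count_of(array1);   // the bound is part of array1's type
}

// MyClass* array2 = new MyClass[33];
// count_of(array2);           // does not compile: a pointer carries no bound
```

Since C++17 the standard library offers std::size for the same array case. Neither it nor this sketch accepts a plain pointer, which is exactly the limitation described above.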

Mark Ransom