1

This question has been bothering me for a while.

If I do int* a = new int[n], for example, I only have a pointer to the beginning of the array a, so how does C/C++ know about n? I know that if I want to pass this array to another function, I have to pass its length along with it, so I guess C/C++ does not really know how long the array is.

I know we can infer the end of a character array (char*) by looking for the NUL terminator. But is there a similar mechanism for other arrays, such as int? Besides, char is not only a character: it can also be treated as an integer type. So how does C++ know where such an array ends?

This question bothers me even more now that I am working with embedded Python (if you are not familiar with embedded Python, you may ignore this paragraph and just answer the questions above; I will still appreciate it). In Python there is a "ByteArray", and the only way to convert this "ByteArray" to C/C++ is to use PyString_AsString() to convert it to char*. But if this ByteArray has a 0 in it, then C/C++ would think that the char* array stops early. This is not the worst part. The worst part is, say I do:

char* arr = PyString_AsString(something);
void* pt = calloc(1, 1000); 

If the string happens to start with 0, then C/C++ will almost certainly wipe out everything in arr, since it thinks arr ends right after a NUL appears. It might then overwrite everything in arr when allocating a chunk of memory for pt.

Thank you very much for your time! I really appreciate it.

CuriousMind
  • 6
    Answered for c as [C programming : How does free know how much to free?](http://stackoverflow.com/q/1518711/2509), and that answer holds true in spirit for c++, though the fine detail may differ. – dmckee --- ex-moderator kitten Mar 11 '11 at 01:48
  • What's your question? I can't find it in all the rambling. – Gabe Mar 11 '11 at 01:49
  • 1
    FYI - `0` is not the null terminator. `\0` is. – Brian Roach Mar 11 '11 at 01:50
  • 1
    Only `*print*` (and similar functions) would think `arr` stops early when supplied with your char array--it's intrinsic to the algorithm dealing with C-style char arrays. The C/C++ runtime will not be so confused. See above comment. – Santa Mar 11 '11 at 01:56
  • 1
    It doesn't matter if the buffer pointed to by `arr` has a null character. That will have no effect whatsoever on the call to `calloc` (which takes two arguments), which certainly will not overwrite any part of the previously allocated buffer in any case. Terminating _strings_ with NULL is just a widely accepted convention in C, it has nothing to do with memory management. If you aren't passing `arr` to string functions, then it is perfectly fine to have elements equal to 0. – Alex Mar 11 '11 at 01:58
  • 1
    @Brian: No, it's '\0', which is the same as 0. – Benjamin Lindley Mar 11 '11 at 02:03
  • possible duplicate of [How does delete\[\] "know" the size of the operand array?](http://stackoverflow.com/questions/197675/how-does-delete-know-the-size-of-the-operand-array) – Greg Hewgill Mar 11 '11 at 02:09

3 Answers

7

C/C++ doesn't; it's the allocator (the little piece of code that implements malloc(), free(), etc.) that knows how long it is. C/C++ is welcome to wee all over itself, free of the constraints of having to worry about the length.

Also, see PyString_AsStringAndSize(), which returns the length along with the buffer.
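Since only the allocator knows the size, calling code has to carry the length itself. Here is a minimal sketch of the two usual options: passing the length next to the raw pointer, or using std::vector, which stores its own element count. The function names sum_raw, sum_vector and demo_sum are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A raw pointer carries no length, so the caller has to supply n.
int sum_raw(const int* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// A std::vector stores its own size, so no extra parameter is needed.
int sum_vector(const std::vector<int>& data) {
    int total = 0;
    for (int value : data) {
        total += value;
    }
    return total;
}

// Hypothetical demo: fill 1..n both ways and check that the sums agree.
int demo_sum(std::size_t n) {
    int* raw = new int[n];
    std::vector<int> vec(n);
    for (std::size_t i = 0; i < n; ++i) {
        raw[i] = static_cast<int>(i + 1);
        vec[i] = static_cast<int>(i + 1);
    }
    int from_raw = sum_raw(raw, n);   // the length travels separately
    int from_vec = sum_vector(vec);   // the length travels with the container
    delete[] raw;                     // new[] must be paired with delete[]
    assert(from_raw == from_vec);
    return from_raw;
}
```

Calling demo_sum(4) sums 1 through 4 both ways. Note that sum_raw has no way to check that n matches the allocation; getting it wrong is undefined behaviour.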

Ignacio Vazquez-Abrams
  • In other words, the language does not. A runtime for that language does. – Santa Mar 11 '11 at 01:57
  • "wee all over itself" That just made my day. This deserves a thumbs up. – Eric Pauley Mar 11 '11 at 02:51
  • That is why you should not mix new-delete and malloc-free pairs. – vahapt Mar 11 '11 at 06:38
  • @Santa: The runtime might not have to store it either, it could be that it goes directly to the OS for each allocation and deallocation. That's why we cannot ask the runtime for the size, it isn't required to know! – Bo Persson Mar 11 '11 at 16:31
4

Let's hit the disassembler! This is going to be different for C and C++. How free works in C is covered in another question, and here's how it works in C++:

struct T {
    ~T();
    int data;
};
void test(T* p)
{
    delete[] p;
}

And let's run the compiler to produce assembly. Here are the relevant bits, compiled for i386:

    movl    -4(%edi), %eax
    leal    (%edi,%eax,4), %esi
    cmpl    %esi, %edi
    je      L4
    .align 4,0x90
L8:
    subl    $4, %esi
    movl    %esi, (%esp)
    call    L__ZN1TD1Ev$stub
    cmpl    %esi, %edi
    jne     L8

You can see the important part: There is an integer stored before the start of p containing the length of p, and the code then loops over the p array, calling the destructor for each item in the array. It then calls delete, which is usually fairly boring because it just calls free (the C function). So you can see how C++ delete is expressed in terms of free.

Destructors and Exceptions: Based on the above assembly, you can notice that if the destructor for T threw an exception, then part of the p array would get the destructor called and the rest of the array would not. Destructors should never throw exceptions.

Caveat: This is only one possible way that your compiler and runtime can solve this problem. (Here, the destructor is called by compiler-generated code and delete is part of the runtime.) There is quite a bit of leeway in how these are implemented, and yours could be different. This also shows why you should always call the correct operator, delete[] or delete -- calling the wrong one will cause all sorts of trouble, such as stomping on memory and freeing invalid pointers.
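The destructor loop can also be observed without a disassembler. The sketch below counts destructor calls; the type Counted and the function destructors_run_by_array_delete are hypothetical names. It deliberately does not read the hidden count in front of the array, since where (and whether) that count is stored is implementation-specific.

```cpp
#include <cassert>
#include <cstddef>

// A type with a non-trivial destructor, so delete[] has per-element work to do.
struct Counted {
    static int destroyed;   // number of destructor calls so far
    int data = 0;
    ~Counted() { ++destroyed; }
};
int Counted::destroyed = 0;

// Allocates n objects, releases them with delete[], and reports how many
// destructors ran. Only the pointer is handed to delete[]; n is not.
int destructors_run_by_array_delete(std::size_t n) {
    Counted::destroyed = 0;
    Counted* p = new Counted[n];
    delete[] p;   // the implementation recovers the element count on its own
    assert(Counted::destroyed == static_cast<int>(n));
    return Counted::destroyed;
}
```

Calling destructors_run_by_array_delete(5) runs five destructors even though the call site of delete[] never mentions the number 5.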

About NUL terminators: NUL terminators are only a problem for code that receives the bare char* returned by PyString_AsString and then relies on strlen (or similar functions such as strdup) to figure out how long the string is. However, free doesn't care about NUL terminators; instead, it keeps track of the length from the original malloc call separately. For strlen, strdup, and the like this is not an option, because there is no portable way to get the size of a region of memory: malloc and free do not expose this functionality. Besides, you can pass these functions a pointer which is in the middle of a malloc block or somewhere else entirely.
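To illustrate the difference, here is a small sketch of a buffer with an embedded zero byte, measured once with strlen and once copied with an explicit length. The helper names are made up for illustration. A pointer-plus-length pair is the same shape of data that PyString_AsStringAndSize hands back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by NUL-terminated string functions: stops at the first zero byte.
std::size_t c_string_length(const char* buffer) {
    return std::strlen(buffer);
}

// Copy using an explicit length: embedded zero bytes are preserved.
std::string copy_with_length(const char* buffer, std::size_t length) {
    assert(buffer != nullptr);
    return std::string(buffer, length);
}
```

For a buffer such as {'a', 'b', '\0', 'c', 'd', '\0'}, c_string_length stops at the first zero byte, while copy_with_length(buffer, 5) keeps all five bytes, including the embedded zero. Memory allocation is unaffected either way.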

Dietrich Epp
0

C/C++ doesn't know the length of any array, so you can easily access an array out of bounds. C/C++ doesn't know the length of a char array either.

A char* can point to a string, but it is not the same thing as a string. Terminating a string with NUL is a convention of C/C++.

xjdrew