4

I'm experimenting with malloc & realloc and came up with some code for the following problem:

I want to create a string of unknown size, without setting any limit. I could ask the user for the number of chars, but I would rather resize the string as the user types each character.

So I tried to do this with malloc + realloc. The idea was that every time the user enters a new char, I use realloc to request one more byte of memory to hold it.

While trying to implement this, I made a mistake and ended up with the following:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    /* this simulates the source of the chars... */
    /* in reality I would do getch or getchar in the while loop below... */

    char source[10];
    int i, j;
    for (i = 0, j = 65; i < 10; i++, j++) {
        source[i] = j;
    }

    /* relevant code starts here */

    char *str = malloc(2 * sizeof(char)); /* space for 1 char + '\0' */
    int current_size = 1;

    i = 0;
    while (i < 10) {
        char temp = source[i];
        str[current_size-1] = temp;
        str[current_size] = '\0';
        current_size++;
        /* strlen returns size_t, so print it with %zu */
        printf("new str = '%s' | len = %zu\n", str, strlen(str));
        i++;
    }

    printf("\nstr final = %s\n", str);

    return 0;

}

Note that the realloc part is not implemented yet.

I compiled and executed this code and got the following output

new str = 'A' | len = 1
new str = 'AB' | len = 2
new str = 'ABC' | len = 3
new str = 'ABCD' | len = 4
new str = 'ABCDE' | len = 5
new str = 'ABCDEF' | len = 6
new str = 'ABCDEFG' | len = 7
new str = 'ABCDEFGH' | len = 8
new str = 'ABCDEFGHI' | len = 9
new str = 'ABCDEFGHIJ' | len = 10

I found these results weird, because I expected the program to crash: str has space for 2 chars, and the code is adding more than 2 chars to the str without requesting more memory. From my understanding, this means that I'm writing in memory which I don't own, so it should give a runtime error.

So... Why does this work?

(The compiler is GCC 4.3.4.)

Thanks in advance.

EDIT: One of the commenters suggested that a call to free() could lead to the error being signaled. I tried calling free() with the above code and no error resulted. However, after adding more items to the source array and also calling free(), I got the following error:

*** glibc detected *** ./prog: free(): invalid next size (fast): 0x09d67008 ***

MyName
  • 2
    `sizeof(char)` is 1. You don't need to spell that out. – Kerrek SB May 16 '12 at 07:39
  • 3
    By the way, when implementing stuff like this, it's often wise to start with a rather larger buffer (say, 256 bytes or so) since `malloc(1)` is not very efficient for most allocators. Also, it's commonly worth avoiding calling `realloc()` too often, so the strategy of *doubling* the buffer size on each reallocation is often used. Of course, your program probably won't be performance-critical, but I wanted to point this out. – unwind May 16 '12 at 07:54

5 Answers

7

Since you're writing past the allocated memory, your code has undefined behaviour.

The fact that the code happened to not crash once (or even many times) does not change that.

Undefined behaviour doesn't mean that the code has to crash. In your case, there happens to be some memory immediately following str, which you're overwriting. The actual effects of overwriting that memory are not known (you could be changing the value of some other variable, corrupting the heap, launching a nuclear strike etc).

NPE
  • +1 Thanks for the Undefined behaviour link. – MyName May 16 '12 at 08:18
  • I'm marking this reply as the correct answer, because it was the first to mention that the behaviour is undefined and what the consequences are. – MyName May 17 '12 at 06:29
4

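It seems that malloc rounds each request up to a bucket size and keeps bookkeeping information at the block's border. So when you allocate 2 bytes with "char *str = malloc(2 * sizeof(char))", the memory actually set aside is probably no less than 16 bytes, which is why you can add a few more items before the program fails. The listing below is a simple bucket allocator that illustrates this kind of rounding; note that it is kernel-style code (it calls printk, panic and get_free_page), not the malloc that glibc 2.14 ships for user programs.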

struct _bucket_dir bucket_dir[] = {
    { 16,   (struct bucket_desc *) 0},
    { 32,   (struct bucket_desc *) 0},
    { 64,   (struct bucket_desc *) 0},
    { 128,  (struct bucket_desc *) 0},
    { 256,  (struct bucket_desc *) 0},
    { 512,  (struct bucket_desc *) 0},
    { 1024, (struct bucket_desc *) 0},
    { 2048, (struct bucket_desc *) 0},
    { 4096, (struct bucket_desc *) 0},
    { 0,    (struct bucket_desc *) 0}};   /* End of list marker */


void *malloc(unsigned int len)
{
    struct _bucket_dir  *bdir;
    struct bucket_desc  *bdesc;
    void            *retval;

    /*
     * First we search the bucket_dir to find the right bucket change
     * for this request.
     */
    for (bdir = bucket_dir; bdir->size; bdir++)
        if (bdir->size >= len)
            break;
    if (!bdir->size) {
        printk("malloc called with impossibly large argument (%d)\n",
            len);
        panic("malloc: bad arg");
    }
    /*
     * Now we search for a bucket descriptor which has free space
     */
    cli();  /* Avoid race conditions */
    for (bdesc = bdir->chain; bdesc; bdesc = bdesc->next) 
        if (bdesc->freeptr)
            break;
    /*
     * If we didn't find a bucket with free space, then we'll 
     * allocate a new one.
     */
    if (!bdesc) {
        char        *cp;
        int     i;

        if (!free_bucket_desc)  
            init_bucket_desc();
        bdesc = free_bucket_desc;
        free_bucket_desc = bdesc->next;
        bdesc->refcnt = 0;
        bdesc->bucket_size = bdir->size;
        bdesc->page = bdesc->freeptr = (void *) cp = get_free_page();
        if (!cp)
            panic("Out of memory in kernel malloc()");
        /* Set up the chain of free objects */
        for (i=PAGE_SIZE/bdir->size; i > 1; i--) {
            *((char **) cp) = cp + bdir->size;
            cp += bdir->size;
        }
        *((char **) cp) = 0;
        bdesc->next = bdir->chain; /* OK, link it in! */
        bdir->chain = bdesc;
    }
    retval = (void *) bdesc->freeptr;
    bdesc->freeptr = *((void **) retval);
    bdesc->refcnt++;
    sti();  /* OK, we're safe again */
    return(retval);
}
vincent
2

Besides the fact that you're triggering undefined behaviour, which includes but does not mandate "crash", I think your application actually does own the memory you're writing to.

On modern operating systems memory is handled in pages, chunks of memory far larger than two bytes. AFAIK malloc asks the OS for full pages and divides them internally as needed. (Note: this is implementation dependent, but I think at least glibc operates this way.) So the OS lets you write to the memory, because technically it belongs to your process. Since malloc hands out parts of a page at each request, you might overwrite another variable on your heap, or you might write beyond the bounds into memory that, from malloc's point of view, has not been handed out yet.

I expect a crash only when you try to write to a page that hasn't been allocated from the OS yet or is marked as read only.

user1252434
1

[Putting aside the fact that the behavior is undefined for such an operation]

The integrity of the heap is often checked when calling realloc or free, not at each write. You probably didn't overwrite enough to get a crash.

Note that you didn't call free at the end; if you did, you would probably get a crash.

MByD
  • +1, interesting. I did an attempt to call free with the above code. No error. I changed the source array size from 10 to 20, and also called free. ERROR! Editing the original post to reflect this. – MyName May 16 '12 at 08:08
0

To add to the previous answer, str does not really "have space for 2"; it is just a pointer into memory. Somewhere, malloc remembers that it gave out space for 2 characters, but that is malloc's internal bookkeeping.

You can try the following little experiment to see how this works:

Create another pointer just behind the first one.

char *str2 = str + 5;

/* or you could simply malloc another */

char *str2 = malloc(2);

printf("str=%p, str2=%p\n", (void *)str, (void *)str2);

/* to eyeball the pointers actually received
and note the difference in the two pointers. 
You will need to raise source length to at least
that much to see the results below
*/

and introduce another printf in the loop after the first one:

printf("new str2 = '%s' | len = %zu\n", str2, strlen(str2));

Sooner or later str2 will also begin to show the same letters.

HTH

RThomas
Dinesh