3

I know there are tons of other realloc questions and answers and I have read almost all of them, but I still couldn't manage to fix my problem.

I decided to stop trying when I accidentaly discovered a very strange behaviour of my code. I introduced a line to try something, but although I don't use the value of newElems in main, the line changes the behaviour.

When the line is commented, the code fails at first realloc. Including the line, the first realloc works. (it still crashes on the second one).

Any ideas on what might be happening?

int main(int argc, char** argv) {
    Pqueue q = pqueue_new(3);
    Node a = {.name = "a"}, b = {.name = "b"},
         c = {.name = "c"}, d = {.name = "d"};

    push(& q, & a, 3);
    // the next one is the strange line: as you can see, it doesn't modify q
    // but commenting it out produces different behaviour
    Pqueue_elem* newElems = realloc(q.elems, 4 * q.capacity * sizeof *newElems);
    push(& q, & b, 5);
    push(& q, & c, 4);

    char s[5];
    Node* n;
    for (int i = 1; i <= 65; ++i) {
        sprintf(s, "%d", i);
        n = malloc(sizeof *n);
        n->name = strdup(s);
        push(& q, n, i);
    }

    Node* current = NULL;
    while ((current = pop(& q))) {
        printf("%s ", current->name);
    }
    return 0;
}

and the push function:

void push(Pqueue* q, Node* item, int priority) {
    if (q->size >= q->capacity) {
        if (DEBUG)
            fprintf(stderr, "Reallocating bigger queue from capacity %d\n",
                    q->capacity);
        q->capacity *= 2;
        Pqueue_elem* newElems = realloc(q->elems,
                                        q->capacity * sizeof *newElems);
        check(newElems, "a bigger elems array");
        q->elems = newElems;
    }

    // append at the end, then find its correct place and move it there
    int idx = ++q->size, p;
    while ((p = PARENT(idx)) && priority > q->elems[p].priority) {
        q->elems[idx] = q->elems[p];
        idx = p;
    }
    // after exiting the while, idx is at the right place for the element
    q->elems[idx].data = item;
    q->elems[idx].priority = priority;
}

The pqueue_new function:

Pqueue pqueue_new(unsigned int size) {
    if (size < 4)
        size = 4;
    Pqueue* q = malloc(sizeof *q);
    check(q, "a new queue.");
    q->capacity = size;
    q->elems = malloc(q->capacity * sizeof *(q->elems));
    check(q->elems, "queue's elements");

    return *q;
}
Ciprian Tomoiagă
  • 3,773
  • 4
  • 41
  • 65
  • The "irrelevant part" that you removed may very well be the most relevant: I assume this is where you write stuff into `newElems`, right? – Sergey Kalinichenko Apr 25 '14 at 14:16
  • Yes, and no. I don't use newElements outside that if, only q->elems (which has been assigned to newElems). I will include it – Ciprian Tomoiagă Apr 25 '14 at 14:20
  • 1
    @CiprianTomoiaga It does not matter if you write to the block pointed to by `newElems` or `q->elems`, it's the same block. The error is likely to be there. Use valdrind to find it faster. – Sergey Kalinichenko Apr 25 '14 at 14:24
  • Why dereference `q` when you return from `pqueue_new`, and then almost everywhere use the address of it `main()`? It'd be easier to stick to `*q` in `main`, potentially avoiding awkward bugs. –  Apr 25 '14 at 14:30
  • @EliasVanOotegem Because realloc returns a pointer to the new memory. It doesn't modify the passed pointer. As demonstrated [here](http://www.cplusplus.com/reference/cstdlib/realloc/?kw=realloc) I multiply `q->capacity` by 2 in `push`, right before `realloc` call and use that new value. No, it is recommended to check for NULL before assigning to `q.elems` and only assign if it's valid. Right? – Ciprian Tomoiagă Apr 25 '14 at 14:32
  • Are you sure your `q->elems` are correctly initialized? I see some assignment loop in `push`, but I can't tell what it exactly is assigning to. And `check` could assign, but I can't tell that either. Otherwise, `q->elems` are only allocated and not assigned, which I can imagine lead to crashes. –  Apr 25 '14 at 14:35
  • @CiprianTomoiaga: I wasn't thinking when I posted my first comment. `realloc` does indeed point to the _new_ location in memory. That's why passing the unchanged pointer (ie: possibly invalid memory address) after the `realloc` call to the `push` function is causing trouble. Checking for `NULL` should be the _norm_, not merely _recommended_. If you have a NULL pointer, you have a serious problem, and shouldn't try to blunder along. – Elias Van Ootegem Apr 25 '14 at 14:35
  • @Evert: returning a pointer from `pqueue_new` is one option, passing it a pointer to a stack variable is another way. The bonus of the latter is that it still allows you to use stack memory, which is faster – Elias Van Ootegem Apr 25 '14 at 14:53
  • Your `pqueue_new()` function leaks memory. – EOF Apr 25 '14 at 15:58
  • @EliasVanOotegem What do you mean "it is faster", referring to stack memory? Should I ask a new question for that? – Ciprian Tomoiagă Apr 25 '14 at 19:27
  • @CiprianTomoiaga: Stack memory is, of course, faster than the heap. I've added an example of both approaches (Returning a pointer, or passing a stack pointer) the second approach (passing a pointer) is faster, but less flexible. PS: EOF is -sort of- right: `pqueue_new` possibly leaks memory. – Elias Van Ootegem Apr 27 '14 at 09:51

2 Answers2

2

realloc will change the amount of memory that is allocated, if needed. It is also free to move the data to another place in memory if that's more efficient (avoiding memory fragmentation).
The function, then, returns a new pointer to the new location in memory where your data is hiding. You're calling realloc, and allocating (probably) four times as much memory as before, so it's very likely that that allocated memory is situated elsewhere in memory.

In your comment, you said realloc works like free + malloc. Well, in some cases it can behave similarly, however: realloc and free are different functions, that do different tasks. Both are functions that manage the dynamic memory, so yes, obviously there are similarities, and in the case of realloc, sometimes they can seem to be doing the same thing, however: As I explained here, realloc and free are fundamentally different functions

However, by not assigning the return value of realloc to q.elems, you're left with a pointer to a memory address that is no longer valid. The rest of your program can, and probably does, exhibit signs of undefined behaviour, then.

Unless you show some more code, I suspect this will take care of the problem:

//change:
Pqueue_elem* newElems = realloc(q.elems, 4 * q.capacity * sizeof *newElems);
//to
q.elems = realloc(q.elems, 4 * q.capacity * sizeof *newElems);

Or better yet, check for NULL pointers:

Pqueue_elem* newElems = realloc(q.elems, 4 * q.capacity * sizeof *newElems);
if (newElems == NULL)
    exit( EXIT_FAILURE );// + fprintf(stderr, "Fatal error...");
q.elems = newElems;//<-- assign new pointer!

Looking at your pqueue_new function, I would suggest a different approach. Have it return the pointer to Pqueue. You're working with a piece of dynamic memory, treat it accordingly, and have your code reflect that all the way through:

Pqueue * pqueue_new(size_t size)
{//size_t makes more sense
    if (size < 4)
        size = 4;
    Pqueue* q = malloc(sizeof *q);
    check(q, "a new queue.");
    q->capacity = size;
    q->elems = malloc(q->capacity * sizeof *(q->elems));
    check(q->elems, "queue's elements");

    return q;
}

Alternatively, pass the function a pointer to a stack variable:

void pqueue_new(Pqueue *q, size_t size)
{
    if (q == NULL)
    {
        fprintf(stderr, "pqueue_new does not do NULL pointers, I'm not Chuck Norris");
        return;//or exit
    }
    if (size < 4)
        size = 4;
    check(q, "a new queue.");
    q->capacity = size;
    q->elems = malloc(q->capacity * sizeof *(q->elems));
    check(q->elems, "queue's elements");
}
//call like so:
int main ( void )
{
    Pqueue q;
    pqueue_new(&q, 3);
}

Those would be the more common approaches.

Community
  • 1
  • 1
Elias Van Ootegem
  • 74,482
  • 9
  • 111
  • 149
  • thank you! Ouh, so the call to `realloc` in main basically `free`s the q.elems (by moving it to a new zone, pointed at by `newElems`). I will use that – Ciprian Tomoiagă Apr 25 '14 at 14:35
  • @CiprianTomoiaga: sort of. It really depends, `realloc`, `free`, `malloc` and `calloc` don't just blindly allocate memory. They map & track where what amount of memory is allocated, and manage the memory so it doesn't become too fragmented. the allocated blocks can be padded and, if you call `realloc` moved to another section of the heap, if that means the memory is managed more efficiently. My guess would be that this is what's happened in your case – Elias Van Ootegem Apr 25 '14 at 14:38
  • Depending on the library/OS used, no, realloc will *not* free q.elems, and you should not have to reassign `q.elems = newElems`. In case of Linux, please read the [manual page](http://linux.die.net/man/3/realloc) ("The realloc() function changes the size of the memory block pointed to by ptr to size bytes. The contents will be unchanged in the range from the start of the region up to the minimum of the old and new sizes.") –  Apr 25 '14 at 14:39
  • 1
    @Evert: The question wasn't if `realloc` can perform the task of `free`, but how it works internally. if the pointer points to data @ `0x00123`, that data can, after a call to `realloc`, be moved to `0x00A44`, that's all I was saying. – Elias Van Ootegem Apr 25 '14 at 14:41
  • Sorry, my bad: the return value can indeed point to a different area than the input pointer. –  Apr 25 '14 at 14:44
0

Thank you all for the suggestions! I wouldn't have solved it without them,

The strange behaviour was caused by an off by one error. I was reallocating the queue only when q->size >= q->capacity, but since q was indexed from 0, it meant that before realloc I was writing in a forbidden location (q->elems[q->size]), which messed everything up.

Ciprian Tomoiagă
  • 3,773
  • 4
  • 41
  • 65