Suspicious memory management implications

Question

So in an attempt to solidify my understanding of pointers and memory management in general in C, I decided to do a basic implementation of a linked list.

I'm very suspicious of certain things in my code though, and was wondering if I could get some clarification. First, the code:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

//Doubly-linked lists brah
typedef struct ll_int {
    struct ll_int* prev;
    int value;
    struct ll_int* next;
} ll_int_t;

ll_int_t* newNode(int value) {
    //Allocate the memory
    ll_int_t* new = (ll_int_t*) malloc(sizeof(ll_int_t));

    //If it was not successfully created, exit
    if (new == NULL) {
        fprintf(stderr, "Unable to allocate memory for new LL node\n");
        exit(EXIT_FAILURE);
    }

    //Set up the node
    new->value = value;
    new->prev = NULL;
    new->next = NULL;
    return new;
}

void addElement(ll_int_t* head, int value) {
    //Check if the head is in fact the end of the list
    if (head->next == NULL) {
        ll_int_t* new = newNode(value);
        head->next = new;
        new->prev = head;
    } else { 
        addElement(head->next, value);
    }
}

int lenList(ll_int_t* head) {
    int i = 1;
    ll_int_t* curr = head;
    while ((curr = curr->next) != NULL) i++;
    free(curr);
    return i;
}

void delHead(ll_int_t** list) {
    if (*list != NULL && lenList(*list) > 1) {
        ll_int_t* dead = *list;
        *list = (*list)->next;
        (*list)->prev = NULL;
        free(dead);
    }
}

bool delElementAt(ll_int_t** list, int pos) {
    if (pos == 0) {
        delHead(list);
        return true;
    } else {
        //TODO: Implement
    }
}

void printLL(ll_int_t** list) {
    //We could be clever and traverse back to the start if we're not
    //given the head of the list... but that's /effort/
    if ((*list)->prev != NULL) {
        printf("That is not the head of the Linked List!\n");
        return;
    }

    int i = 0;
    ll_int_t* curr = *list;
    do {
        printf("(%d): %d\n", i++, curr->value);
    } while((curr = curr->next) != NULL); //sneaky
    free(curr);
}

int main (int argc, char** argv) {
    ll_int_t* head = newNode(5);
    addElement(head, 10);
    printf("lenList: %d\n", lenList(head));
    addElement(head, 15);
    printf("lenList: %d\n", lenList(head));
    addElement(head, 20);
    printf("lenList: %d\n", lenList(head));
    printLL(&head);
    delElementAt(&head, 0);
    printLL(&head);
    return(EXIT_SUCCESS);
}

Let me pick out some sore points:

So initially, my addElement function looked like:

void addElement(ll_int_t* head, int value) {
    //Attempt to create the new node
    ll_int_t* new = newNode(value);

    //Check if the head is in fact the end of the list
    if (head->next == NULL) {
        head->next = new;
        new->prev = head;
    } else { 
        ll_int_t* curr = head->next;
        while (curr->next != NULL) curr = curr->next;
        curr->next = new;
        new->prev = curr;
    }
}

But I was worried about that ll_int_t* curr because it felt like an unnecessary copy. So I changed it to use a tail recursion:

void addElement(ll_int_t* head, int value) {
    //Check if the head is in fact the end of the list
    if (head->next == NULL) {
        ll_int_t* new = newNode(value);
        head->next = new;
        new->prev = head;
    } else { 
        addElement(head->next, value);
    }
}

Which made me feel happier, but I guess my question is, have I actually achieved anything here?

Furthermore, are my free()s in lenList and printLL actually necessary, or does the function scope take care of this for me?

You achieved a waste of stack-space, possible stack overflow for huge lists and hard-to-debug code. In real-world C, always prefer the iterative solution over the recursive one. — , Jul 13 '17 at 11:34
So this is something I was worried about; did my previous version have any implications I may be unaware of? — Thanasi Poulos, Jul 13 '17 at 11:37
To answer your question about the `free()`s. Think about it this way. You iterate the pointer `curr` until it's `NULL`. So you actually call `free(NULL)` which according to 7.20.3.2 results in no action at all. — MrPaulch, Jul 13 '17 at 11:37
You're also free'ing memory that you're printing, meaning you just destroyed the link list every time you print it. Do you mean to do that? — CDahn, Jul 13 '17 at 11:38
@CDahn right, I was trying to avoid a memory leak, but how might I do that without the tail recursion? Also, if I was destroying the list, how could I print it twice? — Thanasi Poulos, Jul 13 '17 at 11:39
recursion is not permitted in many industry standards. As @Felix Palmen mentioned try to find another solution without recursion (just iterate through the list). — 0___________, Jul 13 '17 at 11:45
@PeterJ, what? What industry standard says not to use recursion? Tail recursion is a widely accepted methodology, and compilers specifically optimize for it... — CDahn, Jul 13 '17 at 11:47
@capncoolio, sorry, I deleted my comment, I was incorrect, no memory leak there. — CDahn, Jul 13 '17 at 11:48
@CDahn Misra, NASA and many others - why? Read their documents. You will learn why. — 0___________, Jul 13 '17 at 11:50
@CDahn can you explain briefly? To me it seems like curr is a pointer to the same place as head->next... Which I suppose means that by assigning it to new->prev all I'm really doing is pointing it to the same place as curr ended up pointing to, (after the loop) but then it falls out of scope or something? — Thanasi Poulos, Jul 13 '17 at 11:51
@CDahn tail recursion is trivially transformable to iteration, and that's the optimization a compiler would do. So why rely on it instead of avoiding recursion in the first place? — , Jul 13 '17 at 11:53
In printLL, you don't need to free the memory because you're not allocating it there. `curr` is just an alias to the element you're pointing to, so you don't need to deallocate it, scope will do that for you. The list is printing twice, probably by accident. — CDahn, Jul 13 '17 at 11:56
@CDahn okay, yeah I'm with you. It's printing twice because I called `printLL` twice though — Thanasi Poulos, Jul 13 '17 at 11:57
@FelixPalmen, at the risk of sounding like a snob, there's lots[1](https://stackoverflow.com/questions/2185532/why-should-recursion-be-preferred-over-iteration) of[2](https://stackoverflow.com/questions/660337/recursion-vs-loops) reasons[3](https://softwareengineering.stackexchange.com/questions/182314/recursion-or-while-loops) to use recursion over loops, typically when veteran developers understand the tradeoff that code readability leads to better long term maintenance. A Google search will enlighten. — CDahn, Jul 13 '17 at 12:04
@PeterJ, with respect to car manufacturers (MISRA) and rocket scientists (NASA), academia is perfectly okay with recursion. Specialized industries may be creating hardware that optimize for certain types of algorithms and code constructs, but that hardly sets the bar for the rest of software engineering. Sorry. — CDahn, Jul 13 '17 at 12:09
@CDahn-- you must be sure that your recursive algorithm is tail-recursive to start with, and there is always the possibility that future code modifications could break the tail-recursion, possibly leading to problems. — ad absurdum, Jul 13 '17 at 12:14
@DavidBowling, I agree, writing software is sometimes a challenging profession that requires a high degree of talent and expertise to get correct! If it were otherwise, any schmuck with a laptop could do it! — CDahn, Jul 13 '17 at 12:16
@CDahn-- sadly, production code should probably be written so that any schmuck with a laptop can maintain it ;) — ad absurdum, Jul 13 '17 at 12:17
@DavidBowling, haha, well said. I guess that's why we have such bloated, awful software products now-a-days. Long gone are the days when I only needed [1 MB](http://astroblog.cosmobc.com/2010/03/27/did-you-know-the-space-shuttle-runs-on-only-one-megabyte-of-ram/) of RAM to get it done. — CDahn, Jul 13 '17 at 12:21
@CDahn with respect - but if someone says _"writing software is sometimes a challenging profession that requires a high degree of talent and expertise to get correct"_ he is on the highway to get into a troubles. Smart programmer chooses safer and more effective way, than better looking or requiring less lines. Stack overflow problems are very difficult to trace, test & debug. System with limited resources (like embedded ones) are only one of the examples. Misra standards are used not only in the automotive industry but in many others which have adopted it as well established and tested. — 0___________, Jul 13 '17 at 12:39
Tail recursion (to not be stack expensive) relays on the compiler, and is not under 100% control of the programmer. — 0___________, Jul 13 '17 at 12:44
@CDahn I know my ways around the industry and I appreciate nice algorithms, functional programming style (free of side-effects) and the like. Still, for these kind of things, C is the wrong language. A well-written iterative algorithm in C is safe, easy to read and maintain and of course well-understood by C programmers, because it's the idiomatic way to do things. — , Jul 13 '17 at 12:58
@PeterJ, I guess my final point is that yes, I agree, embedded systems are specialized cases that require specialized design principles. The OP wasn't asking about embedded systems, and future beginner readers may not understand why you're saying "recursion is bad." I agree, advanced software development is hard to get right, decades of software engineering research can attest to this, and even MISRA can't account [for every possible problem](http://www.safetyresearch.net/blog/articles/toyota-unintended-acceleration-and-big-bowl-%E2%80%9Cspaghetti%E2%80%9D-code). — CDahn, Jul 13 '17 at 13:35
@FelixPalmen, At the end of the day, some of us still remember that software development used to have a component of art to it, and understand that sometimes form is just as important as function. If recursion helps the form without sacrificing function, it should be encouraged. I remember thinking the same as you when I was first learning to program in college. Many years later, I realize how close minded and short sighted that was. I would encourage you to open your mind and allow the bright light of Code enter your soul. — CDahn, Jul 13 '17 at 13:39
I recommend that you avoid using variable names that are C++ keywords like `new`. This will avoid headaches if ever you have to move to C++. — Paul Floyd, Jul 13 '17 at 14:42
@CDahn I don't *remember* the "art" of programming, I practice it a lot. Still nothing you say could convince me recursion is something to consider **in C** if you don't have to. With tail-recursion, you don't have to. C just isn't designed for this. There are other languages that are. It's the wrong approach trying to do "*something clever*" just for the sake of it. — , Jul 13 '17 at 16:44
In the future, if you want the community to review some code that you think is working, you'll have better luck on http://codereview.stackexchange.com — user229044, Jul 16 '17 at 16:38

score 4 · Answer 1 · answered Jul 13 '17 at 11:58

If you haven't used the valgrind tool before, this is a great opportunity to learn how to debug memory leaks with confidence. Go install valgrind. It will tell you if you have memory leaks.

[~]$ valgrind --leak-check=full ./test
==12880== Memcheck, a memory error detector
==12880== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==12880== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==12880== Command: ./test
==12880== 
lenList: 2
lenList: 3
lenList: 4
(0): 5
(1): 10
(2): 15
(3): 20
(0): 10
(1): 15
(2): 20
==12880== 
==12880== HEAP SUMMARY:
==12880==     in use at exit: 72 bytes in 3 blocks
==12880==   total heap usage: 4 allocs, 1 frees, 96 bytes allocated
==12880== 
==12880== 72 (24 direct, 48 indirect) bytes in 1 blocks are definitely lost in loss record 3 of 3
==12880==    at 0x4C27BE3: malloc (vg_replace_malloc.c:299)
==12880==    by 0x4006F1: newNode (test.c:14)
==12880==    by 0x400764: addElement (test.c:31)
==12880==    by 0x40094B: main (test.c:89)
==12880== 
==12880== LEAK SUMMARY:
==12880==    definitely lost: 24 bytes in 1 blocks
==12880==    indirectly lost: 48 bytes in 2 blocks
==12880==      possibly lost: 0 bytes in 0 blocks
==12880==    still reachable: 0 bytes in 0 blocks
==12880==         suppressed: 0 bytes in 0 blocks
==12880== 
==12880== For counts of detected and suppressed errors, rerun with: -v
==12880== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

I've seen this before but it's always confused me. I'll have to learn how to use it I guess! — Thanasi Poulos, Jul 13 '17 at 12:08
@capncoolio, yes, `valgrind` is an absolutely essential tool for C development. Never release anything written in C without running it through `valgrind` first to determine if you have any accidental leaks. — CDahn, Jul 13 '17 at 12:10
`valgrind` is telling me I have a leak in newNode; I presume that's due to the fact I'm `malloc`ing inside the function and passing a pointer out; but how can that be helped? — Thanasi Poulos, Jul 13 '17 at 12:23
@capncoolio, everything that is allocated must be deallocated. So you need a delHead for each addElement that you do. — CDahn, Jul 13 '17 at 12:31
Yeah, I just worked that out as you commented this. Thanks for the help! — Thanasi Poulos, Jul 13 '17 at 12:31

score 2 · Accepted Answer · answered Jul 13 '17 at 12:44

Right now your biggest problem isn't leaks, it's free-spam. Only after that is fixed will you have leaks, and then you can fix that in the correct location.

There are two places in your code you should not being destroying nodes in your linked list, and one place that you should, but is inconveniently missing. The former two can be better detected by simply using const-ness where appropriate. For example:

int lenList(ll_int_t* head) {
    int i = 1;
    ll_int_t* curr = head;
    while ((curr = curr->next) != NULL) i++;
    free(curr); // ????????
    return i;
}

This is a simple (albeit inefficient) length computation function. The list itself should not be modified in any way. Therefore, the input pointer to be pointer-to-const.

int lenList(const ll_int_t *head) {
    int i = 1;
    ll_int_t* curr = head; // Warning: const-ness disqualified
    while ((curr = curr->next) != NULL) i++;
    free(curr);
    return i;
}

Now when you try to compile this you'll get at least a warning telling you the const-ness of head has been discarded when assigning to a non-const pointer (curr). Compiled while treating warnings as errors (which you should always do) will thus red-flag this. And fyi, this function can be significantly simplified to just this:

int lenList(const ll_int_t *head)
{
    int i = 0;
    for (; head; ++i, head = head->next);
    return i;
}

Next, the printLL function, which is just more of the same.

void printLL(ll_int_t **list) {
    //We could be clever and traverse back to the start if we're not
    //given the head of the list... but that's /effort/
    if ((*list)->prev != NULL) {
        printf("That is not the head of the Linked List!\n");
        return;
    }

    int i = 0;
    ll_int_t* curr = *list;
    do {
        printf("(%d): %d\n", i++, curr->value);
    } while((curr = curr->next) != NULL); //sneaky
    free(curr); // ????????????
}

This is the identical problem as before, and everything mentioned there applies here, including simplification. Further, this function needlessly passes the head pointer by address, which is ultimately pointless. Just like before, a simple pointer-to-const is all that is needed, and it can be freely modified without affecting the caller:

void printLL(const ll_int_t *list)
{
    if (list && list->prev) {
        printf("That is not the head of the Linked List!\n");
        return;
    }

    for (int i=0; list; list = list->next)
        printf("(%d): %d\n", i++, list->value);
}

With those out of the way, you need a freeList function that, given a head by address to a properly construct list, will properly destroy it. One way this can be done is:

void freeList(ll_int_t **head)
{
    if ((*head) && (*head)->prev) {
        printf("That is not the head of the Linked List!\n");
        return;
    }

    while (*head)
    {
        void *p = *head;
        *head = (*head)->next;
        free(p);
    }
}

This will come in handy when you actually want to destroy an entire list.

Next, the implementation of delHead vs delElementAt. You're doing them backwards. The former should be implemented with the latter, not the other way around. For example:

bool delElementAt(ll_int_t** list, int pos)
{
    bool res = false;
    ll_int_t* prev = NULL;

    while (*list && pos--)
    {
        prev = *list;
        list = &(*list)->next;
    }

    if (*list)
    {
        void *p = *list;
        *list = (*list)->next;
        (*list)->prev = prev;
        free(p);
        res = true;
    }
    return res;
}

void delHead(ll_int_t** list)
{
    delElementAt(list, 0);
}

Finally, the addElement function, which appears to be just appending to the list. There are better ways to manage this if you contain your linked list in its own structure, which I leave to you as an exercise. Regardless, this is a place where passing the head pointer by address is proper, as you can then initialize an empty (and thus null) list properly:

void addElement(ll_int_t **head, int value)
{
    ll_int_t *prev = NULL;
    while (*head)
    {
        prev = *head;
        head = &(*head)->next;
    }

    *head = newNode(value);
    (*head)->prev = prev;
}

Putting It Together

The final code appears below, including the addendum to free the list on exit. This should pass valgrind leak-free:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

//Doubly-linked lists brah
typedef struct ll_int {
    struct ll_int* prev;
    int value;
    struct ll_int* next;
} ll_int_t;

ll_int_t* newNode(int value)
{
    //Allocate the memory
    ll_int_t* new = malloc(sizeof *new);

    //If it was not successfully created, exit
    if (new == NULL) {
        fprintf(stderr, "Unable to allocate memory for new LL node\n");
        exit(EXIT_FAILURE);
    }

    //Set up the node
    new->value = value;
    new->prev = NULL;
    new->next = NULL;
    return new;
}


void addElement(ll_int_t **head, int value)
{
    ll_int_t *prev = NULL;
    while (*head)
    {
        prev = *head;
        head = &(*head)->next;
    }

    *head = newNode(value);
    (*head)->prev = prev;
}


int lenList(const ll_int_t *head)
{
    int i = 0;
    for (; head; ++i, head = head->next);
    return i;
}


bool delElementAt(ll_int_t** list, int pos)
{
    bool res = false;
    ll_int_t* prev = NULL;

    while (*list && pos--)
    {
        prev = *list;
        list = &(*list)->next;
    }

    if (*list)
    {
        void *p = *list;
        *list = (*list)->next;
        (*list)->prev = prev;
        free(p);
        res = true;
    }
    return res;
}


void delHead(ll_int_t** list)
{
    delElementAt(list, 0);
}


void freeList(ll_int_t **head)
{
    if ((*head) && (*head)->prev) {
        printf("That is not the head of the Linked List!\n");
        return;
    }

    while (*head)
    {
        void *p = *head;
        *head = (*head)->next;
        free(p);
    }
}


void printLL(const ll_int_t* list)
{
    if (list && list->prev) {
        printf("That is not the head of the Linked List!\n");
        return;
    }

    for (int i=0; list; list = list->next)
        printf("(%d): %d\t%8p : %8p : %8p\n", i++, list->value, list->prev, list, list->next);
}

int main (int argc, char** argv)
{
    ll_int_t* head = NULL;

    addElement(&head, 5);
    addElement(&head, 10);
    printf("lenList: %d\n", lenList(head));

    addElement(&head, 15);
    printf("lenList: %d\n", lenList(head));

    addElement(&head, 20);
    printf("lenList: %d\n", lenList(head));

    printLL(head);
    delElementAt(&head, 1);

    printLL(head);
    delHead(&head);

    printLL(head);
    freeList(&head);

    return(EXIT_SUCCESS);
}

Output (pointer values vary, obviously)

lenList: 2
lenList: 3
lenList: 4
(0): 5       0x0 : 0x600060 : 0x600170
(1): 10 0x600060 : 0x600170 : 0x600180
(2): 15 0x600170 : 0x600180 : 0x600190
(3): 20 0x600180 : 0x600190 :      0x0
(0): 5       0x0 : 0x600060 : 0x600180
(1): 15 0x600060 : 0x600180 : 0x600190
(2): 20 0x600180 : 0x600190 :      0x0
(0): 15      0x0 : 0x600180 : 0x600190
(1): 20 0x600180 : 0x600190 :      0x0

Whoa, thanks so much! Pointer const-ness is something that has escaped me in the past, I'll have to pay a little more attention. In the meantime, I'm going to comb through this and learn as much as I can. It never ceases to amaze me how willing people are to help on this site :) — Thanasi Poulos, Jul 13 '17 at 12:59

Suspicious memory management implications

2 Answers2