Improve my solution to basic C linked list management functions

Question

I would appreciate some help relative to my code solution, which deals with linked list management in C. I'll already declare the only strange thing with my request: I am writing a C++ file, but I am actually mostly leveraging C resources (malloc(), free(), etc.); that said, given the basic code I provide, I am confident no one will have trouble with that.

I want to write a function to add elements to the end of the list and one to delete elements from it, that work in any edge case. Given my desire, the removal function was the one that I struggled the most with, but also the one that made me realize how little I am understanding pointers.

I will now share the code I produced, that should be working fine, but:

It can surely be greatly improved both in terms of clarity and performance
I think that showing it to the community will highlight many of the flaws present in my solution

// The plan is to create a linked list and to be able to add and delete its elements

#include <iostream>
using namespace std; // I can write output lines as cout << "Hi!", rather than std::cout < "Hi!"
#include <cstdlib> // needed for malloc() in C++

struct node {
    int data;
    node* nextPtr; //"struct node* nextPtr;" : This would be the syntax for plain old C: you always have to type the "struct" keyword
};

node* createElement(int data) {
    node* newElemPtr = (node*)malloc(sizeof(node)); // the "(node*)" cast is required by C++, and is not used in C
    newElemPtr->data = data;
    newElemPtr->nextPtr = NULL;
    return newElemPtr;
}

void appendElement(int data, node** head) { // Adds a new node at the end of the list
    // I pass as argument a pointer to pointer (double pointer) to node, so that I can edit the head node
    // if the list is empty, without having to return a new node pointer as head: my function indeed features
    // "void" in its signature
    node* elemPtr = NULL;
    elemPtr = createElement(data); // elemPtr is a pointer to the new node

    if (*head == NULL) {
        *head = elemPtr;
    }
    else {
        node* currPtr = *head; // currPtr is the temporary variable that visits each node of the linked list
        while (currPtr->nextPtr != NULL)
            currPtr = currPtr->nextPtr;
        currPtr->nextPtr = elemPtr; // Set last element's nextPtr to "elem", i.e., a pointer to the new element
    }
};

void removeElement(int data, node** head) { // Remove all the nodes whose data content matches the "data" argument
    int presence_flag = 0; // Flag used to check whether the required data is present at all in the linked list
    if (*head == NULL) {
        return;
    }
    else {
        node* currPtr = *head;
        node* prevPtr = *head;
        while (currPtr != NULL) {
            // This is the case in which I find a node to delete (it matches the "data" query), and it is not the first of the list
            if (data == currPtr->data && currPtr != *head) {
                prevPtr->nextPtr = currPtr->nextPtr; // Link the node ahead of the one to delete with the one behind
                free(currPtr);
                currPtr = prevPtr; // In the next loop, I will resume the analysis from the previous node, which now points to an unvisited one
                presence_flag = 1;
            }
            // This is the case in which I find a node to delete and it is the first of the list
            else if (data == currPtr->data && currPtr == *head) {
                // This is the case in which I have to delete the first node, but the list features other nodes
                if (currPtr->nextPtr != NULL){
                    *head = currPtr->nextPtr; // Move *head forward
                    currPtr = *head; // Do the same with currPtr, in order not to break the while() loop
                    free(prevPtr); // As *head has already been re-assigned, I leverage prevPtr to delete the old *head
                    presence_flag = 1;
                }
                // This is the case in which I have to delete the first and only node of the list
                else {
                    *head = NULL;
                    currPtr = *head;
                    presence_flag = 1;
                }
            }
            // This is the case in which the current node does not match the queried "data" value
            else{
                prevPtr = currPtr; // Update prevPtr
                currPtr = currPtr->nextPtr; // Move currPtr forward
            }
        }
    }
    if (presence_flag == 0)
        cout << "There is not any node with value " << data << " in the linked list.\n\n";

    // Q1: Am I causing any memory leak by using *head == NULL instead of free(*head)?
    // Q2: Should I free() everythin before ending the main(), at least as a good practice?
    // Q3: Is there a way to make this function by not using a double pointer as input and by also keeping "void" as return value?
    //     Of course, it should still work in the tricky edge case of the last element in the list that has to be deleted
};

void printLinkedList(node* head) { // Here I return nothing, so I can freely edit "head" (i.e., there is no need for a temporary pointer)
    if (head == NULL) {
        cout << "The linked list is empty.\n";
    }
    else {
        int elemCounter = 0;
        while (head != NULL) {
            elemCounter += 1;
            cout << "elem N. " << elemCounter << ": data value = " << head->data << "\n"; // head->data is equal to (*head).data
            head = head->nextPtr;
        }
    }
};

int main(int argc, char* argv[])
{
    //cout << "Size of a single node of the list = " << sizeof(node) << "\n";
    // == 16. On a 64 bits machine, an int ("data") requires 4 bytes.
    // The pointer requires 8 bytes; the remaining 4 bytes are padding

    node* head = NULL;

    appendElement(1, &head);
    appendElement(2, &head);
    appendElement(3, &head);
    printLinkedList(head);

    cout << "\nRemoving element with data value = 1...\n\n";
    removeElement(1, &head);
    printLinkedList(head);
    cout << "\nRemoving element with data value = 2...\n\n";
    removeElement(2, &head);
    printLinkedList(head);
    cout << "\nRemoving element with data value = 3...\n\n";
    removeElement(3, &head);
    printLinkedList(head);
    cout << "\nRemoving element with data value = 4...\n\n";
    removeElement(4, &head);
    printLinkedList(head);
    cout << "\nRemoving element with data value = 1...\n\n";
    removeElement(1, &head);
    printLinkedList(head);
    cout << "\nRemoving element with data value = 2...\n\n";
    removeElement(2, &head);
    printLinkedList(head);

    return 0;
}

As you can see from the comments embedded in the code, I have 3 doubts that captured my interest while coding the node removal function:

Q1: Am I causing any memory leak by using *head == NULL instead of free(*head)?
Q2: Should I free() everything before ending the main(), at least as a good practice?
Q3: Is there a way to make this function by not using a double pointer as input and by also keeping "void" as return value? Of course, it should still work in the tricky edge case of the last element in the list that has to be deleted

I hope that featuring these "additional" questions is something reasonable to put here, as maybe someone in the future may have the same doubts I had.

I know there are plenty of ready-to-copy-and-paste solutions for my task, but I think I can really learn this stuff if I see why my precise design choices are not optimal/wrong.

I thank everyone for the time spent reading this.

If your code works but you are seeking advice on how to improve it, then you should probably post your question on [Code Review](https://codereview.stackexchange.com), but be sure to read their [rules of engagement](https://codereview.stackexchange.com/about) first ([On Topic](https://codereview.stackexchange.com/help/on-topic) and [How to Ask](https://codereview.stackexchange.com/help/how-to-ask)). — Jonathan Leffler, Jun 30 '22 at 17:56
As a general rule of thumb, whenever you need to do a C-style cast when programming in C++, you should take that as a sign that you're doing something wrong. Even if you're developing an API that will be used from C you don't have to use the C dynamic memory functions in your C++ code, you can still use `new` and `delete`. — Some programmer dude, Jun 30 '22 at 17:58
And if you're developing a C API then you should probably hide the structures from the C code behind [opaque data structures](https://en.wikipedia.org/wiki/Opaque_data_type). Then you could even use the C++ standard library for your container, and don't have to worry about things like memory management. — Some programmer dude, Jun 30 '22 at 18:00
Since your code is written in C++, even if it presents a C interface, the question should probably be tagged with C++ rather than C. You have not shown the header that declares the API you are providing, either. And, in general, error messages should be written to `cerr` (or perhaps `clog`) rather than `cout`, but a library API should be cautious about writing sny messages anywhere — and about exiting unilaterally (see [How to end C++ code](https://stackoverflow.com/questions/30250934/how-to-end-c-code) — which discusses the (non-)use of `std::exit` in C++ code). — Jonathan Leffler, Jun 30 '22 at 18:13
Thanks for the tips @JonathanLeffler; I will definitely leverage [Code Review](https://codereview.stackexchange.com/) next times, I was not aware about that community at all. I'll change the tag as suggested and deal with `cout`s, even if I must admit that that was already part of my plans :) I'll check the suggested link too — Jetboy, Jul 01 '22 at 13:38
Thanks also to @Someprogrammerdude. As for the header subject, I really do not know how to show it, as I just created a sample Visual Studio 2022 project (C++ console application), and I seem not to have .h files inside the source folder. As for the C++ keywords, I will leverage those from now on in such cases. — Jetboy, Jul 01 '22 at 13:38

Vlad from Moscow · Accepted Answer · 2022-06-30T19:02:59.660

There are many duplicated code. Also the function should not output any message. It is the caller of the function that decides whether to output a message. So the function should have the return type bool if you are considering the program as a C++ program or bool or int if you are considering the program as a C program.

The function removeElement invokes undefined behavior because in its paths of execution you are not always resetting correctly values of the pointers currPtr and prevPtr after deleting a node.

For example after this code snippet

        if (data == currPtr->data && currPtr != *head) {
            prevPtr->nextPtr = currPtr->nextPtr; // Link the node ahead of the one to delete with the one behind
            free(currPtr);
            currPtr = prevPtr; // In the next loop, I will resume the analysis from the previous node, which now points to an unvisited one
            presence_flag = 1;
        }

prevPtr and currPtr will be equal each other.

I would define the function the following way

int removeElement( node **head, int data ) 
{
    int deleted = 0;

    while ( *head )
    {
        if ( ( *head )->data == data )
        {
            deleted = 1;
            node *current = *head;
            *head = ( *head )->next;
            free( current );
        }
        else
        {
            head = &( *head )->next;
        }
    }

    return deleted;
}

As for your question

Q3: Is there a way to make this function by not using a double pointer as input and by also keeping "void" as return value? Of course, it should still work in the tricky edge case of the last element in the list that has to be deleted

then in C you can not achieve this. In C++ you can pass the pointer to the first node by reference. In C passing by reference means passing an object indirectly through a pointer to it. So in C you have to use a double pointer in such a case.

Of course just setting a pointer to NULL without freeing data pointed to by the pointer that was dynamically allocated produces a memory leak. And you should free all the allocated memory then it is not required any more.

Thanks. I had already some doubts relative to my logging implementations indeed. That said, I really do not understand why you say that `currPtr = prevPtr` causes undefined behavior in that situation (or in any case): aren't they actually equal at the function beginning (both `= *head`)? Is that wrong too? I tried to check again my logic in the code snippet you reported but it seems like it works fine. I do not doubt your correction, it is just that I still cannot see the error. I understood the C/C++ portion, and it now makes more sense to me. Thank you — Jetboy, Jul 01 '22 at 13:49
Furthermore, I really do not get how your solution can afford to play with `head` like that: don't you change its value forever while running this function, as you pass a double pointer? — Jetboy, Jul 01 '22 at 15:49
@Jetboy Initially the parameter pointes to the head node pointer. But then it is reassigned to point to another pointer (the data member next) like head = &( *head )->next; So now the parameter head points to the data member next of the head node. the — Vlad from Moscow, Jul 01 '22 at 16:11

Improve my solution to basic C linked list management functions

1 Answers1