8

I normally program in python. To increase performance of my simulations, I am learning C. I have a problem to understand the use of a pointer of a pointer when implementing the append function to a linked list. This is an excerpt of the code from my book (Understanding Pointers in C by Kanetkar).

#include <stdlib.h>
#include <stdio.h>

struct node{
    int data;
    struct node *link;
};

int main(){
    struct node *p; //pointer to node structure
    p = NULL;   //linked list is empty

    append( &p,1);
    return 0;
}

append( struct node **q, int num){
    struct node *temp, *r;  //two pointers to struct node
    temp = *q;

    if(*q == NULL){
        temp = malloc(sizeof(struct node));
        temp -> data = num;
        temp -> link = NULL;
        *q = temp;
    }
    else{
        temp = *q;
        while( temp -> link != NULL)
            temp = temp -> link;
        r = malloc(sizeof(struct node));
        r -> data = num;
        r -> link = NULL;
        temp -> link = r;
    }
}

In this code, I pass the double pointer **q to the append function. I get that this is the adress of the address, i.e. the adress of NULL in this case.

I just don't get why one does it like this. Would it not be valid to remove one * operator from everything in the append() function and simply pass the adress of NULL (i.e. p instead of &p) to the append() function?

I have googled this question. The answers are either too hard to understand (since I'm just a C beginner) or too plain. I'm thankful for any hints, comments or links where I can read up about this.

AstroCB
  • 12,337
  • 20
  • 57
  • 73
seb
  • 2,251
  • 9
  • 30
  • 44

7 Answers7

17

When you pass things to functions in C, whether its variables or pointers, it's a copy of the original.

Quick example:

#include <stdio.h>
void change(char *in)
{
    // in here is just a copy of the original pointer.
    // In other words: It's a pointer pointing to "A" in our main case 
    in = "B";
    // We made our local copy point to something else, but did _not_ change what the original pointer points to.
}
void really_change(char **in)
{
    // We get a pointer-to-a-pointer copy. This one can give us the address to the original pointer.
    // We now know where the original pointer is, we can make _that one_ point to something else.
    *in = "B";
}
int main(int argc, char *argv[])
{
    char *a = "A";
    change(a);
    printf("%s\n", a); /* Will print A */
    really_change(&a);
    printf("%s\n", a); /* Will print B */
    return 0;
}

So the first function call to change() gets passed a copy of a pointer to an address. When we do in = "B" we only change the copy of the pointer we got passed.

In the second function call, the really_change(), we get passed a copy of a pointer-to-a-pointer. This pointer contains the address to our original pointer and voila, we can now reference the original pointer and change what the original pointer should point to.

Hopefully it explains it somewhat more :)

Jite
  • 4,250
  • 1
  • 17
  • 18
6

First Its not "the address of the address." Its the address of a pointer variable. Ex: if you pass the address of an int variable n that contains zero, you're not passing the address of zero; you're passing the address of a variable (in this case an int variable, in your case a pointer-variable). Variables have addresses in memory. The parameter in this case is the address of a variable that happens to be a pointer-variable, the head of your list.

Regarding why one does this? Simple. All variables in C (arrays via pointer-decay not withstanding) are passed by value. If you want to modify something by reference (address), then the "value" you need to pass must be an address and the formal parameter that receives it must be a pointer-type. In short, you make the "value" being passed a memory address rather than just a basic scaler value. The function then uses this (via a formal pointer parameter) to store data accordingly. Think of it as saying "put that thing I wanted at "this" memory address."

As a short example, suppose you wanted to run through a file, appending every character within into a forward linked list of nodes. You would not use an append-method like the one you have (see The Painter's Algorithm for why). See if you can follow this code, which uses a pointer-to-pointer, but no function call.

typedef struct node
{
    char ch;
    struct node *next;
} node;


node *loadFile(const char *fname)
{
    node *head = NULL, **next = &head;
    FILE *fp = fopen(fname, "r");
    if (fp)
    {
        int ch;
        while ((ch = fgetc(fp)) != EOF)
        {
            node *p = malloc(sizeof(*p));
            p->ch = ch;
            *next = p;
            next = &p->next;
        }
        *next = NULL;
        fclose(fp);
    }
    return head;
}

Stare at that awhile and see if you can understand how the pointer-to-pointer next is used to always populate the next linked node to be added to the list, starting with the head-node.

WhozCraig
  • 65,258
  • 11
  • 75
  • 141
  • Great explanation about value/reference. – Jite Mar 06 '13 at 10:29
  • ok, I'm at it. it's gonna take me more than a minute. thanks for the great example in advance though! – seb Mar 06 '13 at 10:43
  • Well, there used to be one. 32 minutes ago, and I seemed to have failed to refresh (or I still had it memorised, which is relatively improbable ;-) – wildplasser Mar 06 '13 at 11:11
  • @wildplasser yeah my winbox at work with IE just blows at auto-refreshing these pages. chrome and safari both on my macbook air are *much* more fluid. updating SO at work sometimes drives me nuts. – WhozCraig Mar 06 '13 at 11:16
3

You need to do it that way for the function to be able to allocate the memory. Simplifying the code:

main()
{
  void *p;
  p = NULL;
  funcA(&p);

  int i;
  i = 0;
  funcB(&i);
}

funcA(void **q)
{
  *q = malloc(sizeof(void*)*10);
}

funcB(int *j)
{
  *j = 1;
}

This code is done that way so the subfunction funcA can allocate the p pointer. First of all, consider void* p as if it where int i. What you do by doing p = NULL is similar in a way as int i = 0. Now if you pass &i you do not pass the address of 0, you pass the address of i. Same thing happens with &p you pass the address of the pointer.

Now in funcA, you want to make the allocation, so you use malloc but if you would do q = malloc(... and q would have been void* q in the main function p would not have been allocated. Why? Think of funcB, j holds the address of i, if you want to modify i you would then do *j = 1, because if you would do j = 1 then you would have made j point to another memory region instead of i. It is the same with q for funcA. think of it as <type of p>* q it is a pointer to the type of p which is void*, but in case of funcB it is a int. Now you want to modify the address p is pointing at, it means that you do not want to modify the address q is pointing at, you want to modify the pointing address at what q is pointing at, aka *q or p.

If it isn't yet clear. Try to think of boxes. I have drawn a quick example with funcA of the boxes involved. Each box has a name (within the box), this box is in the virtual memory of the process at an arbitrary address, and each box contain a value. In this view we are in the state where the funcA(&p) has been called and the malloc will be done.

enter image description here

Huygens
  • 617
  • 1
  • 13
  • 26
2

Hey why you are thinking it in this way,think when someone is passing structure to append function,in that case whole structure struct node{int data; struct node *link; }; in your case will get copied on stack frame of append function,so better to pass address of structure pointer so as to get just 4 bytes copied onto stack.

Hitesh Menghani
  • 977
  • 1
  • 6
  • 14
2

You don't need the if/else; in both cases you link the new node to a pointer that was NULL before the operation. This could be the root node, or the ->next node of the last node in the chain. Both are pointers to struct node, and you need a pointer to these pointers in order to assign to them.

void append( struct node **q, int num){

    while (*q){ q = &(*q)->link; }

    *q = malloc(sizeof **q);
    (*q)->data = num;
    (*q)->link = NULL;

}

Why would anyone do this? Basically because it is shorter, it uses only one loop and no additional conditions, uses no additional variables, and it can be proven to be correct. There should of course a test be added for the result of malloc, which would need an addtional condition.

wildplasser
  • 43,142
  • 8
  • 66
  • 109
  • Null pointers are not guaranteed by the standard to have an all-zeroos representation, IIRC. (ints and ISO-floats do) – wildplasser Mar 06 '13 at 10:40
  • That is true, and I stand corrected (actually I'm sitting right now). NULL is defined to be 0, and will "compare" correctly to a null-pointer, but they are *not* synonymous (and if you're ever worked on an AS/400, you know this to be true. *amazing* pointer architecture). I shall drop my comment. Thanks for putting me right. – WhozCraig Mar 06 '13 at 10:42
  • No, NULL is not guaranteed to be all-zeroos; in the source, a 0 constant cast to the type of a pointer is interpreted as a NULL pointer, and the compiler will use the platform's representation for a NULL pointer. (which *in most cases* is all zeroos) – wildplasser Mar 06 '13 at 10:47
1

In essence, as Jite & the others said is correct.

Whenever you want to have a change applied to a data structure in C (change performed by another function), you need to pass a "reference" to this data structure for the change to persist beyond the change() function completes. This is what happens in Python too, you pass references to Objects unless you explicitly make a copy. In C you have to specify what you want to do precisely. To simplify it even more, it's either:

type data_struct

change(data_struct) => here's a copy of my data_struct, make your change but I do not care in the caller function about the change you apply

or

change(&data_struct) => here's the address (a "reference" to) of my data_struct, apply your change, the caller function will see this change after it is applied.

Now, depending of what the original "type" is, you might have * or **. Nevertheless, keep in mind that there is a limit to how many "indirections" you can have, not sure weather it is system or compiler determined, if anyone has an answer to that I'm a taker. I never had more than 3 indirections.

swappy
  • 108
  • 1
  • 6
0

I think reason is as follows :

struct node *p; //pointer to node structure p = NULL;

Above snippet when written in main block , means value of pointer p is NULL so it does not point to anything in memory . So we pass address of pointer p in order to create a new node and assign the address of new node to the value of pointer p .

*(&p) == *q == temp;

By doing *q == temp; we achieve our goal of assigning some value to pointer p which was originally pointing nowhere .

Deepak Kumar
  • 109
  • 1
  • 1
  • 12