Understanding the purpose of pointers

Question

I'm reading a book on data structures and having difficulty grasping the concept of pointers. Let me preface this by saying that I don't have a lot of experience with C. But here goes....

If I do the following:

 int num = 5;
 int *ptrNum;

 ptrNum = &num;

It is my understanding that the pointer reserves enough memory for a 32 bit int along with the memory required for the actual pointer although its value is simply the memory address of the variable.

What is the purpose of doing this if the same amount of memory is reserved? Why would I use the pointer instead of the variable, num? Am I totally off base here?

Pointers don't reserve anything, they merely hold a number that's nothing more than the address of another value. — James, Jul 14 '13 at 18:03
Coming up with a single contrived example does not explain the entire situation. Many things are simply not possible without the use of pointers, and consider this; what if this were not an int, but instead a massive array? Would you want to copy the entire array to pass it around? What if a function needed to mutate its input? — Ed S., Jul 14 '13 at 18:05
My book specifically says that they occupy space in memory equal to the data type that it points to. — nullByteMe, Jul 14 '13 at 18:22
Please identify your book and quote what it says. A `char *`, for example, does not occupy the same space as a `char` (normally 4 or 8 bytes for the pointer, versus 1 byte — by definition — for the `char`). If it really says that `sizeof(T *) == sizeof(T)` for all types `T`, then the book is completely and utterly wrong, and it would be best to abandon it now (because if they blew that, what else did they also screw up?). That's a little unlikely though; it is more likely, I'm afraid, that you misconstrued what the book says. — Jonathan Leffler, Jul 14 '13 at 18:33
I'm sorry, I meant to say that the book seems to indicate that a pointer occupies the size equal to the data type along with the size of the actual pointer. Ill quote something specific. Also it's very possible that I misconstrued something. That's exactly why I'm here trying to clear it up! — nullByteMe, Jul 14 '13 at 18:46
You all were correct. The context of the chapter was with reference to using pointers with arrays. — nullByteMe, Jul 14 '13 at 19:06

Jonathan Leffler · Accepted Answer · 2013-07-14T18:17:59.420

You use pointers in situations where a value won't work. In your example, you're correct; there's no benefit. The archtetypal border-line useful example is a swap function:

void swap_int(int *i1, int *i2)
{
    int t1 = *i1;
    *i1 = *i2;
    *i2 = t1;
}

Calling sequence:

int main(void)
{
    int v1 = 0;
    int v2 = 31;
    printf("v1 = %d; v2 = %d\n", v1, v2);
    swap_int(&v1, &v2);
    printf("v1 = %d; v2 = %d\n", v1, v2);
    return 0;
}

If you write that without using pointers — like this:

void swap_int(int i1, int i2)
{
    int t1 = i1;
    i1 = i2;
    i2 = t1;
}

int main(void)
{
    int v1 = 0;
    int v2 = 31;
    printf("v1 = %d; v2 = %d\n", v1, v2);
    swap_int(v1, v2);
    printf("v1 = %d; v2 = %d\n", v1, v2);
    return 0;
}

then you simply swap two local variables in the function without affecting the values in the calling function. Using pointers, you can affect the variables in the calling function.

See also:

scanf()-family of functions
strcpy() et al

It is my understanding that the pointer reserves enough memory for a 32 bit int along with the memory required for the actual pointer although its value is simply the memory address of the variable.

What you appear to be describing is as if:

int *p1;

does the same job as:

int _Anonymous;
int *p1 = &_Anonymous;

It doesn't; this is C. Creating p1 allocates enough space for the pointer. As first written, it doesn't initialize it, so it points to an indeterminate location (or no location). It (the pointer) needs to be initialized before it is used. Hence:

int  i1 = 37;
int *p1 = &i1;

But the allocation of p1 only reserves enough space for a pointer (normally, 32-bits for a 32-bit compilation, 64-bits for a 64-bit compilation); you have to allocate the space it points at separately, and you have to initialize the pointer. Another way of initializing pointers is with dynamically allocated memory:

int *p2 = malloc(1000 * sizeof(*p2));

if (p2 != 0)
{
    ...use p2 as an array of 1000 integers...
    free(p2);
}

Have you covered structures yet? If not, examples covering structures, such as trees or linked lists, won't help. However, once you have covered structures too, you'll be able to use trees or linked lists:

struct list
{
    int data;
    struct list *next;
};

struct tree
{
    int data;
    struct tree *l_child;
    struct tree *r_child;
};

Such structures rely heavily on pointers to connect entries correctly.

Whoaaa that's a huge answer , sir , a +1 will prove too little for you :) — 0decimal0, Jul 14 '13 at 18:25
And you can add a case when a function with `void fn(int* receivedArray)` is more optimal than `void fn(int[] receivedArray)` for `int arr[1000]; fn(arr);` — Suvarna Pattayil, Jul 15 '13 at 11:45

huseyin tugrul buyukisik · Answer 2 · 2013-07-14T18:17:36.327

How would add an element to a dynamic list? By creating a new array each time?

You just add pointer to next element instead and link the previous cell's next pointer to it.

Without pointers, you are constrained to order of arrays and alignment of variables. With pointers, you can select any address in the allocated area to have any alignment of you like, you can have list elements pointing to and from any area you allocated.

So, pointers give you more freedom while needing only 32 or 64 bit space per pointer.

score 2 · Answer 3 · answered Jul 14 '13 at 18:12

A couple of the other answers focus on taking the address of a variable and storing it in a pointer. That's only one use for pointers. An entirely different use for pointers is to point to dynamically allocated storage, and for structuring that storage.

For example, suppose you want to read in a file and work on it in memory. But, you don't know how big the file is ahead of time. You could put an arbitrary upper limit in your code:

#define MAX_FILE_SIZE (640 * 1024)  /* 640K should be large enough for anyone */

char data[ MAX_FILE_SIZE ];

That wastes memory for smaller files, and isn't large enough for larger files. A better approach would be to actually allocate what you need. For example:

FILE *f = fopen("myfile", "rb");
off_t len;
char *data;

fseek(f, 0, SEEK_END);  /* go to the end of the file */
len = ftell(f);         /* get the actual file size */
fseek(f, 0, SEEK_SET);  /* rewind to the beginning  */

data = malloc( len );   /* Allocate just as much as you need */

Another major use of pointers is to structure data, say in lists, or trees, or other fun structures. (Your data structures book will go into many of these.) If you want to reorganize your data, moving pointers is often much cheaper than copying data around. For example, suppose you have a list of these:

struct mystruct
{
     int x[1000];
     int y[1000];
};

That's a lot of data. If you just store that in an array, then sorting that data might be very expensive:

struct mystruct array[1000];

Try qsort on that... it will be very slow.

You can speed this up by instead storing pointers to elements and sorting the pointers. ie.

struct mystruct *array[1000];
int i;
struct mystruct *temp;

/* be sure to allocate the storage, though: */
temp = malloc( 1000 * sizeof( struct mystruct ) );

for (i = 0; i < 1000; i++)
    array[i] = temp + i;

Now if you had to sort those structures, you'd swap pointers in array[] rather than entire structures.

I won't go into the fancier data structures that are better covered by your book. But, I thought I might give you a taste of some other uses for pointers.

score 2 · Answer 4 · answered Jul 15 '13 at 11:41

Pointers serve 3 main purposes in C:

Fake pass-by-reference semantics;
Track dynamically-allocated memory;
Build dynamic data structures.

Fake pass-by-reference semantics: in C, all function arguments are passed by value. Given the following snippet:

void foo( int a, int b ) 
{
  a = 1;
  b = 2;
}

void bar( void )
{
  int x=0, y=1;
  foo( x, y );
  printf( "x = %d, y = %d\n", x, y );
}

The formal parameters a and b in foo are different objects in memory from the actual parameters x and y in bar, so any changes to a and b are not reflected in x and y. The output will be "x = 0, y = 1". If you want foo to alter the values of x and y, you will need to pass pointers to those variables instead:

void foo( int *a, int *b )
{
  *a = 1;
  *b = 2;
}

void bar( void )
{
  int x = 0, y = 1;
  foo( &x, &y );
  printf( "x = %d, y = %d\n", x, y );
}

This time, the formal parameters a and b are pointers to the variables x and y; writing to the expressions *a and *b int foo is equivalent to writing to x and y in bar. Thus, the output is "x = 1, y = 2".

This is how scanf() and scores of other library functions work; they use pointers to reference the actual memory we want to operate on.

Track dynamically allocated memory: The library functions malloc, calloc, and realloc allow us to allocate memory at runtime, and all three return pointers to the allocated memory (as of C89, all three return void *). For example, if we want to allocate an array of int at run time:

int *p = NULL;
size_t numItems;

// get numItems;

p = malloc( sizeof *p * numItems );
if ( p )
{
   // do stuff with p[0] through p[numItems - 1];
}

free( p );

The pointer variable p will contain the address of the newly allocated block of memory large enough to hold numItems integers. We can access that memory by dereferencing p using either the * operator or the [] subscript operator (*(p+i) == p[i]).

So why not just declare an array of size numItems and be done with it? After all, as of C99, you can use a variable-length array, where the size doesn't have to be known until runtime:

// get numItems

int p[numItems];

Three reasons: first, VLA's are not universally supported, and as of the 2011 standard, VLA support is now optional; second, we cannot change the size of the array after it has been declared, whereas we can use realloc to resize the memory block we've allocated; and finally, VLAs are limited both in where they can be used and how large they can be - if you need to allocate a lot of memory at runtime, it's better to do it through malloc/calloc/realloc than VLAs.

A quick note on pointer arithmetic: for any pointer T *p, the expression p+1 will evaluate to the address of the next element of type T, which is not necessariy the address value + 1. For example:

T        sizeof T     Original value of p    p + 1
-        --------     -------------------    -----
char            1                  0x8000    0x8001
int             4                  0x8000    0x8004
double          8                  0x8000    0x8008

Build dynamic data structures: There are times when we want to store data in such a way that makes it easy to insert new elements into a list, or quickly search for a value, or force a specific order of access. There are a number of different data structures used for these purposes, and in almost all cases they use pointers. For example, we can use a binary search tree to organize our data in such a way that searching for a particular value is pretty fast. Each node in the tree has two children, each of which points to the next element in the tree:

struct node {
  T key;
  Q data;
  struct node *left;
  struct node *right;
};

The left and right members point to other nodes in the tree, or NULL if there is no child. Typically, the left child points to a node whose value is somehow "less than" the value of the current node, while the right child points to a node whose value is somehow "greater than" the current node. We can search the tree for a value like so:

int find( struct node *root, T key, Q *data )
{
  int result = 0;

  if ( root == NULL )            // we've reached the bottom of the tree
  {                              // without finding anything
    result = 0;                  
  }
  else if ( root->key == key )   // we've found the element we're looking for   
  {
    *data = root->data;
    result = 1;
  }
  else if ( root->key < key )  
  {
    // The input key is less than the current node's key,
    // so we search the left subtree
    result = find( root->left, key, data );
  }
  else 
  {
    // The input key is greater than the current node's key,
    // so we search the right subtree
    result = find( root->right, key, data );
  }

  return result;
}

Assuming the tree is balanced (that is, the number of elements in the left subtree is equal to the number of elements in the right subtree), then the number of elements checked is around log₂ N, where N is the total number of elements in the tree.

Understanding the purpose of pointers

4 Answers4

Linked