First, let's straighten out some terminology -
You are declaring all three pointers the exact same way:
struct Object* objn_p ...
The only difference is in how you are initializing them.
Declarations in C have two major components - a sequence of declaration specifiers followed by a comma-separated list of optionally initialized declarators. C++ declarations are structured fundamentally the same way, but for this answer I will stick with C terminology (since that's the language we're talking about and the one I'm more familiar with).
Declaration specifiers include storage class specifiers (auto, static, register, typedef, etc.), type specifiers (int, char, double, short, etc.), type qualifiers (const, volatile, restrict), and a few others.
Declarators include the name of the thing being declared, along with information about its pointer-ness, array-ness, and/or function-ness.
Initializers for scalar objects are scalars. Initializers for aggregate objects like arrays, structs, and unions are brace-enclosed lists or, in the case of character arrays, a string literal.
In the declaration
struct Object* obj1_p = &obj_1;
the declaration specifier is struct Object, the declarator is * obj1_p, and the initializer is = &obj_1.
I know the C++ convention for declaring pointer objects is T* p, but the syntax is actually T (*p) - the * operator is always bound to the declarator, not the type specifier. If you write T* p, q; then only p is declared as a pointer to T; q is declared as an instance of T. I know why the C++ convention exists, I know the justifications for it, but it does misrepresent how declaration syntax works in both C and C++ and I consider it a mistake to use. Most C programmers will use the T *p convention instead [1].
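To see the binding rule in action, here is a minimal sketch. It assumes a C11 compiler (for _Generic and static_assert), and the function name demo_mixed_declaration is made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical demo: declares p and q in one declaration and
   returns the value of q read through p. */
int demo_mixed_declaration(void)
{
    int* p, q;   /* parsed as: int (*p), q;  only p is a pointer */

    /* Compile-time checks of the declared types. */
    static_assert(_Generic(p, int *: 1, default: 0), "p is a pointer to int");
    static_assert(_Generic(q, int: 1, default: 0), "q is a plain int");

    q = 42;      /* q is an ordinary int object */
    p = &q;      /* p points at q */
    return *p;   /* reads q through p */
}
```

If q really were a pointer, the assignment q = 42 would draw a diagnostic and the second static_assert would fail to compile.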
Here are the basic rules for pointer declarations in C:
Declaration    Declarator
Specifier
-----------    ----------
T              *p;        // p is a pointer to T
T              *a[N];     // a is an array of pointer to T
T              *f();      // f is a function returning a pointer to T
T              (*a)[N];   // a is a pointer to an array of T
T              (*f)();    // f is a pointer to a function returning T
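As a rough illustration of the declarator forms in the table, the sketch below uses int for T. The helper names (twice, first, demo_declarators) and the values are invented for the example.

```c
#include <assert.h>

static int twice(int x) { return 2 * x; }        /* function returning int */
static int *first(int *arr) { return &arr[0]; }  /* T *f(): function returning a pointer to int */

/* Hypothetical demo: uses each declarator form once and sums the results. */
int demo_declarators(void)
{
    int x = 1, y = 2;
    int row[3] = { 10, 20, 30 };

    int *p = &x;              /* T *p:      pointer to int */
    int *a[2] = { &x, &y };   /* T *a[N]:   array of 2 pointers to int */
    int (*pa)[3] = &row;      /* T (*a)[N]: pointer to an array of 3 int */
    int (*pf)(int) = twice;   /* T (*f)():  pointer to a function returning int */

    /* *pa designates the whole array, not just its first element. */
    assert(sizeof *pa == sizeof row);

    /* 1 + 2 + 30 + 8 + 10 */
    return *p + *a[1] + (*pa)[2] + pf(4) + *first(row);
}
```

Note how the parentheses in (*pa)[3] and (*pf)(int) are what change "array of pointers" and "function returning pointer" into "pointer to array" and "pointer to function".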
The rules for const are:
T const *p; // p points to a const T
const T *p; // same as above
T * const p; // p is a const pointer to T
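A small sketch of what those const placements allow and forbid, again using int for T. The function name is hypothetical, and the commented-out lines are the ones a compiler should reject.

```c
#include <assert.h>

/* Hypothetical demo of pointer-to-const versus const pointer. */
int demo_const_pointers(void)
{
    int x = 1, y = 2;

    const int *p = &x;    /* pointer to const int: p can be reseated, *p is read-only */
    p = &y;               /* OK: p itself is not const */
    /* *p = 5; */         /* would not compile: *p is read-only through p */

    int * const q = &x;   /* const pointer to int: *q is writable, q cannot be reseated */
    *q = 10;              /* OK: modifies x */
    /* q = &y; */         /* would not compile: q is const */

    assert(x == 10);
    return *p + *q;       /* 2 + 10 */
}
```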
What differs between your three methods is how you initialize the pointer.
Method 1 is just taking the address of a previously-declared variable of the same type:
struct Object *obj1_p = &obj1; // using the C convention for pointer declarations
Method 2 is taking the address of a compound literal - basically, an anonymous variable:
struct Object *obj2_p = &(struct Object){ .id = 1 };
The only difference between obj1 and the anonymous object is that you can refer to obj1 directly as well as through the pointer:
printf( "%d %d %d", obj1.id, obj1_p->id, (*obj1_p).id );
whereas you can only refer to the anonymous object through the pointer variable:
printf( "%d %d", obj2_p->id, (*obj2_p).id );
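Putting methods 1 and 2 side by side, a minimal sketch might look like the following. The definition of struct Object is an assumption (the answer only ever uses an id member), and the function name is made up.

```c
#include <assert.h>

struct Object { int id; };   /* assumed definition: only the id member is known */

/* Hypothetical demo: one named object, one compound literal. */
int demo_named_vs_anonymous(void)
{
    struct Object obj1 = { .id = 1 };
    struct Object *obj1_p = &obj1;                        /* method 1 */
    struct Object *obj2_p = &(struct Object){ .id = 2 };  /* method 2 */

    obj1.id = 10;      /* named object: reachable by name ... */
    assert(obj1_p->id == 10);   /* ... and through the pointer */

    obj2_p->id = 20;   /* compound literal: modifiable, but reachable only through obj2_p */

    return obj1_p->id + (*obj2_p).id;   /* 10 + 20 */
}
```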
Method 3 dynamically allocates memory and assigns the address of the resulting object (or NULL, if the malloc call fails):
struct Object *obj3_p = malloc( sizeof( struct Object ) );
The chief difference between this and the other two methods is that the memory allocated by malloc hangs around until you explicitly free it, whether the obj3_p variable goes out of scope or not. If obj1 and the anonymous object are declared within a block and without the static keyword, then that memory is automatically released when the block containing them exits.
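The lifetime difference can be sketched like this. The struct definition and both function names (object_create, demo_heap_lifetime) are assumptions made for the example, not anything from the question.

```c
#include <assert.h>
#include <stdlib.h>

struct Object { int id; };   /* assumed definition: only the id member is known */

/* Hypothetical helper: returns a heap object that outlives this call,
   or NULL if malloc fails. */
struct Object *object_create(int id)
{
    struct Object *obj3_p = malloc(sizeof *obj3_p);
    if (obj3_p != NULL)
        obj3_p->id = id;
    return obj3_p;   /* the variable obj3_p ends here; the allocation does not */
}

/* Hypothetical caller: uses the object after object_create has returned. */
int demo_heap_lifetime(void)
{
    struct Object *o = object_create(3);
    if (o == NULL)
        return -1;           /* allocation failed */

    assert(o->id == 3);      /* still valid: nothing has freed it yet */
    int id = o->id;
    free(o);                 /* the memory is released only here */
    return id;
}
```

Returning &obj1 or the address of a block-scope compound literal from object_create would instead hand back a dangling pointer, since those objects stop existing when their block exits.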
Also, as a side note, why do some people cast malloc, and is it better to do it?
There are two times when you must cast the result of malloc (and calloc and realloc):
- You are compiling the code as C++;
- You are working with an ancient, pre-C89 K&R implementation.
Unlike C, C++ does not allow implicit conversion from void * (the type returned from malloc) to other object pointer types; you must cast explicitly. (Converting an object pointer to void * is still implicit in C++.) Having said that, if you're writing C++ you should not be calling malloc directly. You should either be using a container that manages memory for you under the hood (std::string, std::vector, std::map, etc.) or you should be using the new or new [] operators. If you're in the middle of a lift-and-shift from C to C++, it's acceptable to keep the malloc calls until you can get around to rewriting your memory management code, but ideally C++ code should never use malloc (or calloc or realloc) directly.
In the earliest versions of C, malloc, calloc, and realloc returned char *, so you had to cast the result if you were assigning it to pointers of different types:
int *p = (int *) malloc( sizeof *p * N );
As someone who wrote K&R C in college, I can tell you this was a pain in the ass. It was cluttered and a constant source of mistakes. If you changed the type of p (say from int * to long *) you had to repeat that change in multiple places. That created a higher maintenance burden, especially if (as was often the case) the pointer declaration was separated from the malloc call by other code:
int *p = NULL;
...
p = (int *) malloc( sizeof *p * N );
Prior to C99, you had to declare all variables before any statements (in that block, anyway), so it was common for pointer declarations to be separated from the malloc call by multiple lines of code. It was really easy to change the type of *p in the declaration but forget to do it in the assignment later, causing subtle (and sometimes not-so-subtle) runtime errors.
The 1989 standard introduced the void type and changed the *alloc functions to return void *. It also introduced the rule that you could assign void * values to other object pointer types and vice versa without an explicit cast. So you could write a malloc call as:
int *p = malloc( sizeof *p * N );
or
int *p = NULL;
...
p = malloc( sizeof *p * N );
If you change the type of *p, you only have to make that change in one place. It's cleaner, it's harder to screw up, etc.
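Here is a minimal sketch of the cast-free idiom with the declaration separated from the allocation. The function name sum_first_n and what it computes are invented for illustration; the point is that the element type appears only in the declaration of p.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n elements, fills them with 0..n-1,
   and returns their sum, or -1 if the allocation fails. */
long sum_first_n(size_t n)
{
    long *p = NULL;   /* change the element type here and nothing below needs editing */
    long sum = 0;

    assert(n > 0);    /* malloc(0) may legitimately return NULL, so require n > 0 */

    p = malloc(sizeof *p * n);   /* no cast; the size follows the type of *p */
    if (p == NULL)
        return -1;

    for (size_t i = 0; i < n; i++) {
        p[i] = (long)i;
        sum += p[i];
    }

    free(p);
    return sum;
}
```

With the K&R-style (int *) malloc( sizeof(int) * N ) form, the same type change would have to be repeated in the cast and in the sizeof operand.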
Also, under the C89 standard, casting the result of malloc could suppress a useful compiler diagnostic if you forgot to include stdlib.h or otherwise didn't have a declaration for malloc in scope. But since C99 did away with implicit function declarations [2], that's not really an issue anymore.
However, there are people who prefer to keep the explicit cast for various reasons. Personally, I have found those reasons wanting, and I've accumulated enough scar tissue from bad casts that I prefer to leave them off entirely. 30+ years of writing code has convinced me that when it comes to memory management, simpler is always better. It doesn't hurt to use them; they don't slow down the code or anything like that. But from a readability and maintenance standpoint, casting malloc is bad juju.
[1] Whitespace in declarations is only significant in that it separates tokens. The * character is a token on its own and not part of any identifier, so you can write any of T *p, T* p, T*p, or T * p and they will all be parsed as T (*p).
[2] Prior to C99, if the compiler saw a function call without a preceding declaration, it assumed the function returned int, so if you forgot to include stdlib.h the compiler would complain if you tried to assign the result of malloc to a pointer since an int can't be implicitly converted to a pointer. However, if you used the cast, then the diagnostic would be suppressed.