
I have been learning C for less than a week (with my knowledge of C++ and other languages to help), and I am confused about pointers and the ways they can be declared.

Below, I use a simple struct named Object:

struct Object { int id; }; 

Do the below methods for creating a pointer do the same thing just in a different way, or no?

struct  Object obj1 = { .id = 1 };
struct  Object* obj1_p = &obj1; // method 1 of getting a pointer

// The same, just in a compound literal?
struct  Object* obj2_p = &(struct Object){ .id = 1 }; // method 2 of getting a pointer

// Is this the same, other than being uninitialized?
struct Object* obj3_p = malloc(sizeof(struct Object)); // method 3 of getting a pointer

Is there a time when you only can use one method?

Also, as a side note, why do some people cast malloc, and is it better to do it?

// malloc is casted to object:
struct Object* obj3_p = (Object*)malloc(sizeof(struct Object));
Some Goose
  • *"Also, as a side note, why do some people cast malloc, and is it better to do it?"* - If they insist on compiling with a C++ compiler, it's necessary. [Otherwise it's just bad.](https://stackoverflow.com/a/605858/6699433) But I would say that compiling your C code with a C++ compiler [does not make much sense either.](https://softwareengineering.stackexchange.com/q/412386/283695) – klutt Mar 17 '21 at 18:57
  • @Some Goose The question "Is there a time when you only can use one method?" does not make sense, or it is too broad. – Vlad from Moscow Mar 17 '21 at 19:01
  • @klutt There is nothing bad about casting malloc. This is a mistaken opinion that arose when old compilers were in use. – Vlad from Moscow Mar 17 '21 at 19:04
  • @VladfromMoscow Is there anything good about it? – klutt Mar 17 '21 at 19:10
  • @klutt Read the reference that you provided. It says that when the header is not included, casting can lead to an error. But modern C compilers and the modern C Standard do not allow functions to be used without declarations. An error or warning message will be generated in any case. – Vlad from Moscow Mar 17 '21 at 19:13
  • @VladfromMoscow So I ask you again, what is good about casting malloc? Any other thing than being able to compile with a C++ compiler? (Which often would require small other modifications too) – klutt Mar 17 '21 at 19:16
  • @klutt Another case. Consider the statement p = malloc( sizeof( *p ) ); For readers of the code, this statement in fact says nothing, because the type of the variable p is unknown. You need to scroll through the source code to find the declaration of the variable p. Casting the result of malloc makes the code more readable and clear. – Vlad from Moscow Mar 17 '21 at 19:17
  • @VladfromMoscow Yes, I've seen that argument a lot of times, and to me it just looks like clutching at straws. If leaving out the cast really causes confusion, then there's something else that's wrong and a cast will not fix that. – klutt Mar 17 '21 at 19:26
  • @VladfromMoscow another thing since old compilers is the ability to define variables close to the point of use so perhaps `sometype *p = malloc(sizeof *p);` is clear (and less clutter). – Weather Vane Mar 17 '21 at 19:27
  • @VladfromMoscow Let's say you have declared `int *p, *q;` somewhere. Would you also do `p = (int*) q;` for clarity? – klutt Mar 17 '21 at 19:28
  • @klutt Casting helps to avoid an error. Because a void pointer can be assigned to a pointer of any object type, you can make a wrong assignment. For example, if you have a pointer declared like int ( *p )[10]; then somewhere without casting you can write, for example, p = malloc( sizeof( int[3][3] ) ); and the compiler will not help you find the error. – Vlad from Moscow Mar 17 '21 at 19:30
  • @VladfromMoscow That would be completely avoided by just doing `p = malloc(sizeof *p)` instead of `p = malloc( sizeof( int[3][3] ) )` – klutt Mar 17 '21 at 19:31
  • @klutt Usually, when pointers are used in a situation like p = q, their purpose is clearer from the context than when you are using malloc. – Vlad from Moscow Mar 17 '21 at 19:33
  • If the purpose of `p = malloc(sizeof *p)` isn't clear, then the code simply is not clear. I'd say that most of the times this would be solved in a MUCH better way with a better name for `p`. – klutt Mar 17 '21 at 19:35
  • @klutt The expressions p = malloc(sizeof *p) and p = malloc( sizeof( int[3][3] ) ) are not the same. Using the pointer p you can allocate a two-dimensional array with any number of rows. That is, if you write p = malloc( sizeof( *p ) ); then you in fact allocate a one-dimensional array instead of a two-dimensional array. – Vlad from Moscow Mar 17 '21 at 19:36
  • @VladfromMoscow I thought the whole point of your example was that you did not mean to write `p = malloc( sizeof( int[3][3] ) )`. Did I misunderstand something? – klutt Mar 17 '21 at 19:38
  • @klutt Yes. If, for example, p = ( int ( * )[3] )malloc( sizeof( int[3][3] ) ); had been written, then the compiler would issue an error because the type of the pointer p is not int ( * )[3] – Vlad from Moscow Mar 17 '21 at 19:40
  • @klutt Imagine a situation where you have two pointers p and q of the types int ( * )[3] and int ( * )[4] declared somewhere in the code. And in some place of the program you need to allocate memory for these pointers. You can write by mistake p = malloc( sizeof( int[2][4] ) ); q = malloc( sizeof( int[2][3] ) ); And the code will compile. Then it will be very difficult to find the reason for the invalid behavior of the program. – Vlad from Moscow Mar 17 '21 at 19:43
  • @VladfromMoscow I cannot really see when it would not be suitable to use the form `p = malloc(sizeof *p * size)` – klutt Mar 17 '21 at 19:46
  • @klutt Sometimes the expression in malloc is not straightforward and can refer to another array or pointer. What about p = malloc( sizeof( a ) );? – Vlad from Moscow Mar 17 '21 at 19:48
  • @VladfromMoscow To me that just sounds outright dangerous. But sure, if you for some reason do not use the form `p = malloc(sizeof *p)` and have another expression, I guess a cast could have some use. I'm not convinced, but I don't feel I could argue against it without thinking it through first. I'd love to see some code that shows the usage of that however. But sure, I could stretch my statement to that if you use the construct `p = malloc(sizeof *p)` with or without an extra size parameter, then a cast is completely unnecessary. – klutt Mar 17 '21 at 19:56
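
For reference, the `p = malloc(sizeof *p * size)` form discussed in the comments above also works when `p` is a pointer to an array. The following is only a sketch: the helper name `make_table` and the fill pattern are hypothetical and do not come from any commenter.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates a table of `rows` rows of 3 ints and
   fills it so that table[r][c] == r * 3 + c. Returns NULL on failure.
   The return type is "pointer to array of 3 int". */
int (*make_table(size_t rows))[3]
{
    /* sizeof *p is sizeof(int[3]), so the size follows the declared
       type of p and cannot disagree with it. */
    int (*p)[3] = malloc(rows * sizeof *p);
    if (p == NULL) {
        return NULL;
    }
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < 3; c++) {
            p[r][c] = (int)(r * 3 + c);
        }
    }
    return p; /* the caller releases it with free() */
}
```

With this form, changing the declaration of `p` to, say, `int (*p)[4]` changes the allocation size along with it (the inner loop bound would still need updating by hand).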

4 Answers

These two “methods” do exactly the same thing. And as you said, the second one is just a compound literal.

struct  Object obj1 = { .id = 1 };
struct  Object *obj1_p = &obj1;

// The same, just in a compound literal?
struct  Object *obj2_p = &(struct Object){ .id = 1 };

The third method, malloc, allocates enough memory for a struct Object without initializing it. And no, you don't need to cast it, because malloc returns void *, which is implicitly and safely converted to any other object pointer type. But if you do cast, you should cast to struct Object* instead of Object*, because C has no type named Object unless you typedef it.

struct Object *obj3_p = (struct Object*) malloc(sizeof(struct Object));

That looks very bulky though... My preferred way of doing it is this:

struct Object *obj3_p = malloc(sizeof *obj3_p);
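
As a minimal sketch of that idiom in use, the allocation can be wrapped so that the NULL check and the initialization live in one place. The helper names `object_new` and `object_free` are hypothetical, not something from the question.

```c
#include <stdlib.h>

struct Object { int id; };

/* Hypothetical helper: allocates one struct Object and sets its id.
   Returns NULL if malloc fails. */
struct Object *object_new(int id)
{
    struct Object *obj = malloc(sizeof *obj); /* size follows the pointed-to type */
    if (obj == NULL) {
        return NULL;
    }
    obj->id = id; /* malloc leaves the memory uninitialized, so set every field */
    return obj;
}

/* Hypothetical helper: releases an object created by object_new. */
void object_free(struct Object *obj)
{
    free(obj); /* free(NULL) is a no-op, so no check is needed */
}
```

A caller would check the result of `object_new` against NULL before dereferencing it and pass it to `object_free` exactly once when done.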
Andy Sukowski-Bang

I wrote this piece of code, hope it helps you to better understand some features of pointers:

#include <stdio.h>
#include <stdlib.h>

struct Object { int id; };

struct Object *getObjectBold(void) {
    struct Object* obj2_p = &(struct Object) { .id = 2 };

    return obj2_p; // Dangling: the compound literal's lifetime ends when the function returns, so using this pointer is UB.
}

struct Object *getObject(void) {
    struct Object* obj3_p = malloc(sizeof(*obj3_p)); // Better way of calling malloc than using sizeof(struct Object).
    if (obj3_p == NULL) {
        return NULL; // malloc can fail; the caller must check for NULL.
    }
    obj3_p->id = 3; // Optional, but malloc leaves the memory uninitialized, so set fields before reading them.

    return obj3_p; // This needs to be freed later on!
}

int main(void) {
    struct Object obj1 = { .id = 1 };
    struct Object* obj1_p = &obj1;
    
    printf("obj1.id = %d\n", obj1_p->id); 
    obj1_p->id = 10; // You can change values using the pointer
    printf("obj1.id = %d\n", obj1_p->id); 

    // The only different thing with this case is that you don't
    // "lose" your object when setting the pointer to NULL 
    // (although you can only access it through the object, not through the pointer).

    obj1_p = NULL;
    printf("obj1.id = %d\n", obj1_p->id); // This won't work (undefined behaviour).
    printf("obj1.id = %d\n", obj1.id); // This will.


    struct Object* obj2_p = &(struct Object) { .id = 1 };
    obj2_p->id = 2; // You can change the id
    printf("obj2.id = %d\n", obj2_p->id);

    // If you make this pointer point to another address, you "lose" your object.
    obj2_p = NULL;
    printf("obj2.id = %d", obj2_p->id); // This won't work at all (undefined behaviour).


    // Both of these pointers point to objects with automatic storage (on the stack), so, for example,
    // they don't work when returning from a function.
    obj2_p = getObjectBold();
    obj2_p->id = 20; // This won't work (undefined behaviour).
    printf("obj2.id = %d\n", obj2_p->id); // Also undefined behaviour: obj2_p is dangling, even if a value happens to be printed.


    // The third case is not the same as the other two, since you are allocating memory on the heap.
    // THIS is a time where you can only use one of these three methods.
    struct Object *obj3_p = getObject(); // This works!
    if (obj3_p == NULL) return 1; // malloc can fail, so check before dereferencing.
    printf("obj3.id = %d\n", obj3_p->id);
    obj3_p->id = 30; // This works now.
    printf("obj3.id = %d\n", obj3_p->id);

    free(obj3_p); // You need to do this if you don't want memory leaks.

    return 0;
}

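This is the output when the lines marked "This won't work" are commented out. Note that the second obj2.id line still reads through the dangling pointer returned by getObjectBold(), so that value is not guaranteed: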

obj1.id = 1
obj1.id = 10
obj1.id = 10
obj2.id = 2
obj2.id = 2
obj3.id = 3
obj3.id = 30

I'd recommend you to check out these links, they turned out to be pretty helpful for me:

chux - Reinstate Monica
Dante Culaciati

There are two distinct topics in your question.

struct  Object* obj1_p = .......; 
^^^^^^^^^^^^^^^^^^^^^^   ^^^^^^^^
pointer object           initialization
definition
  1. Pointer variable definition

You can define the pointer variable only one way:

type *objectname;
  2. Initialization assigns a value to the pointer variable. This value should reference a valid object of the type the pointer points to, or valid memory at least as large as that type. The difference in your examples is how the referenced object is created.

Is there a time when you only can use one method?

That depends only on the program logic. You just have to keep the lifetime of the underlying object in mind, so that you never dereference a pointer to an object that no longer exists once its scope has ended:

struct Object *valid1(void)   //valid
{
    struct Object* obj3_p = malloc(sizeof(*obj3_p)); 

    return obj3_p;
}

struct  Object obj1 = { .id = 1 };
struct Object *valid2(void)   // valid
{
    struct Object* obj3_p = &obj1; 

    return obj3_p;
}

struct Object *invalid1(void)   // invalid
{
    struct  Object obj1 = { .id = 1 };
    struct Object* obj3_p = &obj1; 

    return obj3_p;
}

struct Object *invalid2(void)   // invalid
{
    struct Object* obj3_p = &(struct Object){ .id = 1 };


    return obj3_p;
}

Also, as a side note, why do some people cast malloc, and is it better to do it?

It is considered bad practice, because it silences the warning you would otherwise get when there is no declaration of malloc in scope. It is better not to cast. Modern compilers and recent C standards disallow calling functions that have not been declared.

It is better to use sizeof(object) instead of sizeof(type): if you change the type of the object, you would otherwise need to change all of the occurrences of sizeof(type) in your program. It is very easy to miss some and end up with errors that are very hard to discover.
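
A minimal sketch of that point, using a hypothetical helper `make_counters` that is not part of the question. The element type appears only in the declaration, so changing it there is enough:

```c
#include <stdlib.h>

/* Hypothetical helper: allocates n counters and sets them all to zero.
   Returns NULL if the allocation fails. */
long *make_counters(size_t n)
{
    /* sizeof *counters tracks the declared type of counters. Had this
       been written as n * sizeof(int) back when counters was an int *,
       changing the declaration to long * would silently under-allocate
       on platforms where long is wider than int. */
    long *counters = malloc(n * sizeof *counters);
    if (counters == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        counters[i] = 0;
    }
    return counters; /* the caller releases it with free() */
}
```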

0___________

First, let's straighten out some terminology -

You are declaring all three pointers the exact same way:

struct Object* objn_p ...

The only difference is in how you are initializing them.

Declarations in C have two major components - a sequence of declaration specifiers followed by a comma-separated list of optionally initialized declarators. C++ declarations are structured fundamentally the same way, but for this answer I will stick with C terminology (since that's the language we're talking about and the one I'm more familiar with).

Declaration specifiers include storage class specifiers (auto, static, register, typedef, etc.), type specifiers (int, char, double, short, etc.), type qualifiers (const, volatile, restrict), and a few others.

Declarators include the name of the thing being declared, along with information about its pointer-ness, array-ness, and/or function-ness.

Initializers for scalar objects are scalars. Initializers for aggregate objects like arrays, structs, and unions are brace-enclosed lists or, in the case of character arrays, a string literal.

In the declaration

struct Object* obj1_p = &obj1;

the declaration specifier is struct Object, the declarator is * obj1_p, and the initializer is = &obj1.

I know the C++ convention for declaring pointer objects is T* p, but the syntax is actually T (*p) - the * operator is always bound to the declarator, not the type specifier. If you write T* p, q; then only p is declared as a pointer to T; q is declared as an instance of T. I know why the C++ convention exists, I know the justifications for it, but it does misrepresent how declaration syntax works in both C and C++ and I consider it a mistake to use. Most C programmers will use the T *p convention instead [1].
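
A small sketch that lets the compiler confirm this, assuming a C11 compiler (for `_Generic`). The macro and function names are hypothetical and exist only for illustration:

```c
#include <stddef.h>

/* Evaluates to 1 if the expression has type int *, otherwise 0 (C11 _Generic). */
#define IS_INT_POINTER(x) _Generic((x), int *: 1, default: 0)

/* Reports whether the first name in "int* p, q" is a pointer. */
int first_is_pointer(void)
{
    int* p = NULL, q = 0; /* reads like "two pointers", but the * binds to p only */
    (void)q;
    return IS_INT_POINTER(p);
}

/* Reports whether the second name in "int* p, q" is a pointer. */
int second_is_pointer(void)
{
    int* p = NULL, q = 0;
    (void)p;
    return IS_INT_POINTER(q); /* q is a plain int */
}
```

Here `first_is_pointer()` yields 1 and `second_is_pointer()` yields 0, which is why `q` can be initialized with `0` as an ordinary int.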

Here are the basic rules for pointer declarations in C:

Declaration
Specifier      Declarator
-----------    ----------
T              *p;        // p is a pointer to T
T              *a[N];     // a is an array of pointer to T
T              *f();      // f is a function returning a pointer to T
T              (*a)[N];   // a is a pointer to an array of T
T              (*f)();    // f is a pointer to a function returning T
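
To make the forms above concrete, here is a sketch with T taken to be int. All names (`add_one`, `first_element`, `declarator_demo`) are hypothetical and chosen only for this illustration:

```c
/* A function returning int, used as a target for the function pointer below. */
static int add_one(int x) { return x + 1; }

/* A function returning a pointer to int (the "T *f()" form). */
static int *first_element(int *arr) { return &arr[0]; }

/* Uses each declarator form once and returns the sum of the values reached. */
int declarator_demo(void)
{
    int value = 10;
    int row[3] = { 1, 2, 3 };

    int *p = &value;              /* p is a pointer to int */
    int *a[2] = { &value, row };  /* a is an array of 2 pointers to int */
    int (*pa)[3] = &row;          /* pa is a pointer to an array of 3 int */
    int (*pf)(int) = add_one;     /* pf is a pointer to a function returning int */

    /* 10 + 1 + 1 + 3 + 1 */
    return *p + *a[1] + *first_element(row) + (*pa)[2] + pf(0);
}
```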

The rules for const are:

T const *p;  // p points to a const T
const T *p;  // same as above
T * const p; // p is a const pointer to T
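
A short sketch of what each form permits, again with T as int. The function name `const_demo` is hypothetical; the commented-out lines are the ones a conforming compiler rejects:

```c
/* Shows which operations each const placement allows and returns
   the sum of the two values finally pointed to. */
int const_demo(void)
{
    int x = 1, y = 2;

    const int *p = &x;   /* pointer to const int */
    p = &y;              /* OK: the pointer itself can be re-pointed */
    /* *p = 5; */        /* error: cannot modify the int through p */

    int * const q = &x;  /* const pointer to int */
    *q = 5;              /* OK: the int can be modified through q */
    /* q = &y; */        /* error: q itself cannot be re-pointed */

    return *p + *q;      /* y + x, i.e. 2 + 5 */
}
```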

What differs between your three methods is how you initialize the pointer.

Method 1 is just taking the address of a previously-declared variable of the same type:

struct Object *obj1_p = &obj1; // using the C convention for pointer declarations

Method 2 is taking the address of a compound literal - basically, an anonymous variable:

struct Object *obj2_p = &(struct Object){ .id = 1 };

The only difference between obj1 and the anonymous object is that you can refer to obj1 directly as well as through the pointer:

printf( "%d %d %d", obj1.id, obj1_p->id, (*obj1_p).id );

whereas you can only refer to the anonymous object through the pointer variable

printf( "%d %d", obj2_p->id, (*obj2_p).id );

Method 3 dynamically allocates memory and assigns the address of the resulting object (which may be NULL if the malloc call fails).

struct Object *obj3_p = malloc( sizeof( struct Object ) );

The chief difference between this and the other two methods is that the memory allocated by malloc hangs around until you explicitly free it, whether the obj3_p variable goes out of scope or not. If obj1 and the anonymous object are declared within a block and without the static keyword, then that memory is automatically released when the block containing them exits.
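
A sketch of that lifetime difference, using a hypothetical function `heap_outlives_block` (not from the question). The pointer variable that first held the allocation goes out of scope, but the allocation itself remains valid until it is freed:

```c
#include <stdlib.h>

struct Object { int id; };

/* Returns the id read through a heap pointer after the block that
   created the object has ended, or -1 if the allocation fails. */
int heap_outlives_block(void)
{
    struct Object *kept = NULL;

    {
        struct Object *tmp = malloc(sizeof *tmp);
        if (tmp == NULL) {
            return -1;
        }
        tmp->id = 3;
        kept = tmp;
        /* Storing &(struct Object){ .id = 3 } in kept here would NOT be
           safe: that anonymous object's lifetime ends with this block. */
    }

    int id = kept->id; /* still valid: malloc'd memory lives until free() */
    free(kept);
    return id;
}
```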

Also, as a side note, why do some people cast malloc, and is it better to do it?

There are two times when you must cast the result of malloc (and calloc and realloc):

  1. You are compiling the code as C++;
  2. You are working with an ancient, pre-C89 K&R implementation.

Unlike C, C++ does not allow implicit conversion from void * (the type returned from malloc) to other object pointer types. You must explicitly cast when converting from void *. Having said that, if you're writing C++ you should not be calling malloc directly. You should either be using a container that manages memory for you under the hood (std::string, std::vector, std::map, etc.) or you should be using the new or new [] operators. If you're in the middle of a lift-and-shift from C to C++, it's acceptable to keep the malloc calls until you can get around to rewriting your memory management code, but ideally C++ code should never use malloc (or calloc or realloc) directly.

In the earliest versions of C, malloc, calloc, and realloc returned char *, so you had to cast the result if you were assigning it to pointers of different types:

int *p = (int *) malloc( sizeof *p * N );

As someone who wrote K&R C in college, this was a pain in the ass. It was cluttered and a constant source of mistakes. If you changed the type of p (say from int * to long *) you had to repeat that change in multiple places. That created a higher maintenance burden, especially if (as was often the case) the pointer declaration was separated from the malloc call by other code:

int *p = NULL;
...
p = (int *) malloc( sizeof *p * N );

Prior to C99, you had to declare all variables before any statements (in that block, anyway), so it was common for pointer declarations to be separated from the malloc call by multiple lines of code. It was really easy to change the type of *p in the declaration but forget to do it in the assignment later, causing subtle (and sometimes not-so-subtle) runtime errors.

The 1989 standard introduced the void type and changed the *alloc functions to return void *. It also introduced the rule that you could assign void * values to other pointer types and vice versa without an explicit cast. So you could write a malloc call as:

int *p = malloc( sizeof *p * N );

or

int *p = NULL;
...
p = malloc( sizeof *p * N );

If you change the type of *p, you only have to make that change in one place. It's cleaner, it's harder to screw up, etc.

Also, under the C89 standard, casting the result of malloc could suppress a useful compiler diagnostic if you forgot to include stdlib.h or otherwise didn't have a declaration for malloc in scope. But since C99 did away with implicit int declarations [2], that's not really an issue anymore.

However, there are people who prefer to keep the explicit cast for various reasons. Personally, I have found those reasons wanting, and I've accumulated enough scar tissue from bad casts that I prefer to leave them off entirely. 30+ years of writing code has convinced me that when it comes to memory management, simpler is always better. It doesn't hurt to use them; they don't slow down the code or anything like that. But from a readability and maintenance standpoint, casting malloc is bad juju.


  1. Whitespace in declarations is only significant in that it separates tokens. The * character is a token on its own and not part of any identifier, so you can write any of T *p, T* p, T*p, or T * p and they will all be parsed as T (*p).
  2. Prior to C99, if the compiler saw a function call without a preceding declaration, it assumed the function returned int, so if you forgot to include stdlib.h the compiler would complain if you tried to assign the result of malloc to a pointer since an int can't be implicitly converted to a pointer. However, if you used the cast, then the diagnostic would be suppressed.
John Bode