6

Could someone please explain to me the difference between creating a structure with and without malloc? When should malloc be used and when should regular initialization be used?

For example:

struct person {

    char* name;

};

struct person p = {.name="apple"};

struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";

What is really the difference between the two? When would one approach be used over the other?

Amanda
  • `sizeof(struct person)` or `sizeof *p_tr` are valid, but `sizeof(person)` is a syntax error. Best practice would be to use `sizeof *p_tr` – William Pursell Feb 02 '20 at 19:14
  • I suppose the biggest practical difference is that `struct person p = {.name="apple"};` creates a variable that goes out of scope when the enclosing block ends (e.g., if this is in a function, the memory becomes invalid when the function returns). With `struct person *p`, the variable `p` also becomes invalid when the enclosing scope ends, but the data to which it points is still valid. – William Pursell Feb 02 '20 at 19:16
  • @WilliamPursell So if I store the pointer `ptr`, maybe in some queue, and try to retrieve it later, even after the function ends, it will still be there, which is not the case with `person p`? – Amanda Feb 02 '20 at 19:23
  • Use `malloc` when you need more control over the lifetime of the object than the first version will give you – M.M Feb 02 '20 at 19:29
  • @WilliamPursell: The lifetime of an automatic object ends when execution of its associated block ends. This is different from when execution leaves the scope, as execution may leave the scope via subroutines or interrupts/signals. Lifetime is **when** during program execution **object** exists. Scope is **where** in source text an **identifier** is visible. – Eric Postpischil Feb 02 '20 at 19:52

4 Answers

5

Given a data structure like:

struct myStruct {
    int a;
    char *b;
};

struct myStruct p;  // alternative 1
struct myStruct *q = malloc(sizeof(struct myStruct));  // alternative 2
  • Alternative 1: Reserves space the size of a struct myStruct in automatic storage (typically the stack); &p gives you the address of the struct's first byte. If p is declared in a function, its lifetime ends when the function returns, and you can no longer reach it afterwards.

  • Alternative 2: Reserves space the size of a struct myStruct in dynamically allocated memory (typically the heap), plus space for the pointer q, of type struct myStruct *, in automatic storage. malloc returns the address of the allocated struct, and that address is stored in q. The allocated struct's lifetime does not end until you call free(q).

In the latter case, say the allocated struct sits at address 0xabcd0000 and q itself sits at address 0xdddd0000; then the value stored in q is 0xabcd0000, the address that malloc returned.

printf("%p\n", (void *)&p); // prints the address of the automatic struct p (some other address)

printf("%p\n", (void *)q);  // prints "0xabcd0000" (the address of the allocated struct)
printf("%p\n", (void *)&q); // prints "0xdddd0000" (the address of the pointer itself)

Addressing the second part of your question, when to use which:

  • If this struct is in a function and you need to use it after the function exits, you need to malloc it. You can use the value of the struct by returning the pointer, like: return q;.
  • If this struct is temporary and you do not need its value after, you do not need to malloc memory.

Usage with an example:

#include <stdlib.h>

struct myStruct {
    int a;
    char *b;
};

struct myStruct *foo(void) {
    struct myStruct p;
    p.a = 5;
    return &p; // BUG (intentional): p's lifetime ends here; compilers usually warn
}

struct myStruct *bar(void) {
    struct myStruct *q = malloc(sizeof *q);
    if (q != NULL) {
        q->a = 5;
    }
    return q;
}

int main(void) {
    struct myStruct *pMain = foo();
    // p lived in foo's automatic storage, which was released on return.
    // pMain is a dangling pointer: dereferencing it is undefined behavior,
    // and the memory it referred to may be overwritten at any time.

    struct myStruct *qMain = bar();
    // The struct was allocated in bar and q->a was set to 5.
    // If malloc succeeded, it stays valid until you call free.
    free(qMain);
}
ssd
  • It is preferable to answer C questions using the terminology of the C standard. The memory provided by `malloc` is *allocated*. A *heap* is a particular kind of data structure, and implementations of `malloc` do not necessarily use heaps. Hardware stacks are overwhelmingly used in general-purpose C implementations to implement *automatic* storage, but they also are not required by the C standard, and esoteric implementations (e.g., for special-purpose systems with constrained resources) might not use a hardware stack. – Eric Postpischil Feb 02 '20 at 19:49
  • @EricPostpischil: Yes, you may be right. I'm not that expert in compiler internals. – ssd Feb 02 '20 at 19:54
4

If we assume both examples occur inside a function, then in:

struct person p = {.name="apple"};

the C implementation automatically allocates memory for p and releases it when execution of the function ends (or, if the statement is inside a block nested in the function, when execution of that block ends). This is useful when:

  • You are working with objects of modest size. (For big objects, using many kibibytes of memory, malloc may be better. The thresholds vary depending on circumstances.)
  • You are working with a small number of objects at one time.

In:

struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";

the program explicitly requests memory for an object, and the program generally should release that memory with free when it is done with the object. This is useful when:

  • The object must be returned to the caller of the function. An automatic object, as used above, will cease to exist (in the C model of computation; the actual memory in your computer does not stop existing—rather it is merely no longer reserved for use for the object) when execution of the function ends, but this allocated object will continue to exist until the program frees it (or ends execution).
  • The object is very large. (Generally, C implementations provide more memory for allocation by malloc than they do for automatic objects.)
  • The program will create a variable number of such objects, depending on circumstances, such as creating linked lists, trees, or other structures from input whose size is not known before it is read.

Note that struct person p = {.name="apple"}; initializes the name member with "apple" and initializes all other members to zero. However, the code that uses malloc and assigns to p_tr->name does not initialize the other members.

If struct person p = {.name="apple"}; appears outside of a function, then it creates an object with static storage duration. It will exist for the duration of program execution.

Instead of struct person* p_tr = malloc(sizeof(struct person));, it is preferable to use struct person *p_tr = malloc(sizeof *p_tr);. With the former, a change to the type of p_tr requires edits in two places, which gives a human an opportunity to make mistakes. With the latter, changing the type of p_tr in just one place will still result in the correct size being requested.

Eric Postpischil
1
struct person p = {.name="apple"};

^This is Automatic allocation for a variable/instance of type person.

struct person* p_tr = malloc(sizeof(struct person));

^This is dynamic allocation for a variable/instance of type person.

Static memory allocation is fixed at compile time, and automatic allocation happens when execution enters the enclosing block. Dynamic memory allocation happens at run time, when the program executes the line that calls malloc.

FahimAhmed
  • How would I "use" this difference in the static allocation and dynamic allocation? – Amanda Feb 02 '20 at 19:14
  • What difference does it make? – Amanda Feb 02 '20 at 19:14
  • You can head over to the explanation at this link, https://stackoverflow.com/a/8385488/11973277. Basically, when you know how much memory (how many variables or instances) you need over your program's lifetime, it's better to pre-allocate the memory, for several reasons; for example, memory laid out at compile time saves the allocation time that would otherwise be spent at run time. – FahimAhmed Feb 02 '20 at 19:18
  • On the other hand, dynamic memory allocation gives the programmer more control over memory management during the program's lifetime. You allocate a chunk of memory at run time and, once you are done with it, you can free it or keep it, depending on your requirements. So it gives you more control, but with that you also get more responsibility. – FahimAhmed Feb 02 '20 at 19:21
0

Judging by your comments, you are interested in when to use one or the other. Note that every kind of allocation reserves enough memory to hold the value of the variable; the size depends on the variable's type. Statically allocated variables are pinned to a place in memory by the compiler. Automatically allocated variables are given a place on the stack, also arranged by the compiler. Dynamically allocated variables do not exist before the program starts and have no place in memory until they are allocated by 'malloc' or other functions.

All named variables are allocated statically or automatically. Dynamic variables are allocated by the program, but in order to be able to access them, one still needs a named variable, which is a pointer. A pointer is a variable which is big enough to keep an address of another variable. The latter could be allocated dynamically or statically or automatically.

The question is, what to do if your program does not know the number of objects it needs to use during the execution time. For example, what if you read some data from a file and create a dynamic struct, like a list or a tree in your program. You do not know exactly how many members of such a struct you would have. This is the main use for the dynamically allocated variables. You can create as many of them as needed and put all on the list. In the simplest case you only need one named variable which points to the beginning of the list to know about all of the objects on the list.

Another interesting use is when you return a complex struct from a function. If allocated automatically on the stack, it will cease to exist after returning from the function. Dynamically allocated data will be persistent till it is explicitly freed. So, using the dynamic allocation would help here.

There are other uses as well.

In your simple example there is not much difference between the two cases. The second requires additional work at run time: a call to the 'malloc' function to allocate the memory for your struct. In the first case, by contrast, no call is needed: the memory for the struct is set aside in a static region at program start-up if the declaration is at file scope, or automatically on entry to the block if it is inside a function. Note that the pointer in the second case is itself a named variable, allocated statically or automatically; it just holds the address of the memory region for the struct.

Also, as a general rule, dynamically allocated data should eventually be released with the 'free' function. You must not pass static or automatic data to 'free'.

Serge