I tried to find the proper way to dynamically allocate memory for a structure that looks like this:

typedef struct myThread {
    unsigned int threadId;
    char threadPriority;
    unsigned int timeSlice;
    sem_t threadSem;
} myThread;

I vaguely remember seeing, in some school paper, that the proper way to allocate memory in this case is:

myThread *node = (myThread *)malloc(sizeof(myThread *));

I tried that and it worked, but I didn't understand why. A pointer on my architecture is 8 bytes, so the statement above allocates 8 bytes of contiguous memory, which is not enough to hold my structure. So I tried allocating 1 byte of memory, like this:

myThread *node = (myThread *)malloc(1);

And it still works. I tried to find an explanation for this behavior but didn't succeed. Why does this work? Besides that, I have a few more questions:

  • Which is the right way to dynamically allocate memory for a structure?
  • Is that cast necessary?
  • How is the structure stored in memory? I know that (*node).threadId is equivalent to node->threadId, and this confuses me a bit, because by dereferencing the pointer to the structure I get the whole structure and then have to access a specific field. I was expecting to access fields from the address of the structure like this: *(node) is the value of the first element, *(node + sizeof(firstElement)) is the value of the second, and so on. I thought that accessing structure fields is similar to accessing array values.

Thank you

Later Edit: Thank you for your answers, but I realized that I didn't explain myself properly. By saying that it works, I mean that I could store values in the fields of the structure and use them later. I tested that by filling in the fields and printing them afterwards. I wonder why this works: why can I fill and work with the fields of a structure for which I allocated just one byte of memory?

  • `myThread *node = (myThread *)malloc(sizeof(myThread *));` must be `myThread *node = (myThread *)malloc(sizeof(myThread));` and the cast is useless so finally `myThread *node = malloc(sizeof(myThread));` – bruno May 03 '20 at 10:13
  • `a.b` is the syntax for accessing field named `b` in a struct object `a` – M.M May 03 '20 at 10:14
  • `T *p = malloc( sizeof *p );` is a good pattern – M.M May 03 '20 at 10:14
  • _Why is this working?_ Undefined behaviour, anything can happen (including "code working") – David Ranieri May 03 '20 at 10:15
  • Either `sizeof(myThread)` or `sizeof(*node)`. The latter is more "robust" in the sense that it allows you to change the type of `*node` without changing this assignment. – goodvibration May 03 '20 at 10:21
  • It's "working" because you are writing to and reading from memory that you have not allocated, and you're lucky that it isn't overwriting something your application needs to operate. `malloc` typically obtains memory from the OS in multiples of the page size and hands out pieces of those pages, so the bytes just past a small allocation are often still mapped; the details depend on your system's malloc implementation. Run your program under `valgrind` and it will complain bitterly that you are using memory that was not allocated and/or initialized. – Geoffrey May 03 '20 at 10:25
  • Andrei I edited my answer – bruno May 03 '20 at 10:40
  • and I still wait for the reason of my DV ... supposing there is a good reason to do ... – bruno May 03 '20 at 10:41
  • Thank you for your answers, @bruno, I tried to upvote your answer but it won't work because I don't have enough credits or something like that – Andrei Deatcu May 03 '20 at 10:50
  • @AndreiDeatcu You don't need any reputation to mark an answer to your own question as the "accepted", most helpful answer. That is the checkmark you see right under the vote total. – aschepler May 03 '20 at 10:55

4 Answers

Both statements below work in the sense that they allocate memory, but they allocate the wrong amount.

myThread *node = (myThread *)malloc(sizeof(myThread *)); // wrong size, should be sizeof(myThread)
myThread *node = (myThread *)malloc(1);                  // wrong size

Why is this working?

When code attempts to save data to that address, the wrong size may or may not become apparent. It is undefined behavior (UB).

C is coding without training wheels. When code has UB, such as using more memory than it allocated, it does not have to fail. It might fail now, later, or next Tuesday.

myThread *node = (myThread *)malloc(1);  // too small
node->timeSlice = 42;  // undefined behavior

Which is the right way to dynamically allocate memory for a structure? (credit: @M.M)

The pattern below is easy to write correctly, review, and maintain.

p = malloc(sizeof *p);  // no cast, no type name involved
// or
number_of_elements = 1;
p = malloc(sizeof *p * number_of_elements);

// Robust code does error checking looking for out-of-memory
if (p == NULL) {
  Handle_error();
}
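
Applied to the structure from the question, the pattern above might look like the sketch below. The helper name `myThread_create` is hypothetical (it is not from the question), and the semaphore is deliberately left untouched: it would still need `sem_init` before use.

```c
#include <semaphore.h> /* POSIX, only needed for the sem_t member */
#include <stdlib.h>

typedef struct myThread {
    unsigned int threadId;
    char threadPriority;
    unsigned int timeSlice;
    sem_t threadSem;
} myThread;

/* Hypothetical helper: allocates one myThread and fills in the plain
   fields. Returns NULL when malloc fails. threadSem is NOT initialised
   here; call sem_init on it before using it. */
myThread *myThread_create(unsigned int id, char priority, unsigned int slice)
{
    /* sizeof *node is the size of the whole struct, not of a pointer */
    myThread *node = malloc(sizeof *node);
    if (node == NULL) {
        return NULL;
    }
    node->threadId = id;
    node->threadPriority = priority;
    node->timeSlice = slice;
    return node;
}
```

The caller checks the result against NULL and eventually passes the pointer to `free`.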

Is that cast necessary?

No. See "Do I cast the result of malloc?"

How is the structure stored in memory?

Members are stored in declaration order, each potentially followed by padding. The amount of padding is implementation dependent.

unsigned int
maybe some padding
char
maybe some padding
unsigned int
maybe some padding
sem_t
maybe some padding
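
To see the actual layout on a given platform, something like the following sketch could be used. It relies only on the standard `offsetof` macro and `sizeof`; the function name `print_myThread_layout` is made up for illustration, and the numbers it prints differ between compilers and architectures.

```c
#include <semaphore.h> /* POSIX, only needed for the sem_t member */
#include <stddef.h>    /* offsetof */
#include <stdio.h>

typedef struct myThread {
    unsigned int threadId;
    char threadPriority;
    unsigned int timeSlice;
    sem_t threadSem;
} myThread;

/* Prints where each member starts and how large it is.
   A gap between one member's end and the next member's offset is padding. */
void print_myThread_layout(void)
{
    printf("threadId:       offset %zu, size %zu\n",
           offsetof(myThread, threadId), sizeof(unsigned int));
    printf("threadPriority: offset %zu, size %zu\n",
           offsetof(myThread, threadPriority), sizeof(char));
    printf("timeSlice:      offset %zu, size %zu\n",
           offsetof(myThread, timeSlice), sizeof(unsigned int));
    printf("threadSem:      offset %zu, size %zu\n",
           offsetof(myThread, threadSem), sizeof(sem_t));
    printf("sizeof(myThread) = %zu\n", sizeof(myThread));
}
```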

I wonder why is this working, why I can fill and work with fields of the structure for which I allocated just one byte of memory.

OP is looking for a reason why it works.

Perhaps memory is allocated in chunks of 64 bytes, or some other size exceeding sizeof *p, so allocating 1 had the same effect as allocating sizeof *p.

Perhaps the neighboring memory, now corrupted by code writing past its scant allocation, will cause a failure later.

Perhaps the allocator is a malevolent beast toying with OP, only to wipe out the hard drive next April 1. (Nefarious code often takes advantage of UB to infect systems, so this is not so far-fetched.)

It's all UB. Anything may happen.

chux - Reinstate Monica

Since memory allocation in C is quite error prone, I always define the function-like macros NEW and NEW_ARRAY as in the example below. This makes memory allocation more safe and succinct.

#include <semaphore.h> /*POSIX*/
#include <stdio.h>
#include <stdlib.h>

#define NEW_ARRAY(ptr, n) \
    do { \
        (ptr) = malloc((sizeof (ptr)[0]) * (n)); \
        if ((ptr) == NULL) { \
            fprintf(stderr, "error: Memory exhausted\n"); \
            exit(EXIT_FAILURE); \
        } \
    } while (0) /* do/while(0) keeps "NEW(p);" one statement, even before an else */

#define NEW(ptr) NEW_ARRAY((ptr), 1)

typedef struct myThread {
    unsigned int threadId;
    char threadPriority;
    unsigned int timeSlice;
    sem_t threadSem;
} myThread;

int main(void)
{
    myThread *node;
    myThread **nodes;
    int nodesLen = 100;

    NEW(node);
    NEW_ARRAY(nodes, nodesLen);
    /*...*/
    free(nodes);
    free(node);
    return 0;
}
August Karlstrom
  • I would not call using macros for allocation "clearly expressed", but I guess it's a matter of preference :) – Tony Tannous May 03 '20 at 12:01
  • @TonyTannous The point is that you don't have to fiddle with the sizeof operator each time you allocate dynamic memory. – August Karlstrom May 03 '20 at 12:33
  • "more safe" --> A corner weakness of this approach shows up when the size is more complex than a plain `n`. Consider `int r; int c; NEW_ARRAY(nodes, r*c);`, which performs the `r*c` multiplication using `int` math. With large `r,c` this may overflow, whereas `sizeof (ptr)[0] * r * c` might not. Using `size_t` for `r,c,n` avoids this, as does re-ordering the macro's multiplication to `sizeof (ptr)[0] * n` or having an explicit `NEW_ARRAY_2D(nodes, r, c);` with `sizeof (ptr)[0] * (r) * (c)`. – chux - Reinstate Monica May 03 '20 at 15:35
  • @chux-ReinstateMonica Thanks for pointing this out. I have now corrected the order of the multiplication in the *malloc* parameter. – August Karlstrom May 03 '20 at 19:41
  • @AugustKarlstrom Er, the order of `(sizeof (ptr)[0]) * (n)` vs. `(n) * (sizeof (ptr)[0])` makes no difference, as the `()` forces the `int` multiplication. The idea of `sizeof (ptr)[0] * n`, without the `()`, is that `NEW_ARRAY(nodes, r*c)` would multiply `sizeof (ptr)[0] * r` first. In general I see no win-win approach in this corner case and thus prefer code over a macro. Yet I do like the ideas behind the macro. – chux - Reinstate Monica May 03 '20 at 21:43
  • @chux-ReinstateMonica You're right. However, in practice signed overflow gives the correct result anyway, e.g. if an int is 4 bytes then 2000000000 * 2 * 1UL equals 4000000000 even though 2000000000 * 2 overflows. – August Karlstrom May 04 '20 at 11:27
  • @AugustKarlstrom The UB of `2000000000 * 2 * 1UL` may produce 4000000000 when `int` and `unsigned long` have the same size, yet something vastly different when `int` is 4 bytes and `unsigned long` is 8, which is quite common these days. So I disagree with "in practice signed overflow gives the correct result anyway". In any case, it is best to do size math with `size_t`. – chux - Reinstate Monica May 04 '20 at 14:57
  • @chux-ReinstateMonica You are right again. Thanks for your patience. We can conclude that the caller will have to make sure that the calculation of the number of elements does not overflow. We could also add assertions that *n* > 0 and *n* <= ((size_t) -1) / sizeof (*ptr*)[0]. – August Karlstrom May 05 '20 at 09:02
  • Note: `n == 0` is OK. – chux - Reinstate Monica May 05 '20 at 10:38
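
The overflow check discussed in these comments could be sketched as a small function rather than a macro. The name `alloc_array` is hypothetical and not part of the answer; it takes the element count as `size_t` and refuses any request whose byte count would wrap around.

```c
#include <stdint.h> /* SIZE_MAX */
#include <stdlib.h>

/* Hypothetical helper: allocates room for count elements of elem_size
   bytes each. Returns NULL if count * elem_size would overflow size_t
   or if malloc itself fails. A count of 0 is passed through to malloc. */
void *alloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return NULL; /* the product would not fit in size_t */
    }
    return malloc(count * elem_size);
}
```

A call such as `nodes = alloc_array(nodesLen, sizeof *nodes);` keeps the "no type name" property of the macro, provided the caller computes the count itself in `size_t` arithmetic.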

You do not allocate the right size by doing

myThread *node = (myThread *)malloc(sizeof(myThread *));

The right way can be, for instance,

myThread *node = (myThread *)malloc(sizeof(myThread));

and the cast is useless, so finally

myThread *node = malloc(sizeof(myThread));

or, as said in the comments on your question,

myThread *node = malloc(sizeof(*node));

The reason is that you allocate a myThread, not a pointer to one, so the size to allocate is the size of myThread.

If you allocate sizeof(myThread *), that means you want a myThread ** rather than a myThread *.

I know that (*node).threadId is equivalent to node->threadId

Yes, -> dereferences the pointer while . does not.

With an object myThread node; you access the field threadId with node.threadId, but with a pointer to one you need to dereference it, one way or the other.
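
As a small illustration of the three spellings, here is a sketch with a made-up function name, `access_forms_agree`, that uses a local object and a pointer to it:

```c
#include <semaphore.h> /* POSIX, only needed for the sem_t member */

typedef struct myThread {
    unsigned int threadId;
    char threadPriority;
    unsigned int timeSlice;
    sem_t threadSem;
} myThread;

/* Returns 1 if '.', '->' and '(*p).' all reach the same members, else 0. */
int access_forms_agree(void)
{
    myThread object;         /* a struct object: members via '.'  */
    myThread *ptr = &object; /* a pointer to it: members via '->' */

    object.threadId = 7u;
    ptr->timeSlice = 42u;    /* same as (*ptr).timeSlice = 42u */

    return ptr->threadId == 7u
        && (*ptr).threadId == object.threadId
        && object.timeSlice == 42u;
}
```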


Later Edit: ...

When you do not allocate enough and then access memory outside the allocated block, the behavior is undefined. That means anything can happen, including nothing visibly bad happening immediately.

bruno

malloc reserves memory for you to use.

When you attempt to use more memory than you requested, several results are possible, including:

  • Your program accesses memory it should not, but nothing breaks.
  • Your program accesses memory it should not, and this damages other data that your program needs, so your program fails.
  • Your program attempts to access memory that is not mapped in its virtual address space, and a trap is caused.
  • Optimization by the compiler transforms your program in an unexpected way, and strange errors occur.

Thus, it would not be surprising either that your program appears to work when you fail to allocate enough memory or that your program breaks when you fail to allocate enough memory.

Which is the right way to dynamically allocate memory for a structure?

Good code is myThread *node = malloc(sizeof *node);.

Is that cast necessary?

No, not in C.

How is the structure stored in memory? I know that (*node).threadId is equivalent to node->threadId, and this confuses me a bit, because by dereferencing the pointer to the structure I get the whole structure and then have to access a specific field. I was expecting to access fields from the address of the structure like this: *(node) is the value of the first element, *(node + sizeof(firstElement)) is the value of the second, and so on. I thought that accessing structure fields is similar to accessing array values.

The structure is stored in memory as a sequence of bytes, as all objects in C are. You do not need to do any byte or pointer calculations because the compiler does it for you. When you write node->timeSlice, for example, the compiler takes the pointer node, adds the offset to the member timeSlice, and uses the result to access the memory where the member timeSlice is stored.
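
To make that offset arithmetic visible, here is a sketch of doing by hand roughly what the compiler does for node->timeSlice. The function names are hypothetical. Note that the arithmetic is done on a byte pointer: `node + 1` on a `myThread *` advances by a whole `sizeof(myThread)`, which is why `*(node + sizeof(firstElement))` from the question does not reach the second member.

```c
#include <semaphore.h> /* POSIX, only needed for the sem_t member */
#include <stddef.h>    /* offsetof */
#include <string.h>    /* memcpy */

typedef struct myThread {
    unsigned int threadId;
    char threadPriority;
    unsigned int timeSlice;
    sem_t threadSem;
} myThread;

/* Reads timeSlice manually: base address in bytes plus the member's offset. */
unsigned int read_timeSlice_by_offset(const myThread *node)
{
    const unsigned char *base = (const unsigned char *)node;
    unsigned int value;
    memcpy(&value, base + offsetof(myThread, timeSlice), sizeof value);
    return value;
}

/* Shows how far "node + 1" moves, measured in bytes. */
size_t bytes_advanced_by_plus_one(const myThread *node)
{
    return (size_t)((const unsigned char *)(node + 1)
                    - (const unsigned char *)node);
}
```

In ordinary code node->timeSlice is the right spelling; the manual version is only meant to show where the member lives.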

Eric Postpischil