1

I think I'm a bit confused with memory management conventions in C.

Let's say we have a struct that dynamically allocate data on the heap. This struct provides _alloc() and _free() functions for allocating and freeing this struct on the heap.

Here's a example with a simple vector struct:

struct vec_t {
  int *items;
  size_t size;
  size_t capacity;
};

struct vec_t *vec_alloc(void)
{
  struct vec_t *vec = malloc(sizeof(struct vec_t));

  vec->items = NULL;
  vec->size = 0;
  vec->capacity = 0;

  return vec;
}

void vec_free(struct vec_t *vec)
{
  free(vec->items);
  free(vec);
}

void vec_push(struct vec_t *vec, int item)
{
  if (vec->size == vec->capacity)
  {
    size_t new_capacity = vec->capacity > 0 ? (vec->capacity + 1) * 2 : 5;
    vec->items = realloc(vec->items, new_capacity * sizeof(int));
  }
  vec->items[vec->size++] = item;
}

Now let's assume we don't use _alloc() or _free() and instead decide to allocate this struct on the stack.

We then indirectly allocate data on the heap via the stack allocated struct (my_vec.items)

void main()
{
  vec_t my_vec;
  vec_push(&my_vec, 8); // allocates memory

  // vec_destroy(&my_vec); // PROBLEM

  return 0;
}

Now we have a problem: we don't want to free the struct (my_vec) as it's on the stack, but we need to free the data allocated by the struct on the heap (my_vec.items).

I believe this is either a design issue or a convention issue, or a mix of both.

I've seen people add some extra functions _init() and _deinit() in addition to _alloc() and _free().

void vec_init(struct vec_t *vec)
{
  vec->items = NULL;
  vec->size = 0;
  vec->capacity = 0;
}

void vec_deinit(struct vec_t *vec)
{
  free(vec->items);
}

Would it make sense to free memory allocated by the struct in _deinit()?

If this approach is corret, am I correct saying that a struct allocated on the stack like this always need to be _init() and _deinit()?

neeh
  • 2,777
  • 3
  • 24
  • 32
  • 3
    The usual solution is to prohibit stack allocation by making `struct vec_t` an incomplete type. If you want to permit stack allocation, then yes, you'll need some sort of `_init` and `_deinit` function to get things ready and clean things up. – Raymond Chen Dec 08 '20 at 20:58
  • @neeh What are _init and _deinit? – Vlad from Moscow Dec 08 '20 at 20:58
  • `vec->items = realloc(vec, ...)` is wrong. Besides, you should never reassign to the pointer you're passing to `realloc`. – Some programmer dude Dec 08 '20 at 20:59
  • @neeh "I believe this is either a design issue or a convention issue, or a mix of both." - There is neither issue. Your question is unclear. – Vlad from Moscow Dec 08 '20 at 21:00
  • @VladfromMoscow _init() and _deinit() added – neeh Dec 08 '20 at 21:05
  • 1
    You need some form of init function, otherwise your heap example uses an uninitialized `my_vec` object and anything can happen when you try to use it. – 1201ProgramAlarm Dec 08 '20 at 21:05
  • @neeh If you will not free the dynamically allocated memory there will be memory leaks. What is the problem?! – Vlad from Moscow Dec 08 '20 at 21:06
  • @VladfromMoscow In the example, `my_vec.items` is leaking. But I cannot call `vec_detroy` as you can see. – neeh Dec 08 '20 at 21:09
  • @neeh Of course you may not. But again what is the problem? Write a separate function like for example clear that sets your vector in the initial state. – Vlad from Moscow Dec 08 '20 at 21:11
  • 2
    Side-note: `vec_push(my_vec, 8);` is wrong for the stack allocated case; you presumably meant `vec_push(&my_vec, 8);` There [are ways to make it work either way](https://stackoverflow.com/a/47086435/364696) using a typedef of a one-element array (this is how GMP works), but it's frowned upon in many cases; `openssl` eventually switched to opaque (mandatory heap) structures for its `BIGNUM` APIs partially because of that complexity of APIs handling either stack or heap. – ShadowRanger Dec 08 '20 at 21:15
  • `vec_push(&my_vec, 8)` still won't work because the members of `my_vec` aren't initialized at that point. – user3386109 Dec 08 '20 at 21:20
  • @user3386109: True. It's correct *if* it's passed a validly initialized `my_vec`; without the `&`, it's always wrong. – ShadowRanger Dec 08 '20 at 21:39
  • @ShadowRanger My bad, I should have explicitly addressed my comment to the OP. It was intended as a response to edit #4, where OP added the `&` (as you suggested), but left the initialization bug. IMO, OP's difficulty in getting that snippet right is a good example why the API shouldn't allow stack allocation. There are ample precedents for heap-only APIs. The `BIGNUM` API that you mentioned is one. The `FILE *` interface in stdio is another. – user3386109 Dec 08 '20 at 22:06

3 Answers3

2

If you're using _init and _deinit functions, yes, you'd want _deinit to free the memory, and yes, vec_init and vec_deinit would be mandatory for stack allocated structs. For this use case, a stack allocated struct could be initialized with vec_t my_vec = {0}; and a vec_init call avoided, but that assumes zeroing produces a validly initialized struct now and forever (if you change vec_init later to make some fields non-zero, users of your library that didn't use vec_init have to update), and it can be confusing when the unavoidable vec_deinit is not paired with a corresponding vec_init.

Note that code need not be so heavily duplicated; _alloc and _free can be implemented in terms of _init and _deinit, keeping the code duplication to a minimum:

struct vec_t *vec_alloc(void)
{
  struct vec_t *vec = malloc(sizeof(struct vec_t));
  if (vec) vec_init(vec);  // Don't try to init if malloc failed
  return vec;
}

void vec_free(struct vec_t *vec)
{
  if (vec) vec_deinit(vec); // Don't try to deinit when passed NULL
  free(vec);
}
ShadowRanger
  • 143,180
  • 12
  • 188
  • 271
1

My personal approach to this is to assume in the design that the structure can and will live on the stack, and write code that work on an already allocated structure. Quick simplified example:

typedef struct vect_t {
   char *data;
   size_t len;
} vec_t;

void vec_set(vec_t *v, void *data, size_t len) {
    v->data = data;
    v->len = len;
}

void vec_clear(vec_t *v) {
    free(v->data);
    vec_set(v, NULL, 0);
}

int vec_resize(vec_t *v, size_t len) {
    void * data = realloc(v->data, len);
    if (!data) { /* out of memory */
        vec_set(v, NULL, 0);
        return ENOMEM;
    }
    vec_set(v, data, len);
    return 0;
}

int stack_example(void) {
    vec_t v;
    int err;
    vec_set(&v, NULL, 0);
    if ((err = vec_resize(&v, 64)) !=0) {
        return err;
    }
    strcpy(v.data, "Hello World");
    vec_clear(&v);
    return 0;
}

void heap_example(void) {
    vec_t *v = malloc(sizeof(vec_t));
    if (v) {
        int err;
        vec_set(v, NULL, 0);
        if ((err = vec_resize(v, 64)) !=0) {
            return err;
        }
        strcpy(v->data, "Hello World");
        vec_clear(v);
        free(v);
   }
}

The advantage of having the structures on the stack is that you have fewer heap allocations (good for performance and fragmentation), but of course that's at the cost of stack size, which may be your limit depending on the environment you're in.

Enno
  • 1,736
  • 17
  • 32
1

You are mixing two concepts: dynamic memory allocation and initialization of an object.

Taking into account this structure declaration

struct vec_t {
  int *items;
  size_t size;
  size_t capacity;
};

nothing says that an object of this type shall be allocated in the heap.

However the object of the type independent on where it is defined shall be initialized. Otherwise you can get undefined behavior.

The reverse operation of initialization is cleaning.

You could declare an object of the type with the automatic storage duration like

struct vec_t v = { .items = NULL, .size = 0, .capacity = 0 };

However such an approach is not flexible. The user has a direct access to the implementation/ Any changes in the structure definition can make this initialization incorrect.

So it is better to provide a general interface for initialization of an object of the type. You could write for example

void vec_init( struct vec_t *v )
{
    v->items = NULL;
    v->size = 0;
    v->capacity = 0;
}  

and

void vec_clear( struct vec_t *vec )
{
    free( v->items );
    v->size = 0;
    v->capacity = 0;
}

In C opposite to for example C++ if you are allocating an object dynamically its initialization (construction) is not called automatically.

So if you want to provide an interface for the dynamic object allocation you need to write one more function as for example

struct vec_t * vec_create( void )
{
    struct vec_t *v = malloc( sizeof( *v ) );

    if ( v != NULL ) vec_init( v );

    return v;
}

In this case you can provide to the user one more function that frees the dynamically allocated object like

void vec_destroy( struct vec_t **v )
{
    free( *v );
    *v = NULL;
};
Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335