C best practice for using stack memory for incomplete structs

Question

There are times when I want to have a struct which is incomplete (only a single C file knows about its members), so I can define an API for any manipulations and so developers can't easily manipulate it outside the API.

The problem with doing this is that it often means you need a constructor function that allocates the data, and a matching function to free it afterwards (using malloc and free).

In some cases this makes very little sense from a memory management perspective, especially if the struct is small and it's allocated and freed a lot.

So I was wondering what might be a portable way to keep the members local to the C source file, and still use stack allocation.

Of course this is C, if someone wants to mess with the struct internals they can, but I would like it to warn or error if possible.

Example of a simple random number generator (only the new/free methods are included, for brevity).

Header: rnd.h

struct RND;
typedef struct RND RND;

RND  *rnd_new(unsigned int seed);
void  rnd_free(RND *rnd);

Source: rnd.c

#include <stdint.h>
#include <stdlib.h>
#include "rnd.h"

struct RND {
    uint64_t X;
    uint64_t Y;
};

RND *rnd_new(unsigned int seed)
{
    RND *rnd = malloc(sizeof(*rnd));
    if (rnd == NULL) {
        return NULL;
    }
    /* example access */
    rnd->X = seed;
    rnd->Y = 1;
    return rnd;
}

void rnd_free(RND *rnd)
{
    free(rnd);
}

Other source: other.c

#include "rnd.h"
int main(void)
{
    RND *rnd;

    rnd = rnd_new(5);

    /* do something */

    rnd_free(rnd);
}

Possible solutions

I had 2 ideas for how it could be done; both feel like a bit of a kludge.

Declare the size only (in the header)

Add these defines to the header.

Header: rnd.h

#define RND_SIZE      sizeof(uint64_t[2])
#define RND_STACK_VAR(var) char _##var##_stack[RND_SIZE]; RND *var = ((RND *)_##var##_stack)

void rnd_init(RND *rnd, unsigned int seed);

To ensure the sizes are in sync.

Source: rnd.c

#include <stdint.h>
#include "rnd.h"

struct RND {
    uint64_t X;
    uint64_t Y;
};

#define STATIC_ASSERT(expr, msg) \
    extern char STATIC_ASSERTION__##msg[1]; \
    extern char STATIC_ASSERTION__##msg[(expr) ? 1 : 2]

/* ensure header is valid: conflicting array sizes fail to compile */
STATIC_ASSERT(RND_SIZE == sizeof(RND), rnd_size_matches_header);

void rnd_init(RND *rnd, unsigned int seed)
{
    rnd->X = seed;
    rnd->Y = 1;
}

Other source: other.c

#include "rnd.h"

int main(void)
{
    RND_STACK_VAR(rnd);

    rnd_init(rnd, 5);

    /* do something */

    /* stack mem, no need to free */
}

Keeping the size in sync may be a hassle for structs with many members, but for small structs it's not much of a problem.
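
With a C11 compiler, the size check can be written with `_Static_assert` instead of the conflicting-array trick. The following is a minimal single-file sketch, assuming C11; `RND_ALIGN` and `rnd_header_in_sync` are hypothetical additions for illustration and are not part of the header above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* --- would live in rnd.h --- */
typedef struct RND RND;
#define RND_SIZE  sizeof(uint64_t[2])
#define RND_ALIGN alignof(uint64_t)   /* hypothetical: published alignment */

/* --- would live in rnd.c --- */
struct RND {
    uint64_t X;
    uint64_t Y;
};

/* Compilation stops here if the header drifts from the real struct. */
_Static_assert(RND_SIZE == sizeof(struct RND),
               "RND_SIZE is out of sync with struct RND");
_Static_assert(RND_ALIGN == alignof(struct RND),
               "RND_ALIGN is out of sync with struct RND");

/* Hypothetical helper so the same check can be observed at run time. */
int rnd_header_in_sync(void)
{
    return RND_SIZE == sizeof(struct RND) &&
           RND_ALIGN == alignof(struct RND);
}
```

The failure message names the mismatch directly, which tends to be easier to read than an array redeclaration error.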

Conditionally hide the `struct` members (in the header)

This uses GCC's deprecated attribute; if there is a more portable way to do this, it would be welcome.

Header: rnd.h

#ifdef RND_C_FILE
#  define RND_HIDE /* don't hide */
#else
#  define RND_HIDE __attribute__((deprecated))
#endif

struct RND {
    uint64_t X RND_HIDE;
    uint64_t Y RND_HIDE;
};

Source: rnd.c

#define RND_C_FILE
#include "rnd.h"

int main(void)
{
    RND rnd;

    rnd_init(&rnd, 5);

    /* do something */

    /* stack mem, no need to free */
}

This way you can use RND as a regular struct defined on the stack, but you can't access its members without some warning/error. However, it's GCC only.

Another very similar question: http://stackoverflow.com/questions/2672015/hiding-members-in-a-c-struct — harmic, Apr 02 '14 at 05:53
@harmic while the Q is similar, mine is from the point-of-view of memory allocation. — ideasman42, Aug 07 '14 at 23:57

score 4 · Answer 1 · answered Apr 02 '14 at 17:13

You can accomplish this in standard C in a manner similar to your first example, though not without going through a great deal of pain to evade aliasing violations.

For now let's just look at how to define the type. In order to keep it fully opaque we'll need to use a VLA that takes the size from a function at runtime. Unlike the size, alignment can't be done dynamically, so we have to maximally align the type instead. I'm using C11's alignment specifiers from stdalign.h, but you can substitute your favorite compiler's alignment extensions if you want. This allows the type to freely change without breaking ABI just like a typical heap-allocated opaque type.

//opaque.h
#include <stdalign.h>
#include <stddef.h>
size_t sizeof_opaque(void);
#define stack_alloc_opaque(identifier) \
    alignas(alignof(max_align_t)) char (identifier)[sizeof_opaque()]
//opaque.c
struct opaque { ... };
size_t sizeof_opaque(void) { return sizeof(struct opaque); }

Then, to create an instance blackbox of our faux type, the user would use stack_alloc_opaque(blackbox);

Before we can go any further we need to determine how the API is going to be able to interact with this array masquerading as a struct. Presumably we also want our API to accept heap allocated struct opaque*s, but in function calls our stack object decays to a char*. There are a few conceivable options:

1. Force the user to compile with an equivalent of -Wno-incompatible-pointer-types
2. Force the user to manually cast in every call like func((struct opaque*)blackbox);
3. Resort to redefining stack_alloc_opaque() to use a throwaway identifier for the array, and then assign that to a struct opaque pointer within the macro. But now our macro has multiple statements and we're polluting the namespace with an identifier the user doesn't know about.

All of those are pretty undesirable in their own way, and none address the underlying problem that while char* may alias any type, the inverse is not true. Even though our char[] is perfectly aligned and sized for a struct opaque, reinterpreting it as one through a pointer cast is verboten. And we can't use a union to do it, because struct opaque is an incomplete type. Unfortunately that means that the only alias-safe solution is:

4. Have every method in our API accept a char* or typedef to char* rather than struct opaque*. This allows the API to accept both pointer types, while losing all semblance of type safety in the process. To make matters worse, any operations within the API will require memcpying the function's argument into and back out of a local struct opaque.

Which is rather monstrous. Even if we disregard strict aliasing, the only way to maintain the same API for heap and stack objects in this situation is the first item (don't do that).
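
For illustration, the char* plus memcpy approach might look roughly like this. It is a single-file sketch assuming a C11 compiler with VLA support; `opaque_handle`, `opaque_init`, `opaque_bump`, `opaque_demo` and the `counter` member are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <string.h>

/* --- opaque.h (public) --- */
typedef char *opaque_handle;             /* hypothetical handle type */
size_t sizeof_opaque(void);
void   opaque_init(opaque_handle h, int start);
int    opaque_bump(opaque_handle h);
#define stack_alloc_opaque(identifier) \
    alignas(alignof(max_align_t)) char (identifier)[sizeof_opaque()]

/* --- opaque.c (private) --- */
struct opaque { int counter; };          /* hypothetical member */

size_t sizeof_opaque(void) { return sizeof(struct opaque); }

void opaque_init(opaque_handle h, int start)
{
    struct opaque tmp = { .counter = start };
    memcpy(h, &tmp, sizeof tmp);         /* copy out to the byte buffer */
}

int opaque_bump(opaque_handle h)
{
    struct opaque tmp;
    memcpy(&tmp, h, sizeof tmp);         /* copy in: no pointer cast needed */
    tmp.counter += 1;
    memcpy(h, &tmp, sizeof tmp);         /* copy the updated state back */
    return tmp.counter;
}

/* --- user code --- */
int opaque_demo(void)
{
    stack_alloc_opaque(blackbox);        /* VLA sized at run time */
    opaque_init(blackbox, 40);
    opaque_bump(blackbox);
    return opaque_bump(blackbox);
}
```

Every API function pays for two copies and the handle is just a char*, which is the loss of type safety described above.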

On the matter of disregarding the standard, there is one other thing:

5. alloca

It's a bad word, but I'd be remiss not to mention it. Unlike a char VLA, and like malloc, alloca returns a void pointer to untyped space. Since it has roughly the same semantics as malloc, its use doesn't require any of the gymnastics listed above. Heap and stack APIs could happily live side by side, differing only in object (de)allocation. But alloca is nonstandard, the returned objects have a slightly different lifetime than a VLA, and its use is near universally discouraged. That is unfortunate, because it is otherwise well suited to this problem.
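
A rough sketch of the alloca variant, assuming a platform that provides alloca through <alloca.h> (glibc does; other systems declare it elsewhere or not at all). The names `OPAQUE_ALLOCA_VAR`, `opaque_init`, `opaque_value`, `opaque_alloca_demo` and the `value` member are hypothetical.

```c
#include <alloca.h>   /* nonstandard header; location varies by platform */
#include <assert.h>
#include <stddef.h>

/* --- opaque.h (public) --- */
struct opaque;
size_t sizeof_opaque(void);
void   opaque_init(struct opaque *o, int start);
int    opaque_value(const struct opaque *o);

/* "ALLOCA" is part of the name so the stack growth is visible to callers.
 * The storage lives until the calling function returns, not until the
 * enclosing block ends, so avoid using this inside a loop. */
#define OPAQUE_ALLOCA_VAR(name) \
    struct opaque *name = alloca(sizeof_opaque())

/* --- opaque.c (private) --- */
struct opaque { int value; };            /* hypothetical member */

size_t sizeof_opaque(void) { return sizeof(struct opaque); }
void   opaque_init(struct opaque *o, int start) { o->value = start; }
int    opaque_value(const struct opaque *o) { return o->value; }

/* --- user code --- */
int opaque_alloca_demo(void)
{
    OPAQUE_ALLOCA_VAR(blackbox);         /* stack storage, no free needed */
    opaque_init(blackbox, 7);
    return opaque_value(blackbox);
}
```

The same opaque_init and opaque_value functions would work unchanged on a pointer obtained from malloc(sizeof_opaque()).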

As far as I can see, there is only one correct solution (#4), only one clean solution (#5), and no good solution. The way you define the rest of the API depends on which of those you choose.

Using `alloca` indirectly, as part of a macro, is rather risky since it will keep consuming stack memory when used within a loop. IMHO wrapping it is fine, but better to include `alloca` in the macro name, else it can be misleading and hide nasty bugs. — ideasman42, Mar 16 '15 at 04:03

score 1 · Answer 2 · answered Apr 02 '14 at 03:16

In some cases this makes very little sense from a memory management perspective, especially if the struct is small and it's allocated and freed a lot.

I don't see the problem here. In your example, someone probably will only use one RND for the lifetime of their program, or at least, a small number of them.

And if the struct is allocated and freed a lot then it makes no performance difference whether your library does all the allocating and freeing, or whether their code does it.

If you want to permit automatic allocation, then the caller will have to know the size of your struct. There is no way of getting around this. Also, this somewhat defeats the purpose of hiding your implementation, as it means you can't change the size of your struct without breaking the client code.

Further, they will have to allocate memory that is correctly aligned for your struct (i.e. they can't just go char foo[SIZE_OF_RND]; RND *rng = (RND *)foo; because of alignment issues). Your RND_STACK_VAR example ignores this problem.

Perhaps you could publish a SIZE_OF_RND that is the actual size, plus some allowance for alignment. Then your "new" function uses some hacks to find the right alignment location in that memory and returns a pointer.
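
One way that suggestion could look is sketched below, assuming C11. `RND_ALIGN`, `rnd_init_in`, `rnd_peek_x` and `rnd_align_demo` are hypothetical names. The sketch only deals with alignment: writing a struct through a pointer into a declared char array is still an aliasing concern, and rounding an address via uintptr_t assumes a linear pointer-to-integer mapping, which the C standard does not promise.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* --- would live in rnd.h --- */
typedef struct RND RND;
#define RND_ALIGN   8u                        /* hypothetical worst-case alignment */
#define SIZE_OF_RND (16u + RND_ALIGN - 1u)    /* payload plus room to slide forward */
RND     *rnd_init_in(void *buf, unsigned int seed);
uint64_t rnd_peek_x(const RND *rnd);

/* --- would live in rnd.c --- */
struct RND {
    uint64_t X;
    uint64_t Y;
};

_Static_assert(alignof(struct RND) <= RND_ALIGN, "RND_ALIGN too small");
_Static_assert(sizeof(struct RND) + RND_ALIGN - 1u <= SIZE_OF_RND,
               "SIZE_OF_RND too small");

RND *rnd_init_in(void *buf, unsigned int seed)
{
    /* Round the start address up to the next multiple of RND_ALIGN. */
    uintptr_t addr    = (uintptr_t)buf;
    uintptr_t aligned = (addr + (RND_ALIGN - 1u)) & ~(uintptr_t)(RND_ALIGN - 1u);
    RND *rnd = (RND *)((unsigned char *)buf + (aligned - addr));
    rnd->X = seed;
    rnd->Y = 1;
    return rnd;
}

uint64_t rnd_peek_x(const RND *rnd) { return rnd->X; }

/* --- user code --- */
int rnd_align_demo(void)
{
    char storage[SIZE_OF_RND + 1u];
    /* Start one byte in, so the buffer may well be misaligned. */
    RND *rnd = rnd_init_in(storage + 1, 5);
    return ((uintptr_t)rnd % RND_ALIGN == 0) && rnd_peek_x(rnd) == 5;
}
```

The caller has to keep using the returned pointer rather than the buffer address, since the two may differ by up to RND_ALIGN - 1 bytes.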

If it feels kludgey, that's because it is. And there is nothing stopping them just writing the bytes inside the RND anyway. I would just use your first suggestion of RND_new() etc. unless there were a very strong reason why it wasn't suitable.

answered by M.M

In the example `RND` is only used once, but in my question it states it may be allocated and freed many times. Also it will make a difference if stack memory is used instead of heap memory. – ideasman42 Apr 02 '14 at 04:04
@ideasman42 If you forget for a moment about `alloca()`, there's no such thing as variable-size stack allocations. The `int bar = 3; char foo[bar];` hack is a non-standard extension. – Kuba hasn't forgotten Monica Apr 02 '14 at 04:18
@KubaOber No, those are variable-length arrays, standard in C99. – this Apr 02 '14 at 04:18
@Kuba Ober, the size can be known, the issue I want to avoid is direct access to struct members. – ideasman42 Apr 02 '14 at 04:22
@ideasman42 The size can only be known if you expose it explicitly, say by a `Foo_size()` function. The question is: what do you do with such a size? Without a C99 compiler, use of non-standard extensions, or `alloca`, you can't use it for stack allocations. – Kuba hasn't forgotten Monica Apr 02 '14 at 04:29
@Kuba Ober, if the size is known at compile time it's possible to allocate on the stack, however there can be alignment issues as has been pointed out. – ideasman42 Apr 02 '14 at 04:48
@ideasman42 The size *doesn't have to be known* at compile time, and then you use `alloca`, and you don't have alignment issues either, assuming that `alloca` aligns the same as `malloc` (it better!). The compile-time knowledge is in fact counter-productive as it precludes binary compatibility in face of implementation changes. `alloca` is not substantially more expensive than a fixed-size allocation. On most architectures it can be implemented inline in a few machine instructions. – Kuba hasn't forgotten Monica Apr 02 '14 at 05:40
@Kuba Ober, `alloca` **isn't** equivalent to using stack defined vars because it behaves differently when called in a loop: each call uses more stack memory, which can easily exhaust the stack and crash. – ideasman42 Apr 02 '14 at 05:54
@ideasman42 Nothing's free in this world. Either way, you have to call some sort of a destructor function if the object requires destruction, so it's not like you can use automatic objects "mindlessly" as you would in C++. In a nutshell, C should not be used if a platform has a C++ compiler available, that's about the best advice I can give. If the project is to be large and the platform simple, it may be less work to port LLVM/clang to it and use it, instead of dealing with C. BTDT. – Kuba hasn't forgotten Monica Apr 02 '14 at 05:59
Using C++ would not make any difference to this issue – M.M Apr 02 '14 at 05:59
@KubaOber, I don't really get your points - this is a question about C, not C++. And I already showed in my question - 2 possible solutions that don't have to rely on VLA's or `alloca`. – ideasman42 Dec 10 '15 at 06:32

score 0 · Answer 3 · answered Apr 02 '14 at 03:15

A third solution: split your header file into public and private parts; declare the struct in the public part and define it in the private, non-exported one.

Thus external users of your library won't see the exact implementation, while your library internals share a common definition without additional effort.
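
A single-file sketch of that split, with comments marking where each part would live. The file name rnd_private.h and the `rnd_peek` accessor are hypothetical; the rest follows the question's example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* --- rnd.h (public, shipped to users): declaration only --- */
typedef struct RND RND;
RND     *rnd_new(unsigned int seed);
void     rnd_free(RND *rnd);
uint64_t rnd_peek(const RND *rnd);      /* hypothetical accessor */

/* --- rnd_private.h (hypothetical name, not shipped): definition --- */
struct RND {
    uint64_t X;
    uint64_t Y;
};

/* --- rnd.c and other library sources include both headers --- */
RND *rnd_new(unsigned int seed)
{
    RND *rnd = malloc(sizeof *rnd);
    if (rnd != NULL) {
        rnd->X = seed;
        rnd->Y = 1;
    }
    return rnd;
}

void     rnd_free(RND *rnd)       { free(rnd); }
uint64_t rnd_peek(const RND *rnd) { return rnd->X; }

/* --- user code sees only rnd.h, so it can hold pointers only --- */
int rnd_split_demo(void)
{
    RND *rnd = rnd_new(5);
    if (rnd == NULL) {
        return -1;
    }
    int ok = (rnd_peek(rnd) == 5);
    rnd_free(rnd);
    return ok;
}
```

Because user code never sees the definition, it cannot declare an RND by value, so allocation stays with the library.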

answered by user3159253

Then the users cannot automatically allocate a struct (i.e. `struct FOO foo;`). It seems OP really wants to do this and he is not happy with the solution of the caller doing `struct FOO *foo = FOO_new();` – M.M Apr 02 '14 at 03:17
Well, the standard C++ approach for structs of a small predefined size is to use preallocated memory pools (a pool per struct size), so there's almost no reason to prefer allocating data on the stack instead of in a memory pool. Also, if you're tied to GCC, I would use a "cleanup" function that runs when your variable goes out of scope, see http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html and http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization#GCC_extensions_for_C – user3159253 Apr 02 '14 at 03:30
All these steps could be wrapped into a handy macro, which would hide the gory details of variable allocation. The only concern left is speed of heap pool vs stack speed, but only performance tests can tell the difference in your applications. – user3159253 Apr 02 '14 at 03:38
Having a pool for the struct can be OK but it adds extra complexity. If you're writing threaded code, suddenly you have to worry about locking, or write a lock-free pool using atomic ops. Debugging memory errors with a pool also confuses memory checking tools, making their output less useful. While memory pools are great, they can significantly increase complexity. – ideasman42 Apr 30 '15 at 23:42

user3386109 · Answer 4 · 2014-04-02T03:35:25.427

Here's yet another way to do it. Just as in your first solution, it's important to keep the size in sync, or really bad things happen.

main.c

#include <stdio.h>
#include "somestruct.h"

int main( void )
{
    SomeStruct test;

    InitSomeStruct( &test );
    ShowSomeStruct( &test );
}

somestruct.h

#define SOME_STRUCT_SIZE ((sizeof(int) * 2 + sizeof(long double) - 1) / sizeof(long double))

typedef struct
{
    union
    {
        long double opaque[SOME_STRUCT_SIZE];

#ifdef _SOME_STRUCT_SOURCE_
        struct
        {
            int a;
            int b;
        };
#endif

    };
}
    SomeStruct;

void InitSomeStruct( SomeStruct *someStruct );
void ShowSomeStruct( SomeStruct *someStruct );

somestruct.c

#include <stdio.h>
#define _SOME_STRUCT_SOURCE_
#include "somestruct.h"

void InitSomeStruct( SomeStruct *someStruct )
{
    someStruct->a = 55;
    someStruct->b = 99;
}

void ShowSomeStruct( SomeStruct *someStruct )
{
    printf( "a=%d b=%d\n", someStruct->a, someStruct->b );
}
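
The size can be checked at compile time from inside somestruct.c. The sketch below assumes C11 (for `_Static_assert` and anonymous struct/union members) and collapses header and source into one file; `some_struct_in_sync` is a hypothetical helper added only so the result can be observed.

```c
#include <assert.h>

/* somestruct.c defines this before including the header */
#define _SOME_STRUCT_SOURCE_

/* --- somestruct.h --- */
#define SOME_STRUCT_SIZE \
    ((sizeof(int) * 2 + sizeof(long double) - 1) / sizeof(long double))

typedef struct
{
    union
    {
        long double opaque[SOME_STRUCT_SIZE];
#ifdef _SOME_STRUCT_SOURCE_
        struct
        {
            int a;
            int b;
        };
#endif
    };
} SomeStruct;

/* --- somestruct.c --- */
/* If the private members outgrow the placeholder, the union seen here
 * becomes larger than the one callers see, and this stops the build. */
_Static_assert(sizeof(SomeStruct) == sizeof(long double[SOME_STRUCT_SIZE]),
               "private members no longer fit in the opaque placeholder");

/* Hypothetical run-time view of the same condition. */
int some_struct_in_sync(void)
{
    return sizeof(SomeStruct) == sizeof(long double[SOME_STRUCT_SIZE]);
}
```

This catches a forgotten update to SOME_STRUCT_SIZE when the library itself is compiled, before any caller reserves too little stack space.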

That seems to have the same problem with alignment as http://stackoverflow.com/a/22800858/2327831 — this, Apr 02 '14 at 03:30
@self. ok, how bout now? You could, of course, also use a `#pragma align`, but that seems to vary by system. — user3386109, Apr 02 '14 at 03:35

Kuba hasn't forgotten Monica · Answer 5 · 2015-03-16T02:56:13.147

In the design of a C-based API, it makes little sense not to have a default functionality of allocation and initialization bundled together, ready for use - just like in C++. Not offering it as the default means of "getting" an instance of an object makes it all too easy to use uninitialized storage. If one sticks to this rule, there's no need to expose any sizes at all.

Atomic stack allocation and initialization works great for classes that don't require destruction. For such objects, an alloca-based factory function is a viable option in addition to the "default" malloc-based factory function.

The use of destruction-requiring classes is less obvious, since the "instinct" with alloca-allocated variables is not to have to free them. At least if one sticks to the "factory" APIs for both object construction and destruction, it's rather easy to ensure by policy and code checking that the destruction either happens, or the object leaks. An alloca-ted object's memory won't ever leak, but one may forget to destruct it, and its resources (including additional memory!) can certainly leak.

Suppose we have an interface for a 24-bit arithmetic type, written in the style of C Interfaces and Implementations.

#ifndef INT24_INCLUDED
#define INT24_INCLUDED
#define T Int24_T
typedef struct T *T;
extern T Int24_new(void);
extern void Int24_free(T*);
extern void Int24_set_fromint(T, int);
extern void Int24_add(T a, T b);
extern int Int24_toint(T);
...
#undef T
#endif

The Int24_new function returns a new 24-bit integer allocated on the heap, and there's nothing that needs to be done to destruct it when freeing it:

struct T {
  int val:24;
};    

T Int24_new(void) {
  T int24 = malloc(sizeof(struct T));
  int24->val = 0;
  return int24;
}

void Int24_free(T *int24) {
  assert(int24);
  free(*int24);
  *int24 = NULL;
}

We can have an Int24_auto macro that does the same, but allocates on the stack. We can't call alloca() within the function, since the moment the function returns, the result is a dangling pointer: returning from the function "deallocates" the memory. The use of Int24_free on such an object would be an error.

#define Int24_auto() Int24_auto_impl(alloca(sizeof(struct T)))
T Int24_auto_impl(void * addr) {
  T int24 = addr;
  int24->val = 0;
  return int24;
}

The use is straightforward, there's no destruction to be forgotten about, but the API is not consistent: we must not free objects gotten through Int24_auto.

void test(void) {
  Int24_T a = Int24_auto();
  Int24_T b = Int24_auto();
  Int24_set_fromint(a, 1);
  Int24_set_fromint(b, 2);
  Int24_add(a, b);
  assert(Int24_toint(a) == 3);
}
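
One wrinkle: the Int24_auto macro expands sizeof(struct T) in the caller's source file, where the struct is incomplete, so it only compiles where the definition is visible. A possible workaround is to export the size through a function. The sketch below assumes a hypothetical `Int24_alloc_size()` and a platform with <alloca.h>; it also relies on the compiler handling alloca inside an argument list, which GCC and Clang do but which is not guaranteed everywhere.

```c
#include <alloca.h>   /* nonstandard header; location varies by platform */
#include <assert.h>
#include <stddef.h>

/* --- int24.h (public) --- */
typedef struct Int24_T *Int24_T;
extern size_t  Int24_alloc_size(void);          /* hypothetical addition */
extern Int24_T Int24_auto_impl(void *addr);
extern void    Int24_set_fromint(Int24_T, int);
extern int     Int24_toint(Int24_T);
/* The size is now fetched at run time, so no struct definition is needed. */
#define Int24_auto() Int24_auto_impl(alloca(Int24_alloc_size()))

/* --- int24.c (private) --- */
struct Int24_T {
    signed int val:24;
};

size_t Int24_alloc_size(void) { return sizeof(struct Int24_T); }

Int24_T Int24_auto_impl(void *addr)
{
    Int24_T int24 = addr;
    int24->val = 0;
    return int24;
}

void Int24_set_fromint(Int24_T int24, int v) { int24->val = v; }
int  Int24_toint(Int24_T int24)              { return int24->val; }

/* --- user code --- */
int int24_auto_demo(void)
{
    Int24_T a = Int24_auto();   /* lives until this function returns */
    Int24_set_fromint(a, 3);
    return Int24_toint(a);
}
```

The cost is one extra function call per allocation, in exchange for keeping the struct layout out of the public header.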

If we can live with the overhead, it's desirable to add a flag to the implementation that lets the free method destruct the instance without treating it as if it were allocated on the heap.

struct T {
  int val:24;
  unsigned int is_auto:1;
};    

T Int24_new(void) {
  T int24 = malloc(sizeof(struct T));
  int24->val = 0;
  int24->is_auto = 0;
  return int24;
}

#define Int24_auto() Int24_auto_impl(alloca(sizeof(struct T)))
T Int24_auto_impl(void * addr) {
  T int24 = addr;
  int24->val = 0;
  int24->is_auto = 1;
  return int24;
}

void Int24_free(T *int24) {
  assert(int24);
  if (!(*int24)->is_auto) free(*int24);
  *int24 = NULL;
}

This makes the heap- and stack-allocated uses consistent:

void test(void) {
  Int24_T a = Int24_auto();
  ...
  Int24_free(&a);
  a = Int24_new();
  ...
  Int24_free(&a);
}

We can, of course, have an API that returns the size of an opaque type, and exposes the init and release methods that construct an object in-place, and destruct it, respectively. The use of such methods is more verbose, and requires more care. Suppose we have an array type:

#ifndef ARRAY_INCLUDED
#define ARRAY_INCLUDED
#define T Array_T
typedef struct T *T;
extern size_t Array_alloc_size(void);
extern void Array_init(T, int length, int size);
extern void Array_release(T);
...
#undef T
#endif

This allows flexibility in choosing the allocator we want, at the expense of 1 or 2 extra lines of code per object.

void test(void) {
  Array_T a = alloca(Array_alloc_size());
  Array_init(a, 10, sizeof(int));
  ...
  Array_release(a);

  a = malloc(Array_alloc_size());
  Array_init(a, 5, sizeof(void*));
  ...
  Array_release(a);
  free(a);
}

I'd consider such an API to be too error prone, especially since it makes certain kinds of future implementation detail changes rather cumbersome. Suppose that we were to optimize our array by allocating all storage in one go. This would require the alloc_size method to take the same parameters as init. This seems outright stupid when the new and auto factory methods can take care of it in one go, and retain the binary compatibility in spite of implementation changes.
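
To make the awkwardness concrete, here is a sketch of what the "all storage in one go" variant might look like, assuming C11. The struct layout, the `Array_length` accessor and `array_demo` are hypothetical; the point is that the size query and init must be given identical arguments, and nothing checks that they are.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* --- array.h (public) --- */
typedef struct Array_T *Array_T;
extern size_t Array_alloc_size(int length, int size);  /* now needs init's parameters */
extern void   Array_init(Array_T, int length, int size);
extern int    Array_length(Array_T);                   /* hypothetical accessor */

/* --- array.c (private) --- */
struct Array_T {
    int length;
    int size;
    /* Elements stored inline, aligned for any element type. */
    alignas(max_align_t) unsigned char data[];
};

size_t Array_alloc_size(int length, int size)
{
    return sizeof(struct Array_T) + (size_t)length * (size_t)size;
}

void Array_init(Array_T a, int length, int size)
{
    a->length = length;
    a->size = size;
    memset(a->data, 0, (size_t)length * (size_t)size);
}

int Array_length(Array_T a) { return a->length; }

/* --- user code --- */
int array_demo(void)
{
    /* The two calls must agree on (10, sizeof(int)); a mismatch would
     * silently overrun the buffer. */
    Array_T a = malloc(Array_alloc_size(10, sizeof(int)));
    if (a == NULL) {
        return -1;
    }
    Array_init(a, 10, sizeof(int));
    int n = Array_length(a);
    free(a);
    return n;
}
```

A factory function that takes the length and element size once avoids the duplicated arguments entirely.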
