How to define an array of structs at compile time composed of static (private) structs from separate modules?

Question

This question is something of a trick C question or a trick clang/gcc question. I'm not sure which.

I phrased it like I did because the final array is in main.c, but the structs that are in the array are defined in C modules.

The end goal of what I am trying to do is to be able to define structs in separate C modules and then have those structs be available in a contiguous array right from program start. I do not want to use any dynamic code to declare the array and put in the elements.

I would like it all done at compile or link time -- not at run time.

I'm looking to end up with a monolithic blob of memory that gets set up right from program start.

For the sake of the Stack Overflow question, I thought it would make sense to imagine these as "drivers" (like in the Linux kernel). Going with that...

Each module is a driver. Because the team is complex, I do not know how many drivers there will ultimately be.

Requirements:

Loaded into contiguous memory (an array)
Loaded into memory at program start
Installed by the compiler/linker, not dynamic code
A driver exists because source code exists for it (no dynamic code to load them up)
Avoid cluttering up the code

Here is a contrived example:

// myapp.h
//////////////////////////

#include <stdint.h>

struct state
{
    int16_t data[10];
};

struct driver
{
    char name[255];
    int16_t (*on_do_stuff) (struct state *state);
    /* other stuff snipped out */
};

// drivera.c
//////////////////////////

#include "myapp.h"

static int16_t _on_do_stuff(struct state *state)
{
    /* do stuff */
}

static const struct driver _driver = {
    .name = "drivera",
    .on_do_stuff = _on_do_stuff
};

// driverb.c
//////////////////////////

#include "myapp.h"

static int16_t _on_do_stuff(struct state *state)
{
    /* do stuff */
}

static const struct driver _driver = {
    .name = "driverb",
    .on_do_stuff = _on_do_stuff
};

// driverc.c
//////////////////////////

#include "myapp.h"

static int16_t _on_do_stuff(struct state *state)
{
    /* do stuff */
}

static const struct driver _driver = {
    .name = "driverc",
    .on_do_stuff = _on_do_stuff
};

// main.c
//////////////////////////

#include <stdio.h>
#include "myapp.h"

static struct driver the_drivers[] = {
    {drivera somehow},
    {driverb somehow},
    {driverc somehow},
    {0}
};

int main(void)
{
    struct state state;
    struct driver *current = the_drivers;

    while (current->on_do_stuff != NULL)   /* stop at the {0} sentinel entry */
    {
        printf("we are up to %s\n", current->name);
        current->on_do_stuff(&state);
        current++;   /* pointer arithmetic already advances by one struct */
    }

    return 0;
}

This doesn't quite work.

Ideas:

On the module-level structs, I could remove the static const keywords, but I'm not sure how to get them into the array at compile time
I could move all of the module-level structs to main.c, but then I would need to remove the static keyword from all of the on_do_stuff functions, and thereby clutter up the namespace.

In the Linux kernel, they somehow define kernel modules in separate files and then, through linker magic, they can be loaded into the monolithic kernel.

Nominal Animal · Accepted Answer · 2018-05-29T03:00:11.057

Use a dedicated ELF section to "collect" the data structures.

For example, define your data structure in info.h as

#ifndef   INFO_H
#define   INFO_H

#ifndef  INFO_ALIGNMENT
#if defined(__LP64__)
#define  INFO_ALIGNMENT  16
#else
#define  INFO_ALIGNMENT  8
#endif
#endif

struct info {
    long  key;
    long  val;
} __attribute__((__aligned__(INFO_ALIGNMENT)));

#define  INFO_NAME(counter)  INFO_CAT(info_, counter)
#define  INFO_CAT(a, b)      INFO_DUMMY() a ## b
#define  INFO_DUMMY()

#define  DEFINE_INFO(data...) \
         static struct info  INFO_NAME(__COUNTER__) \
             __attribute__((__used__, __section__("info"))) \
             = { data }

#endif /* INFO_H */

The INFO_ALIGNMENT macro is the alignment the linker uses to place each symbol, separately, in the info section. It is important that the C compiler agrees; otherwise the section contents cannot be treated as an array. (If the C compiler and the linker disagree on the size of each structure in the section "array", you'll obtain an incorrect number of structures, and only the first one (plus every Nth) will be correct, with the rest of the structures garbled.)
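As a minimal sketch of that requirement, compile-time checks like the following could sit next to the structure definition. It assumes GCC or Clang (for the attribute syntax) and C11 (for _Static_assert); the fixed value 16 and the helper name info_stride() are assumptions made for illustration only, not part of info.h.

```c
#include <stddef.h>

#define INFO_ALIGNMENT 16   /* assumed value; use the one from info.h */

struct info {
    long  key;
    long  val;
} __attribute__((__aligned__(INFO_ALIGNMENT)));

/* Array indexing advances by sizeof (struct info) per element, so the
   section can only be read as an array if that size is a whole multiple
   of the alignment each object is placed with. */
_Static_assert(sizeof (struct info) % INFO_ALIGNMENT == 0,
               "struct info size must be a multiple of INFO_ALIGNMENT");
_Static_assert(_Alignof(struct info) == INFO_ALIGNMENT,
               "struct info must be aligned to INFO_ALIGNMENT");

/* Hypothetical helper: distance in bytes between consecutive elements. */
static inline size_t info_stride(void)
{
    return sizeof (struct info);
}
```

If either assertion fails, the build stops before a garbled "array" can ever be observed at run time.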

Note that you can add preprocessor macros to fine-tune the INFO_ALIGNMENT for each of the architectures you use, but you can also override it for example in your Makefile, at compile time. (For GCC, supply -DINFO_ALIGNMENT=32 for example.)

The used attribute ensures that the definition is emitted in the object file, even though it is not otherwise referenced in the same source file. The section("info") attribute puts the data into a special info section in the object file. The section name (info) is up to you.

Those are the critical parts; otherwise it is completely up to you how you define the macro, or whether you define it at all. Using the macro is easy, because one does not need to worry about using a unique variable name for the structure. Also, if at least one member is specified, all others will be initialized to zero.
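To illustrate those two conveniences in isolation, here is a small sketch, assuming GCC or Clang for __COUNTER__. The names struct pair, UNIQUE_NAME, DEFINE_PAIR, and only_key are hypothetical and not part of info.h, the standard __VA_ARGS__ form is used instead of the GNU named variadic parameter, and the objects stay in the default data section so the sketch does not depend on the linker.

```c
struct pair {
    long  key;
    long  val;
};

/* Two-level expansion, so __COUNTER__ is expanded before token pasting. */
#define UNIQUE_CAT(a, b)   a ## b
#define UNIQUE_NAME(a, b)  UNIQUE_CAT(a, b)

/* Each use defines a differently named object (pair_<n>), so several
   definitions in the same file do not collide. */
#define DEFINE_PAIR(...) \
    static struct pair UNIQUE_NAME(pair_, __COUNTER__) \
        __attribute__((__used__)) = { __VA_ARGS__ }

DEFINE_PAIR(.key = 5);     /* .val is implicitly zero */
DEFINE_PAIR(.val = 42);    /* .key is implicitly zero */

/* Same initializer rule with an ordinary, named object. */
static const struct pair only_key = { .key = 5 };
```

Here only_key.val is zero because members omitted from an initializer list are initialized as if they had static storage duration.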

In the source files, you define the data objects as e.g.

#include "info.h"

/* Suggested, easy way */
DEFINE_INFO(.key = 5, .val = 42);

/* Alternative way, without relying on any macros */
static struct info  foo  __attribute__((__used__, __section__("info"))) = {
    .key = 2,
    .val = 1
};

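The linker provides the symbols __start_info and __stop_info, which mark the start and the end of the info section. (GNU ld generates such __start_NAME and __stop_NAME symbols when the section name is a valid C identifier.) In your main.c, use for example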

#include <stdlib.h>
#include <stdio.h>
#include "info.h"

extern struct info  __start_info[];
extern struct info  __stop_info[];

#define  NUM_INFO  ((size_t)(__stop_info - __start_info))
#define  INFO(i)   ((__start_info) + (i))

so you can enumerate all info structures. For example,

int main(void)
{
    size_t  i;

    printf("There are %zu info structures:\n", NUM_INFO);
    for (i = 0; i < NUM_INFO; i++)
        printf("  %zu. key=%ld, val=%ld\n", i,
            __start_info[i].key, INFO(i)->val);

    return EXIT_SUCCESS;
}

For illustration, I used both direct __start_info[] array access and the INFO() macro. (You can obviously #define SOMENAME __start_info and then use SOMENAME[] as the array instead; just make sure you do not use SOMENAME for anything else in main.c.)
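Applied to the drivers from the question, the same pattern might look like the sketch below. It is not taken from the question or from any real project: DEFINE_DRIVER, the section name drivers, driver_count(), driver_find(), and the two toy callbacks are all assumptions made for illustration, name is a pointer rather than a char array to keep the structure small, and both "modules" are squeezed into one file for brevity. It assumes GCC or Clang targeting ELF with a GNU-compatible linker.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct state {
    int16_t data[10];
};

struct driver {
    const char *name;
    int16_t   (*on_do_stuff)(struct state *state);
} __attribute__((__aligned__(16)));

#define DRIVER_CAT(a, b)   a ## b
#define DRIVER_NAME(a, b)  DRIVER_CAT(a, b)
#define DEFINE_DRIVER(name_, func_) \
    static struct driver DRIVER_NAME(driver_, __COUNTER__) \
        __attribute__((__used__, __section__("drivers"))) = { name_, func_ }

/* Would normally live in drivera.c; the callback stays private. */
static int16_t a_do_stuff(struct state *state) { return state->data[0]; }
DEFINE_DRIVER("drivera", a_do_stuff);

/* Would normally live in driverb.c. */
static int16_t b_do_stuff(struct state *state) { return state->data[1]; }
DEFINE_DRIVER("driverb", b_do_stuff);

/* Would normally live in main.c: the linker-provided section bounds. */
extern struct driver __start_drivers[];
extern struct driver __stop_drivers[];

/* Number of drivers collected into the section. */
static size_t driver_count(void)
{
    return (size_t)(__stop_drivers - __start_drivers);
}

/* Return the driver with the given name, or NULL if there is none. */
static struct driver *driver_find(const char *name)
{
    struct driver *d;

    for (d = __start_drivers; d < __stop_drivers; d++)
        if (!strcmp(d->name, name))
            return d;

    return NULL;
}
```

No sentinel entry is needed, because __stop_drivers marks the end of the array. The order of the drivers within the section depends on the compiler and the link order, so code should not rely on it.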

Let's look at a practical example, an RPN calculator.

We use section ops to define the operations, using facilities defined in ops.h:

#ifndef   OPS_H
#define   OPS_H
#include <stdlib.h>
#include <errno.h>

#ifndef  ALIGN_SECTION
#if defined(__LP64__) || defined(_LP64)
#define  ALIGN_SECTION  __attribute__((__aligned__(16)))
#elif defined(__ILP32__) || defined(_ILP32)
#define  ALIGN_SECTION  __attribute__((__aligned__(8)))
#else
#define  ALIGN_SECTION
#endif
#endif

typedef struct {
    size_t   maxsize;   /* Number of values allocated for */
    size_t   size;      /* Number of values in stack */
    double  *value;     /* Values, oldest first */
} stack;
#define  STACK_INITIALIZER  { 0, 0, NULL }

struct op {
    const char  *name;            /* Operation name */
    const char  *desc;            /* Description */
    int        (*func)(stack *);  /* Implementation */
} ALIGN_SECTION;

#define  OPS_NAME(counter)  OPS_CAT(op_, counter, _struct)
#define  OPS_CAT(a, b, c)   OPS_DUMMY()  a ## b ## c
#define  OPS_DUMMY()

#define  DEFINE_OP(name, func, desc) \
         static struct op  OPS_NAME(__COUNTER__) \
         __attribute__((__used__, __section__("ops"))) = { name, desc, func }

static inline int  stack_has(stack *st, const size_t num)
{
    if (!st)
        return EINVAL;

    if (st->size < num)
        return ENOENT;

    return 0;
}

static inline int  stack_pop(stack *st, double *to)
{
    if (!st)
        return EINVAL;

    if (st->size < 1)
        return ENOENT;

    st->size--;

    if (to)
        *to = st->value[st->size];

    return 0;
}


static inline int  stack_push(stack *st, double val)
{
    if (!st)
        return EINVAL;

    if (st->size >= st->maxsize) {
        const size_t  maxsize = (st->size | 127) + 129;
        double       *value;

        value = realloc(st->value, maxsize * sizeof (double));
        if (!value)
            return ENOMEM;

        st->maxsize = maxsize;
        st->value   = value;
    }

    st->value[st->size++] = val;

    return 0;
}

#endif /* OPS_H */

The basic set of operations is defined in ops-basic.c:

#include "ops.h"

static int do_neg(stack *st)
{
    double  temp;
    int     retval;

    retval = stack_pop(st, &temp);
    if (retval)
        return retval;

    return stack_push(st, -temp);
}

static int do_add(stack *st)
{
    int  retval;

    retval = stack_has(st, 2);
    if (retval)
        return retval;

    st->value[st->size - 2] = st->value[st->size - 1] + st->value[st->size - 2];
    st->size--;

    return 0;
}

static int do_sub(stack *st)
{
    int  retval;

    retval = stack_has(st, 2);
    if (retval)
        return retval;

    st->value[st->size - 2] = st->value[st->size - 1] - st->value[st->size - 2];
    st->size--;

    return 0;
}

static int do_mul(stack *st)
{
    int  retval;

    retval = stack_has(st, 2);
    if (retval)
        return retval;

    st->value[st->size - 2] = st->value[st->size - 1] * st->value[st->size - 2];
    st->size--;

    return 0;
}

static int do_div(stack *st)
{
    int  retval;

    retval = stack_has(st, 2);
    if (retval)
        return retval;

    st->value[st->size - 2] = st->value[st->size - 1] / st->value[st->size - 2];
    st->size--;

    return 0;
}

DEFINE_OP("neg", do_neg, "Negate current operand");
DEFINE_OP("add", do_add, "Add current and previous operands");
DEFINE_OP("sub", do_sub, "Subtract previous operand from current one");
DEFINE_OP("mul", do_mul, "Multiply previous and current operands");
DEFINE_OP("div", do_div, "Divide current operand by the previous operand");

The calculator expects each value and operator to be a separate command-line argument for simplicity. Our main.c contains operation lookup, basic usage, value parsing, and printing the result (or error):

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
#include "ops.h"

extern struct op __start_ops[];
extern struct op  __stop_ops[];

#define  NUM_OPS  ((size_t)(__stop_ops - __start_ops))


static int  do_op(stack *st, const char *opname)
{
    struct op  *curr_op;

    if (!st || !opname)
        return EINVAL;

    for (curr_op = __start_ops; curr_op < __stop_ops; curr_op++)
        if (!strcmp(opname, curr_op->name))
            break;

    if (curr_op >= __stop_ops)
        return ENOTSUP;

    return curr_op->func(st);
}


static int  usage(const char *argv0)
{
    struct op  *curr_op;

    fprintf(stderr, "\n");
    fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv0);
    fprintf(stderr, "       %s RPN-EXPRESSION\n", argv0);
    fprintf(stderr, "\n");
    fprintf(stderr, "Where RPN-EXPRESSION is an expression using reverse\n");
    fprintf(stderr, "Polish notation, and each argument is a separate value\n");
    fprintf(stderr, "or operator. The following operators are supported:\n");

    for (curr_op = __start_ops; curr_op < __stop_ops; curr_op++)
        fprintf(stderr, "\t%-14s  %s\n", curr_op->name, curr_op->desc);

    fprintf(stderr, "\n");

    return EXIT_SUCCESS;
}


int main(int argc, char *argv[])
{
    stack  all = STACK_INITIALIZER;
    double val;
    size_t i;
    int    arg, err;
    char   dummy;

    if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help"))
        return usage(argv[0]);

    for (arg = 1; arg < argc; arg++)
        if (sscanf(argv[arg], " %lf %c", &val, &dummy) == 1) {
            err = stack_push(&all, val);
            if (err) {
                fprintf(stderr, "Cannot push %s to stack: %s.\n", argv[arg], strerror(err));
                return EXIT_FAILURE;
            }
        } else {
            err = do_op(&all, argv[arg]);
            if (err == ENOTSUP) {
                fprintf(stderr, "%s: Operation not supported.\n", argv[arg]);
                return EXIT_FAILURE;
            } else
            if (err) {
                fprintf(stderr, "%s: Cannot perform operation: %s.\n", argv[arg], strerror(err));
                return EXIT_FAILURE;
            }
        }

    if (all.size < 1) {
        fprintf(stderr, "No result.\n");
        return EXIT_FAILURE;
    } else
    if (all.size > 1) {
        fprintf(stderr, "Multiple results:\n");
        for (i = 0; i < all.size; i++)
            fprintf(stderr, "  %.9f\n", all.value[i]);
        return EXIT_FAILURE;
    }

    printf("%.9f\n", all.value[0]);
    return EXIT_SUCCESS;
}

Note that if there were many operations, constructing a hash table to speed up the operation lookup would make a lot of sense.
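One way such a table might be built is sketched below, using open addressing with linear probing and the 32-bit FNV-1a string hash. This is an illustration rather than part of the calculator: struct op_table, op_hash(), op_table_build(), op_table_find(), and OP_TABLE_SLOTS are hypothetical names, struct op is a reduced stand-alone copy of the one in ops.h, and the functions take the array bounds as parameters (in main.c one would pass __start_ops and __stop_ops).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reduced copy of struct op from ops.h, so the sketch stands alone. */
typedef struct stack stack;
struct op {
    const char  *name;
    const char  *desc;
    int        (*func)(stack *);
};

#define OP_TABLE_SLOTS 64   /* power of two; must exceed the number of ops */

struct op_table {
    const struct op *slot[OP_TABLE_SLOTS];
};

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t op_hash(const char *s)
{
    uint32_t h = 2166136261u;

    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Fill the table from [begin, end). Returns 0, or -1 if there are too many ops. */
static int op_table_build(struct op_table *t,
                          const struct op *begin, const struct op *end)
{
    const struct op *op;
    size_t           k;

    if ((size_t)(end - begin) >= OP_TABLE_SLOTS)
        return -1;

    for (k = 0; k < OP_TABLE_SLOTS; k++)
        t->slot[k] = NULL;

    for (op = begin; op < end; op++) {
        uint32_t i = op_hash(op->name) & (OP_TABLE_SLOTS - 1);
        while (t->slot[i])                      /* linear probing */
            i = (i + 1) & (OP_TABLE_SLOTS - 1);
        t->slot[i] = op;
    }
    return 0;
}

/* Return the op with the given name, or NULL if there is none. */
static const struct op *op_table_find(const struct op_table *t, const char *name)
{
    uint32_t i = op_hash(name) & (OP_TABLE_SLOTS - 1);

    while (t->slot[i]) {
        if (!strcmp(t->slot[i]->name, name))
            return t->slot[i];
        i = (i + 1) & (OP_TABLE_SLOTS - 1);
    }
    return NULL;
}
```

Because the table always keeps at least one empty slot, both probe loops terminate. The table would be built once at the start of main(), and do_op() would then call op_table_find() instead of scanning the section linearly.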

Finally, we need a Makefile to tie it all together:

CC      := gcc
CFLAGS  := -Wall -O2 -std=c99
LDFLAGS := -lm
OPS     := $(wildcard ops-*.c)
OPSOBJS := $(OPS:%.c=%.o)
PROGS   := rpncalc

.PHONY: all clean

all: clean $(PROGS)

clean:
        rm -f *.o $(PROGS)

%.o: %.c
        $(CC) $(CFLAGS) -c $^

rpncalc: main.o $(OPSOBJS)
        $(CC) $(CFLAGS) $^ $(LDFLAGS) -o $@

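Because this forum does not preserve tabs, and make requires them for indentation, you probably need to fix the indentation after copy-pasting the above. I use sed -e 's|^  *|\t|' -i Makefile (note the two spaces before the asterisk, so that only lines that actually start with spaces are changed).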

If you compile (make clean all) and run (./rpncalc) the above, you'll see the usage information:

Usage: ./rpncalc [ -h | --help ]
       ./rpncalc RPN-EXPRESSION

Where RPN-EXPRESSION is an expression using reverse
Polish notation, and each argument is a separate value
or operator. The following operators are supported:
        div             Divide current operand by the previous operand
        mul             Multiply previous and current operands
        sub             Subtract previous operand from current one
        add             Add current and previous operands
        neg             Negate current operand

and if you run e.g. ./rpncalc 3.0 4.0 5.0 sub mul neg, you get the result -3.000000000.

Now, let's add some new operations, ops-sqrt.c:

#include <math.h>
#include "ops.h"

static int do_sqrt(stack *st)
{
    double  temp;
    int     retval;

    retval = stack_pop(st, &temp);
    if (retval)
        return retval;

    return stack_push(st, sqrt(temp));
}

DEFINE_OP("sqrt", do_sqrt, "Take the square root of the current operand");

Because the Makefile above compiles all C source files beginning with ops- into the final binary, the only thing you need to do is recompile the source: make clean all. Running ./rpncalc now outputs

Usage: ./rpncalc [ -h | --help ]
       ./rpncalc RPN-EXPRESSION

Where RPN-EXPRESSION is an expression using reverse
Polish notation, and each argument is a separate value
or operator. The following operators are supported:
        sqrt            Take the square root of the current operand
        div             Divide current operand by the previous operand
        mul             Multiply previous and current operands
        sub             Subtract previous operand from current one
        add             Add current and previous operands
        neg             Negate current operand

and you have the new sqrt operator available.

Testing e.g. ./rpncalc 1 1 1 1 add add add sqrt yields 2.000000000, as expected.

@010110110101: I like to do this, uh, beast-modey answering thing, to questions that I believe others will encounter/have encountered, too, and that I think I can answer in a way that will give them enough of a boost to build something very interesting out of. (I often learn a lot, myself too, when writing the test cases; I don't like to ask others to just trust my word, I want them to be able to test and verify.) I try to do it in a way that helps them build that something that is both robust and portable. So, not a gauntlet per se.. more like very tough rope. :) — Nominal Animal, May 27 '18 at 23:06
@010110110101: Thanks! I do hope you and your team can use this to build some useful and interesting tooling, not just saving time and effort, but opening up minds for new and innovative solutions. *That* is what makes Linux so interesting, in my opinion. — Nominal Animal, May 27 '18 at 23:20
