29

I am not an expert C programmer and I know that including .c source file from another is considered bad practice, but I have a situation where I think it could help maintainability.

I have a big structure with a lot of elements and I use #define to keep the indexes.

#define TOTO_IND 0 
#define TITI_IND 1 
…
#define TATA_IND 50

static const MyElements elems [] = {
    {"TOTO", 18, "French"},
    {"TITI", 27, "English"},
    ...,
    {"TATA", 45, "Spanish"}
}

Since I need to access the structure from index, I need to keep the #define and the structure declaration synchronized. That means that I must insert new elements at the right place and update the #define accordingly.

It is error prone and I don’t really like it (but for performance consideration, I didn’t find a better solution).

Anyway, this file also contains a lot of functions to handle this structure. I also want to keep separation of code and avoid global variables.

To make things “easier”, I was thinking about moving this “error prone definition” to a single .c source file which would only contain this structure. This file would be “the dangerous be careful file” and include it in my actual “normal functional” file.

What do you think about it? Is it a valid situation for including .c source file? Is there another better way of handling my structure?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
ncenerar
  • 1,517
  • 12
  • 25
  • 21
    Even if the file contains a variable definition, you could still name the file with a `.h` ending. – Some programmer dude Oct 11 '18 at 10:23
  • 8
    Maybe better if you name the file to be included as "file.inc" or "file.c.inc"? – pmg Oct 11 '18 at 10:26
  • 12
    Are you aware that every other C file that requires the `#define`stuff, also will get its own copy of the array if this file is included? – Gerhardh Oct 11 '18 at 10:33
  • 1
    Why would this be `static const` instead of `extern const`? In these situations, x-macros can be helpful, as long as your programming team won't hurt you for using them. But it's also likely you are doing something wrong, since having hardcoded defines means you need to use these exact constants at compile time. Post some code where the defines are being used. – vgru Oct 11 '18 at 10:37
  • 7
    What am I missing? Why don't you put the structure in its own .C file and `extern` it where you need to access it? – Mawg says reinstate Monica Oct 11 '18 at 13:52
  • And what does `MyElements ` look like? – Mawg says reinstate Monica Oct 11 '18 at 13:53
  • 2
    Would it be acceptable to declare the data together in one file of some format, then write a small utility (to run on the build system as part of the build process) to read that file and generate both a .c file and a .h file from that which are both marked with AUTOGENERATED - DO NOT TOUCH, EDIT source file name INSTEAD? – Daniel Schepler Oct 12 '18 at 01:05
  • Sounds like a good use case for a "data file" that gets preprocessed into both header and source by some pre-compilation build step. Maybe a nice CSV file and a Python script? – Lightness Races in Orbit Oct 12 '18 at 12:24

5 Answers5

40

You could use designated initializers to initialize the elements of elems[] without having to know the explicit value of each index identifier (or macro).

const MyElements elems[] = {
    [TOTO_IND] = {"TOTO", 18, "French"},
    [TITI_IND] = {"TITI", 27, "English"},
    [TATA_IND] = {"TATA", 45, "Spanish"},
};

The array elements will be initialized the same way, even if you change the order they appear in the source code:

const MyElements elems[] = {
    [TITI_IND] = {"TITI", 27, "English"},
    [TATA_IND] = {"TATA", 45, "Spanish"},
    [TOTO_IND] = {"TOTO", 18, "French"},
};

If the array length is set automatically from the initializer as above (i.e. by using [] rather than [NUM_ELEMS]), then the length will be the one more than the maximum element index.

This allows you to keep the index values and external declaration of the elems array in a .h file, and define the elems array contents in a separate .c file.

Ian Abbott
  • 15,083
  • 19
  • 33
  • This is the correct answer in modern C. I posted an answer with additional advise based on this. – Lundin Oct 11 '18 at 11:10
  • 2
    This still doesn't prevent OP from writing `[TOTO_IND] = {"TITI", 27, "English"}` though. – vgru Oct 11 '18 at 11:43
  • @Konrad: well, OP said the original code is "error prone", so it makes sense to point out this doesn't prevent mistakes at compile time (even after including the static assert by Lunding). – vgru Oct 11 '18 at 13:14
  • 1
    @Groo The original was more error prone than this, I think. Adding comments to each element initializer to indicate which index symbol it is associated with would have helped, but designated initializers are better. – Ian Abbott Oct 11 '18 at 13:50
  • @Groo Well, there's just so much we can do to protect the programmer from themselves. We _could_ take it one step further still and use X macros, but that will turn the whole code unreadable. – Lundin Oct 11 '18 at 14:09
  • 5
    important to note designated initialisers are a C99 feature, will not compile in C95, and cannot be used in C++11 – cat Oct 11 '18 at 15:09
  • @Lundin: I first thought of adding an answer with an x-macro example, but I knew you would probably provide a way which doesn't need it anyway. :-) – vgru Oct 11 '18 at 16:32
  • one trick I do in C++ but probably works in C as well: `const MyElements[] = {}; const (&MyElementsCheck)[50]=MyElements;`, because then this gives you a compile time error if the number of elements isn't the expected value: http://coliru.stacked-crooked.com/a/26e65750d1f9cfd1. This gives provides a backup check that indecies aren't assigned twice and all are assigned. – Mooing Duck Oct 11 '18 at 20:32
  • 2
    To address @Groo's point, OP could use a macro (e.g. `#define decl(idx, n, lang) [idx##_IND] = {#idx, n, lang}`. – Kevin Oct 12 '18 at 00:04
  • I didn't think about this solution but It definitively answer my problem since it remove the need to stay synchronized between the #define declaration and the order of each element. Thank you! – ncenerar Oct 12 '18 at 09:21
33

You should use designated initializers as shown in the answer by Ian Abbot.

Additionally, if the array indices are adjacent as seems to be the case here, you can use an enum instead:

toto.h

typedef enum
{
  TOTO_IND,
  TITI_IND,
  ...
  TATA_IND,
  TOTO_N    // this is not a data item but the number of items in the enum
} toto_t;

toto.c

const MyElements elems [] = {
  [TITI_IND] = {"TITI", 27, "English"},
  [TATA_IND] = {"TATA", 45, "Spanish"},
  [TOTO_IND] = {"TOTO", 18, "French"},
};

And now you can verify the data integrity of the array as whole with a static assert:

_Static_assert(sizeof elems/sizeof *elems == TOTO_N, 
               "Mismatch between toto_t and elems is causing rain in Africa");

_Static_assert(sizeof elems/sizeof *elems == TOTO_N, ERR_MSG);

where ERR_MSG is defined as

#define STR(x) STR2(x)
#define STR2(x) #x
#define ERR_MSG "Mismatching toto_t. Holding on line " STR(__LINE__)
Lundin
  • 195,001
  • 40
  • 254
  • 396
  • Another possible integrity error is that some elements of `elem` might not have been initialized explicitly and so have default initializers. Run-time code should check for elements that haven't been initialized explicitly, since I don't think it can be done with `_Static_assert`. – Ian Abbott Oct 11 '18 at 13:00
  • 7
    @GiacomoAlzetta It was a bad joke - rain in Africa is definitely a blessing. I have updated the question now, with an assert pointing out the offending line. – Lundin Oct 11 '18 at 14:03
  • @IanAbbott They will be set to zero/NULL though, so if the code checks against that it should be ok. You can actually check if the initializer list contains the right number of items, with a somewhat dirty trick: Toss in something like `typedef char BEGIN_CHECK[__LINE__];` just before the array declaration, then a similar one after. Static assert against the difference in size between the two types. But that trick does of course not take white space, comments etc in account. – Lundin Oct 11 '18 at 14:57
  • should probably note that designated inits are a C99 feature and might not be suitable for embedded targets – cat Oct 11 '18 at 15:11
  • 2
    @cat I would have done that 20 years ago, yes. Most embedded systems compilers today support C11. A few of them might still only support C99. Also the C (and C++) tag usage policy is that the current standard (C17) is assumed unless we are explicitly told otherwise by the OP. – Lundin Oct 11 '18 at 15:13
  • A limitation of the designated-initializer approach is that there is no compile-time validation to ensure that all members of the `enum` have associated initializers. The "x-macros" approach described by Groo will ensure that the enum and the array both have the same items in the same sequence, and in some cases could also add additional consistency assurance (e.g. by concatenating supplied names with `_IND` to generate indices, and stringizing them for use as initialization values). – supercat Oct 11 '18 at 21:29
  • As I mentioned in a comment on another answer, if you want safety on the index/string matching, you can use a macro like `#define decl(idx, n, lang) [idx##_IND] = {#idx, n, lang}`. – Kevin Oct 12 '18 at 00:07
  • I liked @IanAbbott answer but you went even farther with the enum idea. Thank you! ;) – ncenerar Oct 12 '18 at 09:38
  • @Kevin: That won't generate a compiler squawk in case of duplicated or missing initialization values. Groo's approach ensures there's a Single Source of Truth by using a single list to create both the enum and the initialization (btw. the same approach can also be extended to generate bit masks as well, when necessary). – supercat Oct 12 '18 at 14:51
  • @supercat Yeah, I do like Groo's answer more (uses a similar macro too), but the macro I gave makes sure the titi/tata/toto matches up and that's all I claimed. – Kevin Oct 12 '18 at 16:12
  • @Kevin: Fair enough. Personally, I find the C preprocessor frustratingly close to the edge of being useful--generally making it possible to do things, but not cleanly. It would be fairly easy to add some extensions to make it much more powerful (e.g. an intrinsic which would accept an integer expression and some kind of format specifier and yield a token with that value in that format) but the Standard has largely quashed innovation. – supercat Oct 12 '18 at 16:31
  • @ncenerar In all fairness I think you should accept Ian's answer and not mine. This is mostly to be regarded as a complement to his answer. – Lundin Oct 12 '18 at 17:07
  • @Lundin I must admit that it was hard to "elect the right answer". Indeed, I would like to give more credit to Ian's answer. And Groo's answer is also very clever and deserve a lot of vote. But from a reader point of view, I think your answer best correspond to the need of the question because it is complete and "easy". I think if one person is looking for an answer, yours should be the one, even if, I agree, in a fairness, Ian deserve more credit. – ncenerar Oct 15 '18 at 10:28
18

Other answers have already covered it in a clearer way, but for completeness sake, here is an x-macros approach, if you are willing to go down this route and risk your coworker's rage.

X-macros are a form of code generation, using the built-in C preprocessor. The goal is to reduce repetition to a minimum, although with some drawbacks:

  1. Source file which generates enums and structures using the preprocessor might seem complex if you are not used to them.
  2. Compared to an external build script which would generate the source files, with x-macros you never get to see how the generated code looks like during compilation, unless you use a compiler setting and check for the preprocessed file manually.
  3. Since you don't see the preprocessed output, you cannot use the debugger to step through the generated code in the same way you would do with code generated by an external script.

You start by creating a list of macro invocations in a separate file, e.g. elements.inc, without defining what the macro actually does at this point:

// elements.inc

// each row passes a set of parameters to the macro,
// although at this point we haven't defined what the
// macro will output

XMACRO(TOTO, 18, French)
XMACRO(TITI, 27, English)
XMACRO(TATA, 45, Spanish)

And then you define the macro every time you need to include this list, so that each invocation gets rendered into a single row of the construct you want to create -- and you usually repeat this several times in a row, i.e.

// concatenate id with "_IND" to create enums, ignore code and description
// (notice how you don't need to use all parameters each time)
// e.g. XMACRO(TOTO, 18, French) => TOTO_IND,
#define XMACRO(id, code, description) id ## _IND,
typedef enum
{
#    include "elements.inc"
     ELEMENTS_COUNT
}
Elements;
#undef XMACRO

// create struct entries
// e.g. XMACRO(TOTO, 18, French) => [TOTO_IND] = { "TOTO", 18, "French" },
#define XMACRO(id, code, description) [id ## _IND] = { #id, code, #description },
const MyElements elems[] = {
{
#    include "elements.inc"
};
#undef XMACRO

Which will get preprocessed into something like:

typedef enum
{
    TOTO_IND,
    TITI_IND,
    TATA_IND,
    ELEMENTS_COUNT
}
Elements;

const MyElements elems[] = {
{
    [TOTO_IND] = { "TOTO", 18, "French" },
    [TITI_IND] = { "TITI", 27, "English" },
    [TATA_IND] = { "TATA", 45, "Spanish" },
};

Obviously, frequent maintenance of the list becomes easier, at the expense of the generating code becoming more convoluted.

vgru
  • 49,838
  • 16
  • 120
  • 201
  • 1
    Or just use TOTO etc for the enum, since you'll probably want those to correspond to 0 to n anyway. I think it is handy to have the index itself inside the x macro. – Lundin Oct 11 '18 at 18:34
  • Very nice solution. A bit to confusing for me... I'll give it a try next time I need to refactor my code ;) – ncenerar Oct 12 '18 at 09:42
  • 1
    @ncenerar: yes, these macros are often not recommended since it's hard to visualize how they are expanded and you cannot debug them, i.e. you cannot step through the code. However, this is basically a form of code generation, using the C preprocessor instead of an external tool. When you have repetitive data like this, you can also configure an external build script which will generate the code for you whenever you change the list, which is much better since you end up with actual "debuggable" code. – vgru Oct 12 '18 at 09:49
  • Nice solution. You probably don't even need designated initializers since everything is put into the correct order by the `#include "elements.inc"`. I'd probably have included the string directly in the 3rd parameter of the macro (replacing `French` with `"French"` and `#description` with `description`) in case the programmer wants to use something other than a string literal for that parameter. – Ian Abbott Oct 12 '18 at 10:30
  • 3
    @Groo: An advantage of using the preprocessor is that it avoids the need for dependencies on any tools other than those required to build C programs, which can in turn facilitate porting to other build systems. – supercat Oct 12 '18 at 14:54
5

Defining the const as static in multiple files is not a good idea because it creates multiple instances of the large variable MyElements. This will increase memory in an embedded system. The static qualifier needs to be removed.

Here is a suggested solution:

in file.h

#define TOTO_IND 0 
#define TITI_IND 1 
…
#define TATA_IND 50
#define MAX_ELEMS 51

extern const MyElements elems[MAX_ELEMS];

in file.c

#include "file.h"
const MyElements elems [MAX_ELEMS] = {
    {"TOTO", 18, "French"},
    {"TITI", 27, "English"},
    ...,
    {"TATA", 45, "Spanish"}
}

After modification, place #include "file.h" in the required .c files.

Joshua Kim
  • 57
  • 4
Rishikesh Raje
  • 8,556
  • 2
  • 16
  • 31
  • 1
    “Contradiction” is not the right word to describe a `static const` desired for use in multiple files. There are some benefits to it. Making the array externally linked instead prevents some optimizations. – Eric Postpischil Oct 11 '18 at 10:52
  • 3
    @EricPostpischil If the array is required in multiple files, making it static will have multiple copies of the array. The OP mentioned that the array had a large size, so this option was disregarded. – Rishikesh Raje Oct 11 '18 at 11:00
  • you are right that `static` is bad here, but the way you phrase that line, sounds as though `static` for variables means the same as `static` function declarations (obviously it doesn't) – cat Oct 11 '18 at 15:10
  • I have modified the text to explain further. – Rishikesh Raje Oct 15 '18 at 04:01
1

To address the specific question about using #include with a .c file (the other answers suggest better options, especially the one by Groo), there is generally no need.

Everything in a .c file can be made externally visible and accessible, so you can reference it via function prototypes and #extern. So for example you could reference your table with #extern const MyElements elems []; in your main .c file.

Alternatively, you could put the definitions in a .h file and include that. That allows you to segregate the code as you want to. Keep in mind that all #include does is insert the contents of the included file where the #include statement is, so it's doesn't have to have any particular file extension. .h is used by convention and most IDEs will automatically add .c files to the list of files to be compiled, but as far as the compiler is concerned the naming is arbitrary.

shogged
  • 281
  • 2
  • 10
  • Compile-time constants like `enum` cannot generally be imported via `extern`. When linking with other languages, such imports may be sometimes kludged via something like ``extern uint32_t addr_is_link_date;` `#define unix_format_link_date ((uint32_t)&addr_is_link_date)`, which may be usable in some contexts that would otherwise require an integer constant [e.g. inclusion within a static const struct] but such cases are rare. – supercat Oct 12 '18 at 18:06