4

I'm currently transitioning to working in C, primarily focused on developing large libraries. I'm coming from a decent amount of application based programming in C++, although I can't claim expertise in either language.

What I'm curious about is when and why many popular open source libraries choose to not separate their code in a 1-1 relationship with a .h file and corresponding .c files -- even in instances where the .c isn't generating an executable.

In the past I'd been led to believe that structuring code in this manner is optimal not only in organization, but also for linking purposes -- and I don't see how the lack of OOD features of C would affect this (not to mention not separating the implementation and interface also occurs in C++ libraries).

TolkienWASP
  • From the little experience I have, I would say factory classes that have 1 or 2 functions are usually implemented in the .h file only; see [factory methods](https://sourcemaking.com/design_patterns/factory_method/cpp/1) – Ghooo Sep 01 '15 at 16:15
  • Structuring code in a large library takes a lot of planning - which isn't always an option in open-source projects with changing developer bases. For code that already exists, code refactoring takes a lot of time - and for a sizable project, such as QEMU or gcc, where you have decades of accumulated code, it's nearly impossible to do without breaking something. So I'd say for large libraries - especially open-source ones - it has a lot to do with the difficulty in creating a detailed road map that would support such structuring. – tonysdg Sep 01 '15 at 16:16
  • **off topic: too broad and opinion based on speculation** –  Sep 01 '15 at 16:18
  • I don't really understand the premise... why not 1-2 or 2-1? I think that you should ask authors about the design decisions of particular things. What makes more sense is asking for (possible) technical reasons behind given solutions, but this is not what this question is about... – luk32 Sep 01 '15 at 16:56

4 Answers

4

There is no inherent technical reason in C to provide .c and .h files in matched pairs. There is certainly no reason related to linking, as in conventional C usage, neither .c nor .h files has anything directly to do with that.

It is entirely possible and potentially advantageous to collect declarations related to multiple .c files in a smaller number of .h files. Providing only one or a small number of header files makes it easier to use the associated library: you don't need to remember or look up which header you need for the declaration of each function, type, or variable.

There are at least three consequences that arise from doing that, however:

  • you make it harder to determine where to find the implementations of functions declared in collective headers.
  • you make your project more susceptible to mass rebuilding cascades, as most object files depend on one or more of a small number of headers, and changes or additions to your function signatures all affect one of that small number of headers.
  • the compiler has to spend more effort digesting one large header with all the library's declarations than to digest one or a small number of headers focused narrowly on the specific declarations used in a given .c file, as @ChrisBeck observed. This tends to be much less of a problem for C code than it does for C++ code, however.
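
As a rough sketch of the "collective header" idea above: one header carries the declarations for two source files. The three files are shown in a single listing so it compiles as one translation unit, and every name here (shapes.h, circle.c, rect.c, `circle_area`, `rect_area`) is hypothetical, not taken from any real library.

```c
/* ---- shapes.h: one collective header for two source files ---- */
#ifndef SHAPES_H
#define SHAPES_H
double circle_area(double radius);              /* defined in circle.c */
double rect_area(double width, double height);  /* defined in rect.c */
#endif /* SHAPES_H */

/* ---- circle.c (would start with #include "shapes.h") ---- */
double circle_area(double radius)
{
    return 3.141592653589793 * radius * radius;
}

/* ---- rect.c (would start with #include "shapes.h") ---- */
double rect_area(double width, double height)
{
    return width * height;
}
```

A user of this library includes only shapes.h. The flip side is the one listed above: changing the signature of `rect_area` touches shapes.h, so circle.c gets rebuilt too even though nothing in it changed.
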
John Bollinger
  • Sometimes declarations and implementations of the code are placed in `.h` files; what is the reason for that? – Nasr Sep 01 '15 at 16:59
  • the "mass rebuilding cascades" is a good point that I didn't think to mention – Chris Beck Sep 01 '15 at 17:01
  • Thanks for all the insight John. I can't quite envisage a situation in which a set of functions to be defined by a single .h are modular enough to warrant separate files, and yet not so distinct that a separate header is less preferable. Even if these functions are only needed by the given .h and won't be included elsewhere, I can't find issue with merely stacking the headers for organization, as opposed to saving the time of not doing so. If I'm not mistaken, this should also be easier to maintain. – TolkienWASP Sep 01 '15 at 17:25
  • @Nasr, in C, the conventional purpose of a header file is to provide for sharing declarations among multiple source files. If those source files are ever to contribute to the same program, then no header file used by more than one of them should contain any implementation code. I can imagine only very special -- and questionable -- purposes for putting implementation code in C header files. (C++ is an altogether different matter, however.) – John Bollinger Sep 01 '15 at 18:07
  • @TolkienWASP, fair enough, but if you go much beyond what I wrote in my answer then you end up in the regime of preference, opinion, habit, arbitrary standards, and "whatever we came up with at the time". I don't see much scope for useful discussion there. – John Bollinger Sep 01 '15 at 18:14
  • @JohnBollinger I like that explanation very much. Thanks! – TolkienWASP Sep 01 '15 at 19:16
3

You need a separate .h file only when something is included in more than one compilation unit.

A form of "keep things local unless you have to share" wisdom.
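
A minimal sketch of what "keep things local" could look like in practice, assuming a hypothetical counter.c (the function names and the limit of 100 are made up for illustration):

```c
/* ---- counter.c (hypothetical) ---- */

/* Used only inside this file: static gives it internal linkage,
   so it needs no declaration in any header. */
static int clamp_to_limit(int value, int limit)
{
    return value > limit ? limit : value;
}

/* Called from other compilation units: this is the only declaration
   that would need to go into a shared counter.h. */
int counter_add(int current, int delta)
{
    return clamp_to_limit(current + delta, 100);
}
```

Only `counter_add` is a candidate for a header. If no other compilation unit ever called it, the file would need no header at all.
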

bobah
  • True, but I don't think that's what the question is asking. It seems to be about which declarations go into which header files, and how that relates to regular source files. – John Bollinger Sep 01 '15 at 16:55
1

In the past I'd been led to believe that structuring code in this manner is optimal not only in organization, but also for linking purposes -- and I don't see how the lack of OOD features of C would affect this (not to mention not separating the implementation and interface also occurs in C++ libraries).

In traditional C code, you put declarations in the .h files and definitions in the .c files. This does help compilation: each compilation unit takes the minimum amount of memory, since it contains only the definitions it needs to output code for and, if you manage includes properly, only the declarations it needs. It also makes it simple to see that you aren't breaking the one definition rule.
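
To make the declaration/definition split concrete, here is a small sketch with hypothetical names (`call_count`, `add_and_count`). The repeated declarations stand in for what happens when several files include the same header. Everything is in one listing so it compiles as a single translation unit.

```c
/* Declarations: these may be repeated freely, which is effectively what
   happens when many .c files include the same header. */
extern int call_count;
extern int call_count;
int add_and_count(int a, int b);
int add_and_count(int a, int b);

/* Definitions: exactly one of each in the whole program, so they live
   in a single .c file. */
int call_count = 0;

int add_and_count(int a, int b)
{
    call_count++;
    return a + b;
}
```

Putting the two definitions into a header that more than one .c file includes would give the linker duplicate symbols, which is the situation the header/source split avoids.
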

On modern machines it's less important to do this just to avoid awful build times -- machines now have a lot of memory.

  • In C++ you have templates, which are generally defined only in headers.
  • In recent years people have also been experimenting with so-called "Unity Builds", where one compilation unit includes all of the other source files and you build it all at once. See here: The benefits / disadvantages of unity builds?

So today, having 1-1 correspondence is mainly a style / organizational thing.

Community
Chris Beck
  • Declarations without definitions, such as you will find in C header files, do not incur any requirement to produce corresponding code, and do not swell object files. They therefore have no effect on linking. C++ is different in that C++ header files often do contain definitions that require code to be produced, but the question is about C. – John Bollinger Sep 01 '15 at 16:36
  • @JohnBollinger: "Declarations without definitions, such as you will find in C header files, do not incur any requirement to produce corresponding code, and do not swell object files." They still require memory of the compiler to know about them, while it is generating the code for the compilation unit. If you are compiling a large project on a machine with little memory, including all of the headers in all of the .c files can hurt compilation time significantly. Having separate headers for separate .c files helps you get this under control. – Chris Beck Sep 01 '15 at 16:37
  • "_This is indeed to optimize linking_" is pertinently untrue. Even if the .h file would contain code, it has no impact on _linking_. – Paul Ogilvie Sep 01 '15 at 16:37
  • Thanks, this is a good point, I will edit the answer. – Chris Beck Sep 01 '15 at 16:38
1

A really, really basic, but entirely realistic scenario where a 1-1 relation between .h and .c files is not required, and even not desirable:

main.h
```c
//A lib's/extension/applications' main header file
//for user API -> obfuscated types
typedef struct _internal_str my_type;
//API functions
my_type * init_resource( void );//some arguments will probably be required
//get helper resource -> not part of the API, but the lib uses it internally in all translation units
const struct helper_str *get_help( void );
```

Now this get_help function is, as the comment says, not part of the lib's API. All the .c files that make up the lib are using it, though, and the get_help function is defined in the helper.c translation unit. This file might look something like this:

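```c
#include <stddef.h>        //NULL, used by the placeholder returns below
#include "main.h"
#include <third/party.h>
//static functions: only visible inside helper.c
static
third_party_type *init_external_resource( void )
{
    //implement this
    return NULL;//placeholder so the stub returns a value
}
static
void cleanup_stuff(third_party_type *p)
{
    third_party_free(p);
}
//external function: declared in main.h, used by the other translation units
const struct helper_str *get_help( void )
{
    //implementation of external function
    return NULL;//placeholder so the stub returns a value
}
```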

Ok, so it's a convenience thing: not adding another .h file, because there's only 1 external function you're calling. But that's no good reason not to use a separate header file, right? Agreed. It's not a good reason.

However: Imagine that your code depends on this third party library a lot, and each component of whatever you're building uses a different part of this library. The "help" you need/want from this helper.c file might differ. That's when you could decide to create several header files, to control the way the helper.c file is being used internally in your project. For example: you've got some logging stuff in translation units X and Y. These files might include a file like this:

```c
//specific_help.h
char * err_to_log_msg(int error_nr);//relevant arguments, of course...
```

Whereas a file that doesn't come near output, but, for example, manages thread-safety or signals, might want to call a function in helper.c that frees some resources in case some event was detected (signals, keystrokes, mouse events... whatever). This file might include a header file like:

```c
//system_help.h
void free_helper_resources(int level);
```

All of these headers declare functions defined in helper.c, but you could end up with 10 header files for a single .c file.
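
Put together, the "several headers, one .c file" layout might look roughly like this. The two prototypes are the ones from above; the bodies, the buffer and the `resources_held` counter are placeholders I made up, and the three files are shown in one listing so it compiles as a single translation unit.

```c
#include <stdio.h>

/* ---- specific_help.h: seen only by the logging code ---- */
char *err_to_log_msg(int error_nr);

/* ---- system_help.h: seen only by the signal/event handling code ---- */
void free_helper_resources(int level);

/* ---- helper.c: one source file behind both headers ---- */
static char log_buffer[64];     /* hypothetical storage for the message */
static int resources_held = 3;  /* hypothetical bookkeeping */

char *err_to_log_msg(int error_nr)
{
    snprintf(log_buffer, sizeof log_buffer, "error %d", error_nr);
    return log_buffer;
}

void free_helper_resources(int level)
{
    /* made-up policy: release at most `level` of the held resources */
    if (level > resources_held)
        level = resources_held;
    if (level > 0)
        resources_held -= level;
}
```

A logging file that includes only specific_help.h cannot call `free_helper_resources` without the compiler complaining about a missing declaration, which is the kind of control the separate headers are meant to give you.
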

Once you have these various headers exposing a selection of functions, you might end up adding specific typedefs to each of these headers, depending on how the two components interact... ah well, it's a matter of taste anyway.

Many people will just opt for a single header file to go with the helper.c file, and include that. They'll probably not use half of the functions they have access to, but they'll have fewer files to worry about.
On the other hand, if others start tinkering with their code, they might be tempted to add functions in a certain file that don't belong: they might add logging functions to the signal/event handling files and vice versa.

In the end: use your common sense, don't expose more than you need to. It's easy to remove a static keyword and just add the prototype to a header file if you really need to.
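
A small sketch of "don't expose more than you need to", reusing the opaque my_type from main.h. The struct member, resource.c, `next_id`, `resource_id` and `free_resource` are all hypothetical additions for illustration. Both files are shown in one listing so it compiles as a single translation unit.

```c
#include <stdlib.h>

/* ---- main.h: callers only ever see an incomplete type ---- */
typedef struct _internal_str my_type;   /* tag name as in main.h above */
my_type *init_resource(void);
int resource_id(const my_type *res);    /* hypothetical accessor */
void free_resource(my_type *res);       /* hypothetical */

/* ---- resource.c (hypothetical): the only file that knows the layout ---- */
struct _internal_str {
    int id;
};

/* static: not visible to any other translation unit */
static int next_id(void)
{
    static int last_id = 0;
    return ++last_id;
}

my_type *init_resource(void)
{
    my_type *res = malloc(sizeof *res);
    if (res != NULL)
        res->id = next_id();
    return res;
}

int resource_id(const my_type *res)
{
    return res->id;
}

void free_resource(my_type *res)
{
    free(res);
}
```

If another file later turns out to need `next_id`, dropping the `static` and adding a prototype to a header is a one-line change in each place. Going the other way, hiding something that users already depend on, is much harder.
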

Elias Van Ootegem