
I am a bit confused about header files.

My understanding of header files in C, such as #include <windows.h>, is that only the necessary parts are included based upon whatever functions are used in the program. For example, if only the MessageBox() function was implemented in the source code, then just the necessary parts would be included from the header file.

However, I have stumbled upon WindowsHModular on GitHub here which claims to allow the programmer to only include what is required, as the GitHub author has split Windows.h into various modules.

It seems like quite a contradiction, so I was hoping someone could help me get my facts straight.

securityauditor
  • @pmacfarlane What if I included another header, such as <stdio.h>, but never used any functions like printf() in my source file? – securityauditor Feb 20 '23 at 22:41
  • This **must** be a duplicate question... Header files _declare_ tokens and function prototypes so the compiler knows certain values and can report incorrect usage. The subsequent **linking** is where code is drawn from libraries and 'linked' together with your program. C does not _import_ code from header files. – Fe2O3 Feb 20 '23 at 22:46
  • Including a header file is nearly like replacing the `#include` directive with the full text of the included file. see https://www.cprogramming.com/reference/preprocessor/include.html or https://en.wikibooks.org/wiki/C_Programming/Preprocessor_directives_and_macros##include – Bodo Feb 20 '23 at 22:47
  • @Fe2O3 What is a token? – securityauditor Feb 20 '23 at 22:48
  • `MessageBox()` isn't *implemented* in a header file. The header file contains a declaration only, so that doesn't generate any code. It's down to the linker to populate the import tables for every import used. – IInspectable Feb 20 '23 at 22:49
  • "_What is a token?_" demonstrates a need for you to sit down with a good book on the language and begin to learn for yourself. – Fe2O3 Feb 20 '23 at 22:50
  • @Fe2O3 Tokens were not covered in Head First C nor The C Programming Language 2nd edition... By the looks of things, a token could be any of the following:  keyword  identifier  constant  string-literal  operator  punctuator – securityauditor Feb 20 '23 at 22:52
  • @IInspectable Do you have a link on a brief explanation of how import tables for imports work? – securityauditor Feb 20 '23 at 22:52
  • [Index to the series on DLL imports and exports](https://devblogs.microsoft.com/oldnewthing/20060727-04/?p=30333). – IInspectable Feb 20 '23 at 22:55
  • K&R C Programming Language 1st edition... Chapter 12.1... "**Token Replacement**" – Fe2O3 Feb 20 '23 at 22:57
  • @Fe2O3 Okay, then I stand corrected, sorry. It slipped my mind, the term "token" is not something that I come across daily when programming, it is handled behind the scenes. I have not come across it in any documentation when using C standard library stuff, nor Windows API stuff. EDIT I read the 2nd edition of K&R. – securityauditor Feb 20 '23 at 22:59

2 Answers


No.

When you include a file, the line

#include <file.h> // or "file.h"

is replaced by the contents of file.h.

Example: https://godbolt.org/z/a3WEP6hdP

Header files are not libraries or object files. At link time, the linker pulls in only the functions that are actually used (more precisely, everything in the sections that contain a used function).

But would my final executable file size be smaller if I used WindowsHModular? Surely including <windows.h> with all of its hundreds, or maybe thousands, of lines is going to bloat the executable?

A properly written .h file does not define any data or functions. It should contain only macro definitions, data type declarations, extern object declarations, and function prototypes. A .h file can be hundreds of thousands of lines long, yet it will not add anything to the executable.
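As a minimal sketch of the difference between declaring and defining, the block below puts both kinds of line in one file. All names (`COUNTER_STEP`, `counter_value`, `counter_bump`) are hypothetical and chosen only for illustration. The first group is what would normally live in a header; the second group is what would live in exactly one .c file.

```c
#include <assert.h>

/* --- What would normally sit in a header: declarations only (hypothetical names) --- */
#define COUNTER_STEP 2        /* macro definition: text substitution, no storage */
extern int counter_value;     /* extern declaration: the object exists somewhere */
int counter_bump(void);       /* prototype: how the function is called */

/* --- What would sit in exactly one .c file: definitions --- */
int counter_value = 0;        /* definition: reserves storage in the program */

int counter_bump(void)        /* definition: becomes machine code */
{
    assert(COUNTER_STEP > 0); /* sanity check on the macro used below */
    counter_value += COUNTER_STEP;
    return counter_value;
}
```

Only the second group occupies space in the object file. If the definitions were moved into a header that several .c files include, each of those files would get its own copy, which typically ends in a duplicate-symbol error at link time.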

0___________
  • So you are saying, even though #include puts everything from file.h into my test.c file, the linker will remove the unnecessary lines added by file.h into test.c? – securityauditor Feb 20 '23 at 23:15
  • Those lines do not add any code to your executable. – 0___________ Feb 20 '23 at 23:26
  • You said #include <windows.h>, which I have used as an example, will get replaced with thousands of lines from within the windows.h file, correct? Surely this will increase the size of the executable? – securityauditor Feb 20 '23 at 23:29
  • @securityauditor "surely" it will not. – 0___________ Feb 20 '23 at 23:30
  • @securityauditor: It adds **nothing** to your executable file. The size of the executable is determined by the linker, which will only include code that is actually used in your application. The header file can declare thousands of functions, but if your code does not actually call those functions, they will not be linked into your application. – Ken White Feb 21 '23 at 01:10
  • @KenWhite That is not 100% correct. `ld` can discard at the section level, not the function level. If the functions in libraries and object files are each placed in their own section, it works as you describe. If not, using a single function from a section is enough for the whole section to be linked in. So when compiling, always place functions in their own sections. – 0___________ Feb 21 '23 at 10:05
  • *"Sections"* aren't a thing in the C language specification. It's `ld`'s concession to the revelation that the attempt to scale infinity (*"every symbol in the open set is public by default"*) doesn't scale. @KenWhite's comment is accurate from the perspective of a client using a linker that's built on sane principles (like MSVC's `link.exe`). – IInspectable Feb 21 '23 at 16:28

My understanding of header files in C, such as #include <windows.h>, is that only the necessary parts are included based upon whatever functions are used in the program.

I guess it depends on what you mean by "included". Possibly you are confusing header inclusion with linking, which is a completely separate stage later in the compilation process.

At a high level, #include directives are simple. They direct the compiler to read source code from another file, as if it appeared in the place of the directive. That's it. There is no inherent picking and choosing of different pieces. Consider: how would the compiler know what pieces you need before it processes the rest of the source file?
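One way to see that the whole header is read, rather than a hand-picked subset, is to check for things the header provides that the program never asked for. The sketch below includes <stdio.h> without calling any stdio function and tests that the standard macros `BUFSIZ` and `EOF` are visible anyway. The function name `stdio_macros_visible` is made up for this example.

```c
#include <assert.h>
#include <stdio.h>   /* read in full, although no stdio function is called below */

/* BUFSIZ and EOF are macros that <stdio.h> defines. They are visible here
   only because the entire header text was processed at the #include line. */
int stdio_macros_visible(void)
{
#if defined(BUFSIZ) && defined(EOF)
    assert(EOF < 0);   /* the C standard requires EOF to be negative */
    return 1;
#else
    return 0;
#endif
}
```

With GCC or Clang, compiling with the `-E` option stops after preprocessing and prints the resulting text, which shows the full contents of the header sitting where the `#include` line used to be.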

There's no fundamental difference between header files and "regular" source files, but it has become conventional wisdom and very strong custom that only certain kinds of code will be put into headers, primarily:

  • function declarations
  • macro definitions
  • struct, union, and enum type definitions
  • external variable declarations
  • typedef definitions

These are mostly things that need to be declared identically in multiple independent source files, and putting them in header files both facilitates that and makes maintenance a lot easier when one of them needs to change.

These are also things that do not affect the program if they go unused. For example, the resulting program is not larger or more complex if the source declares functions that it never calls, whether by #includeing a header or by declaring them directly.
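A small sketch of that point, using invented names: the three `never_defined_*` functions below are declared but have no body anywhere and are never called. The file still compiles and links, because a declaration that goes unused leaves no trace in the object file for the linker to resolve.

```c
#include <assert.h>

/* Declarations only. None of these hypothetical functions exists anywhere,
   and none is called, so they produce no code and no linker references. */
int    never_defined_a(int x);
void   never_defined_b(const char *s);
double never_defined_c(double a, double b);

/* The only function with a body, so the only one that becomes code. */
int add(int a, int b)
{
    return a + b;
}

/* Small self-check that exercises the one real function. */
void self_check(void)
{
    assert(add(2, 3) == 5);
}
```

Adding a call such as `never_defined_a(1)` would change the picture: the file would still compile, but linking would fail with an unresolved symbol, since only then does the program actually need the function.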

However, I have stumbled upon WindowsHModular on GitHub here which claims to allow the programmer to only include what is required, as the GitHub author has split Windows.h into various modules.

The problem with Windows.h is that it is huge and complex. Although that doesn't make a difference to compiled programs, it does make the compiler expend a fair amount of effort. The purpose of splitting Windows.h into separate modules is to reduce the CPU time and memory required to compile programs by allowing you to omit a bunch of declarations that you didn't need anyway.

You're probably better off ignoring WindowsHModular at this point.

John Bollinger
  • But would my final executable file size be smaller if I used WindowsHModular? Surely including <windows.h> with all of its hundreds, or maybe thousands, of lines is going to bloat the executable? – securityauditor Feb 20 '23 at 23:23
  • No, @securityauditor, as I already wrote. The more like tens of thousands of lines of `Windows.h` and the other headers it itself includes do not contribute one whit to executable size. There is nothing executable within, only declarations. Declarations instruct the compiler how to interpret other code, but they do not have runtime behavior of their own. – John Bollinger Feb 20 '23 at 23:28
  • @securityauditor Declarations don't generate any code. They merely introduce symbols for the compiler. And definitions get stripped out by the linker if they aren't being used. With MSVC's linker defaulting to having symbols private by default, there's a lot of purging opportunity. – IInspectable Feb 20 '23 at 23:29
  • *`it does make the compiler expend a fair amount of effort.`* does anyone care nowadays? – 0___________ Feb 20 '23 at 23:32
  • Beats me, @0___________. Clearly the author of WindowsHModular did at one time, and perhaps they still do, since the project has been updated within the last year. – John Bollinger Feb 20 '23 at 23:34
  • @JohnBollinger there are zillions of useless live projects – 0___________ Feb 20 '23 at 23:35
  • Yes, @0___________, but you didn't ask whether it was *useful*. You asked whether anyone cares. You will note that my recommendation to the OP was to ignore WindowsHModular. My point about compiler effort was to explain the motivation for such a project, not to impute usefulness to it. – John Bollinger Feb 20 '23 at 23:39
  • @JohnBollinger my point was to show OP that it does not matter how many lines your .h file is. – 0___________ Feb 20 '23 at 23:53
  • @securityauditor If you do care about compile times, you can use preprocessor symbols to limit the API surface the compiler will have to process (see [Faster Builds with Smaller Header Files](https://learn.microsoft.com/en-us/windows/win32/winprog/using-the-windows-headers#faster-builds-with-smaller-header-files) for instructions). – IInspectable Feb 20 '23 at 23:56