32

Reading the fine print of the -I switch in GCC, I'm rather shocked to find that using it on the command line overrides system includes. From the preprocessor docs:

"You can use -I to override a system header file, substituting your own version, since these directories are searched before the standard system header file directories."

They don't seem to be lying. On two different Ubuntu systems with GCC 7, if I create a file endian.h:

#error "This endian.h shouldn't be included"

...and then in the same directory create a main.cpp (or main.c, same difference):

#include <stdlib.h>
int main() {}

Then compiling with g++ main.cpp -I. -o main (or clang, same difference) gives me:

In file included from /usr/include/x86_64-linux-gnu/sys/types.h:194:0,
                 from /usr/include/stdlib.h:394,
                 from /usr/include/c++/7/cstdlib:75,
                 from /usr/include/c++/7/stdlib.h:36,
                 from main.cpp:1:
./endian.h:1:2: error: #error "This endian.h shouldn't be included"

So stdlib.h includes this types.h file, which on line 194 just says #include <endian.h>. My apparent misconception (and perhaps that of others) was that the angle brackets would have prevented this, but -I is stronger than I'd thought.

Yet it isn't strong enough to undo the damage: you can't fix this by putting /usr/include first on the command line, because:

"If a standard system include directory, or a directory specified with -isystem, is also specified with -I, the -I option is ignored. The directory is still searched but as a system directory at its normal position in the system include chain."

Indeed, the verbose output for g++ -v main.cpp -I/usr/include -I. -o main leaves /usr/include at the bottom of the list:

#include "..." search starts here:
#include <...> search starts here:
 .
 /usr/include/c++/7
 /usr/include/x86_64-linux-gnu/c++/7
 /usr/include/c++/7/backward
 /usr/lib/gcc/x86_64-linux-gnu/7/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
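
To reproduce a listing like the one above without compiling anything, a sketch along these lines should do. It assumes gcc or clang is on the PATH, and it relies on both printing the "search starts here" and "End of search list." markers in their -v output:

```shell
# Print the <...> search list the preprocessor would use with -I. added.
# Assumes gcc or clang is installed; nothing is compiled, stdin is empty.
CC=$(command -v gcc || command -v clang || true)
if [ -n "$CC" ]; then
    search_list=$("$CC" -xc -E -v -I. - </dev/null 2>&1 |
        sed -n '/^#include <\.\.\.> search starts here:/,/^End of search list\./p')
    printf '%s\n' "$search_list"
else
    echo "no gcc or clang found; skipping"
fi
```

The exact directories will differ per system and compiler version; the point is only to see where the -I entry lands relative to the system directories.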

Color me surprised. I guess to make this a question:

What legitimate reason is there for most projects to use -I considering this extremely serious issue? You can override arbitrary headers on systems based on incidental name collisions. Shouldn't pretty much everyone be using -iquote instead?
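
To compare the two switches without depending on what any particular libc includes, here is a small sketch. The files sys/vendor.h, sys/config.h and proj/config.h are made up for the demo, -isystem stands in for a real system directory, and it assumes gcc or clang is installed:

```shell
# Sketch: -I versus -iquote when a project header name collides with a header
# that a "system" header pulls in via angle brackets. All names are made up.
demo=$(mktemp -d)
mkdir -p "$demo/sys" "$demo/proj"

# Stand-in for a system header that includes another one with <...>.
printf '#include <config.h>\n' > "$demo/sys/vendor.h"
printf '/* vendor config */\n' > "$demo/sys/config.h"

# Project header that happens to share the name config.h.
printf '#error "project config.h was picked up"\n' > "$demo/proj/config.h"
printf '#include <vendor.h>\nint main(void) { return 0; }\n' > "$demo/main.c"

CC=$(command -v gcc || command -v clang || true)
if [ -n "$CC" ]; then
    # -I directories are searched before -isystem ones for <...> includes.
    if "$CC" -fsyntax-only -isystem "$demo/sys" -I "$demo/proj" "$demo/main.c" 2>/dev/null; then
        with_I=ok
    else
        with_I=shadowed
    fi
    # -iquote directories are consulted only for "..." includes.
    if "$CC" -fsyntax-only -isystem "$demo/sys" -iquote "$demo/proj" "$demo/main.c" 2>/dev/null; then
        with_iquote=ok
    else
        with_iquote=shadowed
    fi
    echo "-I: $with_I, -iquote: $with_iquote"
else
    echo "no gcc or clang found; skipping"
fi
rm -rf "$demo"
```

With GCC or Clang this should report the -I build as shadowed and the -iquote build as ok, since the project directory is never consulted for the angle-bracket include in the second case.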

  • 7
    Not a flaw. By design. Sometimes (especially for larger projects) a developer needs to interdict one or a few system files or possibly standard header files. – Eljay Nov 05 '18 at 13:08
  • 1
    @HostileFork if you leave the amendment in the comment, it may not be seen. – ctrl-alt-delor Nov 05 '18 at 13:16
  • 3
    @Eljay It’s still a flaw. It’s widely recognised that the C and C++ system of handling dependencies (and generally multi-file compilations) is *hugely* flawed. That’s why people have been trying to replace it with a modern module system for ages. – Konrad Rudolph Nov 05 '18 at 17:23
  • 2
    @KonradRudolph • The overall design of includes and dependencies is a historical legacy. By today's expectations, I agree... flawed. The `-I` is part of the historical legacy, designed when memory and drive space were minuscule in comparison with today's hardware. For C++20, I hope modules, contracts, and concepts lite all make it into the standard. Modules will be a game changer for how C++ software is developed, yet done with backwards compatibility that will linger for a long time. – Eljay Nov 05 '18 at 17:39
  • 11
    You can slit your wrist with a sharp knife and die. So why do cooks still use sharp knives to prepare food? `-I` overrides system includes, because sometimes you *need* to override them. There's nothing else to add. – alephzero Nov 05 '18 at 19:04
  • @alephzero My point isn't that this might not be an interesting feature for those who need it, just that the majority of projects which I've ever seen that use `-I` would explicitly not want this behavior. Finding out that POSIX once chose to have only one include behavior, I can see how if you're going to have only one you'd err on the side of powerfulness...there may have been a time where very few projects needed to add extra include directories or were expected to use absolute paths. But since -iquote seems to be a 13-year-old-feature now, it might be time for most to switch. – HostileFork says dont trust SE Nov 05 '18 at 22:05

4 Answers

33

What legitimate reasons are there for -I over -iquote? -I is standardized (at least by POSIX) while -iquote isn't. (Practically, I'm using -I because tinycc (one of the compilers I want my project to compile with) doesn't support -iquote.)

How do projects manage with -I given the dangers? You'd have the includes wrapped in a directory and use -I to add the directory containing that directory.

  • filesystem: includes/mylib/endian.h
  • command line: -Iincludes
  • C/C++ file: #include "mylib/endian.h" //or <mylib/endian.h>

With that, as long as you don't clash on the mylib name, you don't clash (at least as far as header names are concerned).
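
As a rough sketch of that layout, assuming gcc or clang is available (the temporary directory and the MYLIB_ENDIAN_H macro are just for illustration):

```shell
# Sketch of the "directory inside the include directory" layout.
work=$(mktemp -d)
mkdir -p "$work/includes/mylib"
printf '#define MYLIB_ENDIAN_H 1\n' > "$work/includes/mylib/endian.h"

cat > "$work/main.c" <<'EOF'
#include <stdlib.h>          /* may pull in the system <endian.h> */
#include "mylib/endian.h"    /* resolved through -Iincludes */
#ifndef MYLIB_ENDIAN_H
#error "mylib/endian.h was not found"
#endif
int main(void) { return 0; }
EOF

CC=$(command -v gcc || command -v clang || true)
if [ -n "$CC" ]; then
    # includes/ holds no endian.h of its own, so <endian.h> cannot resolve there.
    if "$CC" -fsyntax-only -I"$work/includes" "$work/main.c"; then
        layout=ok
    else
        layout=failed
    fi
    echo "namespaced layout: $layout"
else
    echo "no gcc or clang found; skipping"
fi
rm -rf "$work"
```

The only name exposed at the top level of the -I directory is mylib, so that is the only name that can collide.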

Petr Skocik
  • 1
    I also have a project I'd like to be able to build with TCC, so thanks for mentioning that. Though while -iquote may not be in POSIX, it has been in gcc since v4 released in 2005 (and is supported by all versions of clang, I gather)...so if anyone's project relies on command line settings available since 13 years ago, they might consider -iquote. – HostileFork says dont trust SE Nov 07 '18 at 16:07
19

Looking back at the GCC manuals it looks like -iquote and other options were only added in GCC 4: https://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Directory-Options.html#Directory%20Options

So the use of "-I" is probably some combination of: habit, laziness, backwards compatibility, ignorance of the new options, compatibility with other compilers.

The solution is to "namespace" your header files by putting them in sub directories. For example put your endian header in "include/mylib/endian.h" then add "-Iinclude" to the command line and you can #include "mylib/endian.h" which shouldn't conflict with other libraries or system libraries.

Alan Birtles
16

I think your premise that it's -I that's dangerous is false. The language leaves the search for header files with either form of #include sufficiently implementation-defined that it's unsafe to use header files that conflict with the names of the standard header files at all. Simply refrain from doing this.

R.. GitHub STOP HELPING ICE
  • 5
    It goes beyond conflict with *"standard"* header files; as it seems it can be any internal implementation file (such as those in the `/usr/include/x86_64-linux-gnu/sys/` directory...) it seems this could be pretty much any name. `debugreg.h`? How can you know to avoid all possible such names for your includes? – HostileFork says dont trust SE Nov 05 '18 at 16:03
  • 4
    The headers in `sys/*` are not "internal". They're extensions defined by POSIX or by the particular operating system, and you shouldn't need to care about masking them with `-I` unless your application needs to use them for something. In any case, naming your headers sys/anything seems like a really bad idea. – R.. GitHub STOP HELPING ICE Nov 05 '18 at 16:33
  • How can you know? You learn the standard ones. Or you use a profiling tool which knows them. You turn on all the error reporting you can so if there is a naming collision it breaks the build until you fix it. Many ways. – Baldrickk Nov 05 '18 at 16:35
  • 1
    @Baldrickk `sys/` is a well-known header namespace used by POSIX. In any case, `c99 -xc -E - -v </dev/null 2>&1 | awk '/#include </,/End of/ { print $0 }'|sed -n 's/^ //p'|while read d; do printf '%s\n' $d/*/; done |sed 's|/$||; s|.*/||' |sort -u ` should give you those). – Petr Skocik Nov 05 '18 at 17:04
  • 1
    @PSkocik I wasn't recommending that you pollute the namespace, just saying that there are ways to make sure you are not colliding /potentially colliding with anything else. – Baldrickk Nov 06 '18 at 10:07
11

An obvious case is cross-compilation. GCC suffers a bit from a historical UNIX assumption that you're always compiling for your local system, or at least something that's very close. That's why the compiler's header files are in the system root. The clean interface is missing.

In comparison, Windows assumes no compiler, and Windows compilers do not assume you're targeting the local system. That's why you can have a set of compilers and a set of SDKs installed.

Now in cross-compilation, GCC behaves much more like a compiler for Windows. It no longer assumes that you intend to use the local system headers, but lets you specify exactly which headers you want. And obviously, the same then goes for the libraries you link in.

Now note that when you do this, the set of replacement headers is designed to go on top of the base system. You can leave out headers in the replacement set if their implementation would be identical. E.g. chances are that <complex.h> is the same. There's not that much variation in complex number implementations. However, you can't randomly replace internal implementation bits like <endian.h>.

TL;DR: this option is for people who know what they're doing. "Being unsafe" is not an argument for the target audience.

MSalters
  • 2
    When cross-compiling you typically invoke a different compiler that uses a different directory than /usr/include and so -I isn't any more necessary than in the single instance. – Joshua Nov 05 '18 at 16:44
  • If a project takes control over what version of the standard libraries are linked in, it should generally also take control over the location of standard header files, to ensure that any definitions in the included headers match those expected by the standard library. – supercat Nov 05 '18 at 22:43