39

I have three files, main.c, hello_world.c, and hello_world.h. For whatever reason they don't seem to compile nicely, and I really just can't figure out why...

Here are my source files. First hello_world.c:

#include <stdio.h>
#include "hello_world.h"

int hello_world(void) {
  printf("Hello, Stack Overflow!\n");
  return 0;
}

Then hello_world.h, simple:

int hello_world(void);

And then finally main.c:

#include "hello_world.h"

int main() {
  hello_world();
  return 0;
}

When I put it into GCC, this is what I get:

cc     main.c   -o main
/tmp/ccSRLvFl.o: In function `main':
main.c:(.text+0x5): undefined reference to `hello_world'
collect2: ld returned 1 exit status
make: *** [main] Error 1
pppery
user1018501
  • Related, but for C++ https://stackoverflow.com/questions/12573816/what-is-an-undefined-reference-unresolved-external-symbol-error-and-how-do-i-fix – Cody Gray - on strike Jun 30 '22 at 21:09
  • [Another question](https://stackoverflow.com/questions/72805371/gcc-cc-unable-to-compile-c-project-with-multiple-files-mac-os) was merged with this one, but that was [for a different compiler](https://meta.stackoverflow.com/questions/419027/was-my-question-correctly-closed-as-a-duplicate#comment942680_419027) ([Clang](https://en.wikipedia.org/wiki/Clang), not [GCC](https://en.wikipedia.org/wiki/GNU_Compiler_Collection)). ***Note***: Executable 'gcc' ***on [macOS](https://en.wikipedia.org/wiki/MacOS)*** usually means Clang ('gcc' is aliased to the Clang compiler), not GCC. – Peter Mortensen Jan 10 '23 at 12:43

8 Answers

57
gcc main.c hello_world.c -o main

Also, always use header guards:

#ifndef HELLO_WORLD_H
#define HELLO_WORLD_H

/* header file contents go here */

#endif /* HELLO_WORLD_H */
Lundin
  • Though the header guards are unnecessary (in this example) it's a good hint – KevinDTimm Apr 27 '12 at 20:20
  • So the different .c files are compiled separately to produce one executable? – redpix_ Dec 26 '14 at 18:08
  • The header guards ensure that multiple C files including the same H file don't run into issues of declaring/defining the same identifiers several times in the same program. The compiler works with "translation units", which is a C file plus all the headers included by that C file. Meaning that the same H file could exist in multiple translation units. – Lundin Mar 01 '17 at 08:31
  • Or we can use #pragma once, which is **non-standard** but widely supported – Haseeb Mir Sep 11 '21 at 02:17
  • @HaseeBMir There exists no sound reason why you would ever use a non-standard feature which is 100% equivalent to an existing standard feature. – Lundin Sep 11 '21 at 11:41
  • In Visual Studio it's the default, so we don't have to add header guards to every .h file out there. I am not recommending it, just pointing it out – Haseeb Mir Sep 11 '21 at 13:35
  • What is the point of the header file if I need to put the .c file in the compile command? – mLstudent33 Sep 29 '22 at 11:24
  • @mLstudent33 As far as the compiler is concerned, you could just write everything in a single file and hack away into one big, unreadable mess. The header contains function declarations and documentation about their use for the benefit of _the programmer_. They are required to make structured, autonomous code modules with a well-defined interface. The very same reasons as why we use several files and not just one: organisation. – Lundin Sep 29 '22 at 12:49
10

You are not including file hello_world.c in the compilation. Use:

gcc hello_world.c main.c  -o main
Peter Mortensen
P.P
6

You are not compiling hello_world.c, so the code that defines hello_world is never linked into the program.

An easy way to do this is to run this compilation command:

cc -o main main.c hello_world.c

More complicated projects often use build scripts or make files that separate the compilation and linking commands, but the above command (combining both steps) should do fine for small projects.

Tom
4

You should link the object file compiled from your second .c file, hello_world.c, with your main.o file.

Try this:

cc -c main.c
cc -c hello_world.c
cc *.o -o hello_world
Peter Mortensen
vard
2

This is a good time to learn about the concept of translation units.

The compiler only deals with one single translation unit at a time. It will not know anything about other possible translation units, and it's the job of the linker to put all translation units together.

The command you use to build your program is:

gcc main.c -o out

That only compiles and attempts to link one of the two translation units: the one created from main.c. The translation unit from hello_world.c isn't used at all.

You need to pass both source files for the front-end program gcc to build and link both of them:

gcc main.c hello_world.c -o out
Peter Mortensen
Some programmer dude
  • So, is it standard practice to just put `gcc *.c -o out` in your makefile or something? Seems like it would scale badly to large codebases (especially considering the complexity of compiling things in subdirectories ...) – Richard Rast Jun 29 '22 at 17:19
  • No, makefile writing is a separate "art".. – Eugene Sh. Jun 29 '22 at 17:20
  • @RichardRast A makefile, hand-written or auto-generated, lists each single object file and the source and header files it depends on. Then `make` (or another build system) only (re)builds the object files that depend on changed source/header files. Then it's set up to link all the object files. – Some programmer dude Jun 29 '22 at 17:21
  • This is exactly *not* the time to talk about translation units because you do everything in one single big command here. Modern compilers can and will blur the lines of translation units (e.g. perform whole program optimization, inline code from one file into the other) when you pass them in one command. If you want to talk about translation units and compile vs. link stage, then present a compile stage of single files with -c and a link stage. – Peter - Reinstate Monica Jun 29 '22 at 17:21
  • @Peter-ReinstateMonica is there a standard "modern compiler" which does that out of the box? I'm used to `javac` and `rustc` which do exactly that; I understand c is quite a bit lower level but I have to assume modern programmers want modern tools anyway? – Richard Rast Jun 29 '22 at 17:26
  • @RichardRast `gcc` *is* a "standard modern compiler", and the concept of translation units is still relevant and important. – Eugene Sh. Jun 29 '22 at 17:28
  • @EugeneSh. Sure, my use of the word "compiler" is inappropriate here. I would assume you would use some kind of simple wrapper around it in practice for large projects though, which handles feeding all relevant files into it. Surely people aren't fiddling with their build scripts every time they add a file to a project. – Richard Rast Jun 29 '22 at 17:29
  • @RichardRast from https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html: "Compiling **multiple files at once to a single output file** mode allows the compiler to use information gained from all of the files when compiling each of them." – Peter - Reinstate Monica Jun 29 '22 at 17:31
  • @RichardRast Of course, if you use some kind of IDE which is managing your project, you might ignore these things most of the time. But in practice, especially for large projects which are being developed by multiple/many developers with different personal preferences, the project is not tied to a specific IDE, but is using some standard build system (make/cmake based) – Eugene Sh. Jun 29 '22 at 17:33
1

Yes, it seems you have forgotten to link hello_world.c.

I would use gcc hello_world.c main.c -o main. If there are only a few files, this approach works fine, but in larger projects it is better to use Makefiles or build scripts.

Peter Mortensen
john
1

(I briefly look at the building blocks of C programs and then examine the build steps normally hidden behind gcc calls.)

Traditional compiled languages, for example C and C++, are organized in source files which normally are, one by one, "compiled" into one "object file" each. Each source file becomes one "translation unit" after all the include directives have been processed. (Therefore, a translation unit typically consists of more than one file, and the same include file typically occurs in more than one translation unit; files and translation units have, strictly speaking, an n:m relation. But pragmatically one can say that a "translation unit" is a C file.)

To compile a single source file into an object file, one passes the -c flag to the compiler:

gcc -c myfile.c

This creates myfile.o, or perhaps myfile.obj, in the same directory.

Object files contain machine code and data (and potentially debug information, but we ignore that here). The machine code contains functions, and the data comes in the shape of variables. Both functions and variables in the object files have names which are called "symbols". The compiler typically transforms the variable and function names in the program by prepending an underscore or the like, and in C++ the generated ("mangled") name contains information about the type and, for functions, parameters.

Some symbols, for example the names of global variables and normal functions, are usable from other object files; they are "exported".

A symbol can, with only slight simplification, be thought of as an address alias: For a function, the name is an alias for the target address of a jump; for variables, the name is an alias for the address of a memory location from which the program can read and to which it can write.

Your file help.c contains the code for the function herp. Functions in C have "external linkage" by default, so they can be used from other translation units. Their name (the "symbol") is exported.

In modern C, a source file using a name defined in a different translation unit must declare the name. This tells the compiler what to do with it, and in which ways it can syntactically be used in the source code (e.g., call a function, assign to a variable, index an array). The compiler produces code that reads from this "symbolic address" or jumps to that "symbolic address"; it is the linker's job to replace all those symbolic addresses with "real" memory locations that point to existing data and code in the final executable, so that the jumps and memory accesses are landing at the desired locations.

The declaration of a name (function, variable) in the file that's using it can be "manual", like void herp();, appearing directly in your file before the first use. More typically though, the names defined in a translation unit that other translation units can use are declared in a header file, your help.h. The using translation unit uses the "canned" declarations in the header file by #include-ing it. There is no magic here; an include directive simply inserts the include file text as if it were written in the file directly. There is exactly zero difference. In particular, including a header file does not tell the linker to link with the corresponding source file. The reason is simple: The linker never knows about the included file because that piece of knowledge is erased during the compilation into an object file.

This means in your case that help.c must be compiled, and that the linker must be told to combine ("link") it with the rest of the program, in your case the code from the compilation of main.c.

The discussion of how that is done is a bit more difficult because this procedure is so common that the typical C compiler integrates compilation and link stage: gcc -o myprog help.c main.c simply does everything necessary to create an executable myprog.

When we say "compiler", e.g. referring to gcc, we normally actually mean the "compiler driver" which takes the commands and files from the command line and performs the necessary steps to achieve the desired results, like producing an executable program from our sources. The actual compiler for gcc is cc1 which produces an assembly file which must be "assembled" with as into an object file. After the source files are compiled, gcc calls the linker with the appropriate options, which produces the executable.

Here is a sample session detailing the stages:

$ ls
Makefile  help.c  help.h  main.c

$ /lib/gcc/x86_64-pc-cygwin/7.4.0/cc1 main.c
 main
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> <visibility> <build_ssa_passes> <opt_local_passes> <targetclone> <free-inline-summary> <emutls> <whole-program> <inline>Assembling functions:
 <materialize-all-clones> <simdclone> main
Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 (22%) wall    1184 kB (86%) ggc
 TOTAL                 :   0.00             0.00             0.01               1374 kB

$ ls
Makefile  help.c  help.h  main.c  main.s

$ /lib/gcc/x86_64-pc-cygwin/7.4.0/cc1 help.c
 herp
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> <visibility> <build_ssa_passes> <opt_local_passes> <targetclone> <free-inline-summary> <emutls> <whole-program> <inline>Assembling functions:
 <materialize-all-clones> <simdclone> herp
Execution times (seconds)
 phase setup             :   0.01 (100%) usr   0.00 ( 0%) sys   0.00 (33%) wall    1184 kB (86%) ggc
 TOTAL                 :   0.01             0.00             0.01               1370 kB

$ ls
Makefile  help.c  help.h  help.s  main.c  main.s

We now have two assembly files, main.s and help.s, which can be assembled into object files with the assembler as. But let's have a quick look at help.s:

$ cat help.s

        .file   "help.c"
        .text
        .globl  some_variable
        .data
        .align 4
some_variable:
        .long   1
        .text
        .globl  herp
        .def    herp;   .scl    2;      .type   32;     .endef
        .seh_proc       herp
herp:
        pushq   %rbp
        .seh_pushreg    %rbp
        movq    %rsp, %rbp
        .seh_setframe   %rbp, 0
        .seh_endprologue
        nop
        popq    %rbp
        ret
        .seh_endproc
        .ident  "GCC: (GNU) 7.4.0"

Even if we know nothing about assembler we can clearly identify the symbols some_variable and herp, which are assembly labels.

Ah yes, I forgot that I added a variable definition to help.c:

$ cat help.c

#include "help.h"
int some_variable = 1;
void herp() {}

We can assemble the assembly files with the assembler as:

$ as main.s -o main.o

$ ls
Makefile  help.c  help.h  help.s  main.c  main.o  main.s

$ as help.s -o help.o

$ ls
Makefile  help.c  help.h  help.o  help.s  main.c  main.o  main.s

Now we have two object files. We can see which symbols are exported ("extern") or needed ("undefined") with the utility nm (it lists the names, i.e. the symbols, in an object file):

$ nm --extern-only help.o

0000000000000000 T herp
0000000000000000 D some_variable

$ nm --extern-only main.o
                 U __main
                 U herp

"T" indicates that a symbol is in the "text" section, which contains code; "D" is the data section, and "U" stands for "undefined". (The undefined __main is a gcc and/or cygwin quirk.)

Here you have the source of your problem: Unless you pair your main.o with an object file that defines that undefined symbol, the linker cannot "resolve" the name and cannot produce the jump. There is no jump destination.

Now we can link the two object files to an executable. Cygwin requires us to link against cygwin1.dll; sorry for the complication.

$ ld main.o help.o /bin/cygwin1.dll -o main

$ ls
Makefile  help.c  help.h  help.o  help.s  main*  main.c  main.o  main.s

That's about it. I should add that the program doesn't run properly. It doesn't end, and doesn't react to Ctrl-C; I may be missing some GNU or Windows build intricacies that gcc handles for us.

Ah, Makefiles. Makefiles consist of target definitions and dependencies of these targets: A line

main: help.o main.o

specifies a target "main" depending on the two .o files. Makefiles normally also contain rules specifying how to produce a target. But Make has built-in rules; it knows that you call the compiler to produce an .o file from a .c file (and it automatically considers this dependency), and it knows that you link the .o files together to produce the target depending on them, provided the target has the same name as one of the .o files.

Therefore, we don't need any rules: We simply define the non-implicit dependencies. The entire Makefile for your project boils down to:

$ cat Makefile

CC=gcc
main: help.o main.o
help.o: help.h
main.o: help.h

CC=gcc specifies the C compiler to use. CC is a built-in make variable specifying the C compiler (CXX would specify the C++ compiler, e.g. g++).

Let's see:

$ make

gcc    -c -o main.o main.c
gcc    -c -o help.o help.c
gcc   main.o help.o   -o main
$ ls
Makefile  help.c  help.h  help.o  main.c  main.exe*  main.o

Do the dependencies work?

$ make
make: 'main' is up to date.

$ touch main.c

$ make
gcc    -c -o main.o main.c
gcc   main.o help.o   -o main

$ touch help.h

$ make
gcc    -c -o main.o main.c
gcc    -c -o help.o help.c
gcc   main.o help.o   -o main

That looks good: after touching a single source file make compiles only that file; but touching the header on which both files depend makes make compile both. The linking needs to be done in any case.

Peter Mortensen
Peter - Reinstate Monica
  • Re *"The actual compiler for gcc is cc1"*: It is [different on macOS](https://stackoverflow.com/questions/10357117/c-header-issue-include-and-undefined-reference#comment132475484_10357117) (a common [source of confusion](https://stackoverflow.com/questions/38840601/how-can-i-ignore-an-error-when-using-gcc-compile-option-werror#comment130977715_38840652)). – Peter Mortensen Jan 10 '23 at 12:52
0

You need to tell the compiler that your project contains two source files:

gcc -o out main.c hello_world.c

Also, at your level, I would suggest using an IDE (for example, Eclipse CDT) so you can focus on programming, not building. Later you will learn how to build complicated projects. But for now, simply learn to program.

Peter Mortensen
0___________