12

I read a few posts and concluded that extern tells the compiler: "This function exists, but the code for it is somewhere else. Don't panic." But how does the linker know where the function is defined?

My case: I am working in Keil uVision 4. There is a header file grlib.h, and the main function is in grlib_demo.c (which includes grlib.h). There is a function GrCircleDraw() which is defined in Circle.c and called in grlib_demo.c. There is also a statement

extern void GrCircleDraw(all arguments);

in grlib.h. My question is how the linker knows where the definition of GrCircleDraw() is, since Circle.c is not included in grlib.h or grlib_demo.c.

Note: The files grlib.h and Circle.c are in the same folder. The code builds and runs successfully.

Ankit Gupta

4 Answers

10

When you compile to a .o file in the ELF format, the .o file contains many things, such as:

  • a .text section containing the code;
  • .data, .rodata, .bss sections containing the global variables;
  • a .symtab containing the list of the symbols (functions, global variables and others) in the .o (and their location in the file) as well as the symbols used by the .o file;
  • sections such as .rela.text, which are lists of relocations: these are the modifications that the link editor (and/or the dynamic linker) will have to make in order to link the different parts of your program together.

On the caller side

Let's compile a simple C file:

extern void GrCircleDraw(int x);

int foo()
{
  GrCircleDraw(42);
  return 3;
}

int bla()
{
  return 2;
}

with:

gcc -o test.o test.c -c

(I'm using the native compiler of my system, but it works much the same way when cross-compiling to ARM.)

You can look at the content of your .o file with:

readelf -a test.o

In the symbol table, you will find:

Symbol table '.symtab' contains 10 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND 
[...]
     8: 0000000000000000    21 FUNC    GLOBAL DEFAULT    1 foo
     9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND GrCircleDraw
    10: 0000000000000015    11 FUNC    GLOBAL DEFAULT    1 bla

There is one symbol for our foo function and one for bla. The Value field gives their location within the .text section.

There is also one symbol for the referenced function GrCircleDraw: it is undefined (UND) because this function is not defined in this .o file and remains to be found elsewhere.

In the relocation table for the .text section (.rela.text) you find:

Relocation section '.rela.text' at offset 0x260 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000a  000900000002 R_X86_64_PC32     0000000000000000 GrCircleDraw - 4

This offset (0xa) lies within foo: the link editor will patch the instruction at this location with the (PC-relative) address of the GrCircleDraw function.

On the callee side

Now let's compile our own implementation of GrCircleDraw:

void GrCircleDraw(int x)
{

}

Let's look at its symbol table:

Symbol table '.symtab' contains 9 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
[...]
     8: 0000000000000000     9 FUNC    GLOBAL DEFAULT    1 GrCircleDraw

It has an entry for GrCircleDraw defining its location within its .text section.

Linking them together

So when the link editor combines both files together, it knows:

  • which functions are defined in which .o file, and their locations;
  • which locations in the caller's code it must update with the address of the callee.
user47
ysdx
  • 2
    Can't believe no upvotes until me just now. This is a very well-crafted answer and I for one am glad to see the ELF breakdown like this. – RastaJedi Mar 06 '16 at 07:05
9

The simple answer is that "the compiler doesn't need to know, but the linker has to be able to find it". Through multiple .o files, or through libraries, the linker has to be able to find a single definition of the GrCircleDraw function.

Greg Hewgill
  • Crap, I wrote Compiler by mistake, of course it has to be linker. But then linker has to know where the definition is in some form. In my code Circle.c (which includes the definition is not added anywhere). How is that possible that the linker still finds it? – Ankit Gupta Oct 14 '12 at 03:42
  • The linker has to find at least one definition of `GrCircleDraw`. Normally, it will ignore any extra definitions after the first, but it will object if the definitions are in two object files, or if you're linking with a static library and the object file in the library that contains the second definition is also needed to satisfy some other symbol; it will complain about double definition. – Jonathan Leffler Oct 14 '12 at 03:46
  • @JonathanLeffler There is no second definition. My query is how linker finds even that one definition. – Ankit Gupta Oct 14 '12 at 03:50
  • @AnkitGupta: If your code links successfully, then you have almost certainly informed the linker about `Circle.o` in some way. There's no magic here. – Greg Hewgill Oct 14 '12 at 03:52
  • @GregHewgill But, Circle.c is not even compiled, how can we have circle.o then – Ankit Gupta Oct 14 '12 at 03:59
  • 1
    @AnkitGupta: So you're telling us that: (1) There is a `circle.c` but it's not compiled; (2) the *only* definition of `GrCircleDraw` is in `circle.c`; and (3) your program compiles, links, and runs successfully. I don't believe all of those are true. In particular, I would suspect (2), and that there really is another definition of `GrCircleDraw`, perhaps in a library file that you're linking with. – Greg Hewgill Oct 14 '12 at 04:07
  • @GregHewgill But we can have only one definition of an extern function, since we have a definition in circle.c can't we conclude that there is no other definition? – Ankit Gupta Oct 14 '12 at 04:10
  • You haven't shown any evidence that you have any other definition (other than the claim that your program links successfully). The linker may allow you to supply two definitions, one in a *library* and one in an *object file*, and it will choose the one in the object file in preference to the library. But, that depends on your linker. – Greg Hewgill Oct 14 '12 at 04:16
  • @GregHewgill yes, the grlib.lib is included in my project. How can I check if this library file has definition of GrCircleDraw. – Ankit Gupta Oct 14 '12 at 04:21
  • 1
    Your linker should come with some utilities that help with things like that. I think the Visual C++ tool for that is called `dumpbin.exe` but you haven't said which compiler toolchain you are using. Another approach is to ask your linker to create a map file output, which shows a lot of detailed information about how the linker resolved your program. – Greg Hewgill Oct 14 '12 at 04:24
  • @AnkitGupta Well since `circle.c` *isn't part of your program* (assuming you didn't link circle.o) the compiler and linker don't care about it! You can only have one definition of `GrCircleDraw` *in the program*. – user253751 Jan 20 '16 at 07:01
5

The compiler places just the name of the extern function into the .obj file. It does not need to know anything more about it.

When you start linking, it is your responsibility as a developer to give all necessary object files and library files to the linker. The linker will arrange all these functions into a binary. If you do not specify the right libraries or .obj files, the link will simply fail with an "unresolved symbol" error.

Default libraries are typically included implicitly. This complicates things and creates illusions. You can always specify that you do not want any implicit libraries and include everything explicitly. Unfortunately, every system does this in its own way.

Kirill Kobelev
  • the code in question is a sample code. All I need to know is how Linker knows where the definition of GrCircleDraw() is, since Circle.c is not included in any of the files. – Ankit Gupta Oct 14 '12 at 04:04
  • 1
    Since the prototype of your function is in `grlib.h`, most likely the body of this function is in `grlib.lib` or `grlib.a`. Look into log of your build or into the `.map` file. They might give you a clue. – Kirill Kobelev Oct 14 '12 at 04:07
0

Linking usually happens this way: the linker iterates over its command line, and every argument given is

  1. used in full if it is an object file,
  2. used only to the extent needed (that is, to fulfill the references that are still unresolved at that point) if it is a library.

At the end, every reference has to be fulfilled for the link to succeed. This is why the order of the arguments on the linker command line is important.

glglgl