How does the linker decide which implementation to use?

Question

Let's say we have the following files:

//foo.h
class Foo
{
public:
    void foo()
    {
        //Great code here
    }    
};

//foo1.cpp
#include "foo.h"

void Foo1()
{
    Foo f1;
    f1.foo();
}

//foo2.cpp
#include "foo.h"

void Foo2()
{
    Foo f2;
    f2.foo();
}

When I compile them separately, they produce two object files: Foo1.o and Foo2.o. When I link them together, they link perfectly.

Now if I dump the symbol tables for both, each one seems to contain its own implementation of the Foo::foo function:

_ZN3Foo3fooEv

Now, how does the linker distinguish which implementation to use?

Which implementation to use of what? `Foo1` and `Foo2`? – Alexander Shukaev Dec 09 '13 at 08:50

Mats Petersson · Answer 1 · 2013-12-09T08:59:42.080

2

The code in Foo::foo() is identical in both translation units [if it's not, you are breaking the "one definition rule" - that is, one function should have ONE definition, no matter how many times it is actually defined].

So, the compiler/linker should be perfectly allowed to merge the two identical functions into one when completing your executable file.

Note, however, that as it stands, Foo::foo() is implicitly an inline function (it is defined inside the class), which means that it may legally be defined in more than one translation unit, and there should be no conflict.

If you were to "manually include" the class definition for Foo in both of your foo1.cpp and foo2.cpp, and make some subtle difference in the functions, you would find that the linker "picks" one of the functions, and discards the other one. Which it picks is not defined, and since the "one definition rule" has been broken, you are "outside the bounds of what you should do", so no point in complaining that the compiler "doesn't do the right thing". [Although you would have to make the function "non-inline" to make this a problem, and then you'd probably get a linker error for multiple definitions].

edited Dec 09 '13 at 08:59

answered Dec 09 '13 at 08:51

Mats Petersson



Difference in what functions? He has 2 different functions defined in 2 different translation units, and a class defined in a header. So I don't see the point of the author's question or of your answer. – Alexander Shukaev Dec 09 '13 at 08:54
@Haroogan: The question [as I read it] is "if you then link foo1.o and foo2.o into an executable unit, which of the `Foo::foo()` functions present in the foo1.o and foo2.o gets used?" [assuming inlining is turned off, one must assume] - but of course, since they are identical functions, it won't matter. – Mats Petersson Dec 09 '13 at 08:57
The OP is asking why both translation units are using two different copies of `Foo::foo`. It's like you said, "He has 2 different functions defined in 2 different translation units", thus each uses its own copy. – zackery.fix Dec 09 '13 at 08:57
@Felipe: Your code as you have written it above implies inline by having the function definition inside the class. If on the other hand you move the definition of `Foo::foo()` outside of the class, you will have "multiple definition" problems. – Mats Petersson Dec 09 '13 at 09:01
@Mats Unless the function moved outside the class is implemented inside its own translation unit, and linked with all the other object files. – zackery.fix Dec 09 '13 at 09:05
Ok. Let's get this straight. By inline I understand that the code of the function is placed directly inside the calling function instead of an actual processor call (for example, the "call" instruction on x86). objdump shows that a symbol _ZN3Foo3fooEv is created and a "call" instruction is used to jump to it, so at least for gcc, inline is not implied. – felknight Dec 09 '13 at 09:07
@Felipe Yep, there is no inlined code in your example. The compiler will generate each object file with its Foo# function calling that object file's own copy of `Foo::foo()`. Did you dump the .exe to see the symbols there? I would assume two copies of a Foo::foo function, but I could be wrong. – zackery.fix Dec 09 '13 at 09:12
I did what @Mats suggested. I manually included the definition of class Foo in foo2.cpp and slightly changed its method. What I got was that the Foo::foo code from foo2.o was completely ignored in the final exe, and the Foo.o version was used for both functions Foo1 and Foo2. – felknight Dec 09 '13 at 09:19
@Felipe: your understanding of "inline" is incorrect. See http://stackoverflow.com/a/157929/13005, although beware that there are some incorrect answers to that question, including the accepted one. What you describe is what happens when *a particular call to a function* is inlined (which happens at the compiler's discretion). This is pretty much completely unrelated to whether *the function itself* is inline (which happens if it's marked `inline` or if it's defined in a class). The use of the same word for both is for historical reasons: compilers used to relate the two things. – Steve Jessop Dec 09 '13 at 10:17
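
To illustrate the distinction drawn in the comment above, here is a minimal sketch of the two ways a member function becomes an inline function in the language sense. The names `Widget`, `in_class`, `out_of_class`, and `check_inline_forms` are hypothetical and do not come from the question. Everything is collapsed into one file so that it compiles on its own; in a real project the class and the `inline` definition would sit in a header.

```cpp
#include <cassert>

// Hypothetical class, standing in for the contents of a header file.
class Widget
{
public:
    // Defined inside the class: implicitly an inline function,
    // whether or not the compiler inlines any particular call.
    int in_class() { return 1; }

    // Only declared here; the definition follows the class.
    int out_of_class();
};

// Defined outside the class but still in the "header": the `inline`
// keyword is what permits this definition to appear in several
// translation units. Without it, including the header from more than
// one .cpp file would typically end in a "multiple definition" link error.
inline int Widget::out_of_class() { return 2; }

// Hypothetical helper that exercises both functions.
void check_inline_forms()
{
    Widget w;
    assert(w.in_class() == 1);
    assert(w.out_of_class() == 2);
}
```

Calling `check_inline_forms()` from `main` is enough to try it out. Neither form forces the compiler to expand calls in place; both only change how duplicate definitions are treated at link time.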

score 2 · Accepted Answer · answered Dec 09 '13 at 10:18

Mats Petersson's answer is entirely correct, but I'll spin this in my own words with different coverage....

When you compile C++ code, you compile it a translation unit at a time... each translation unit typically consists of one implementation file (e.g. .cpp/.cc or whatever you've chosen to name it) and the header files it includes, and the compiler produces one .o file. When the compiler sees your foo.h and the definition for Foo::foo(), it will consider it a nominally inline function because the function body appears inside the class. As such, the compiler may or may not actually inline the function at points of call - that decision will depend upon the size/complexity of the function and the compiler's heuristics and options. So, Foo::foo may still end up as a separate out-of-line function in the .o files for both translation units.

Because the function's nominally inline, the compiler needs to make sure that the symbol is marked as a "weak symbol" (exact terminology may differ by OS/toolchain - this is implementation detail below the level of the C++ Standard) - see http://en.wikipedia.org/wiki/Weak_symbol

When objects are linked that have the same weak symbols in them, the code from one copy is kept and the other copies are discarded. Consequently, both .o files may have the function (despite it being nominally inline due to definition in class), but the executable linked from the .o files only has one copy.
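
One way to observe the "only one copy" behaviour from inside the program is to give the inline function a local static variable, which the C++ Standard requires to be a single object shared by every translation unit. The sketch below is an assumption-laden illustration, not the code from the question: `Foo::foo()` is changed to return a call count, `Foo1` and `Foo2` return that count, and `check_shared_copy` is a hypothetical helper. The three "files" are collapsed into one so the example compiles on its own.

```cpp
#include <cassert>

// Stand-in for foo.h. Unlike the question's Foo::foo(), this version
// returns a call count so that the sharing becomes observable.
class Foo
{
public:
    int foo()
    {
        static int calls = 0;  // one object for the whole program
        return ++calls;
    }
};

// Stand-in for foo1.cpp
int Foo1()
{
    Foo f1;
    return f1.foo();
}

// Stand-in for foo2.cpp
int Foo2()
{
    Foo f2;
    return f2.foo();
}

// Hypothetical helper: both callers advance the same counter.
void check_shared_copy()
{
    int first = Foo1();
    int second = Foo2();
    assert(second == first + 1);
}
```

If `Foo1` and `Foo2` are split back into separate .cpp files that include the same header, the assertion should still hold, because the linker keeps a single `Foo::foo` and a single `calls` for the whole executable.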

score 0 · Answer 3 · answered Dec 09 '13 at 08:55


Both functions Foo1 and Foo2 are in their own respective translation units. When you include foo.h, both get a copy of the entire Foo class. Thus, the linker uses each unit's respective copy.

Now if you implemented Foo::foo() inside its own source file, then both Foo1 and Foo2 would use the same foo function during linking.

answered Dec 09 '13 at 08:55

zackery.fix


What do you mean by "the linker uses each unit's respective copy"? Do you mean the final executable has the same code twice? – felknight Dec 09 '13 at 08:58
In a sense, yes. There may be times when these copies can be optimized away. However, in the above example each `.o` (trans. unit) file will have (and use) its own copy of `Foo::foo`. In the final executable, this may or may not be optimized away. If you WANT to use only one `Foo::foo` function in ALL units, then implement that function in a source file. – zackery.fix Dec 09 '13 at 09:00
Ok, I see, each trans. unit has its own copy of Foo::foo. What I wanted to know is which of those is used, since only one _ZN3Foo3fooEv can stay in the executable file. – felknight Dec 09 '13 at 09:09
Did you dump the exe to see if there are two copies of Foo::foo there? Technically, it would not matter, considering that you're using the same 'foo' function, just copied twice. I am assuming there are two symbols for Foo::foo in the exe, then again... the linker should optimize this into a single symbol. – zackery.fix Dec 09 '13 at 09:14
Yeap, it created a single symbol that is used by the two functions. – felknight Dec 09 '13 at 09:20
