Confusion about how symbols that have their definition in another file are handled

Question

I have read this and this; however, I still do not get it. Elaboration follows. (The example below is taken from the question that the second link answers, tweaked a bit.)

For example, say I have 3 files:

main.cpp

myfunction.cpp

myfunction.hpp

//main.cpp

#include "myfunction.hpp"
void localfunction() {}
int main() {
  int A = myfunction( 12 );
  ...
}

//myfunction.cpp

#include "myfunction.hpp"
int myfunction( int x ) {
  return x * x;
}

//myfunction.hpp

int myfunction( int x );

So the object file created from main.cpp has at least one symbol table.

But will this symbol table contain both localfunction and myfunction, with an actual address for localfunction (say 0x00233) and, instead of an address for myfunction, a "reference" that serves as a note telling the linker to connect this symbol with an address found somewhere else (that is, a definition)? And does the compiler store a real address only when it can see a definition directly in the current translation unit?

So the compiler checks whether it can find a definition for the symbol (which could also be an object). If it can, it puts an address directly in the symbol table; if it cannot, it stores a reference/note in place of the address, telling the linker that it has to find the address...

Then, when linking begins, does the linker search for the definition of each symbol that has been marked as "has no definition", or what?

This is what I understand so far based on my understanding of different sources (including the wiki article) but I am still really confused about this, cannot see it visually, and the understanding written above is probably off and it should be seen as a question rather than an explanation.

Can anybody help?

AFAIK the way it's done is implementation defined. – Hatted Rooster Aug 27 '17 at 17:39

score 0 · Answer 1 · answered Aug 27 '17 at 17:59

Although the exact process is implementation-defined, the overall approach is as follows:

Each object file has a table of symbols that it defines, along with the address of each symbol. This is localfunction/233 from your example. There is also a second definition for main/some address in that file.
In addition, each object file lists names that it needs, along with one or more places where the address needs to be plugged in. In your case that would be myfunction, along with the address inside main's body where the defined address is to be inserted. If you call myfunction from multiple places, there would be multiple places where the defined address needs to be plugged in.
When the linker assembles the executable, it checks the list of supplied definitions against the list of required definitions.
First, the linker checks that for each unsatisfied reference there is exactly one definition. Otherwise, it issues a "multiple definitions" or "undefined symbol" error.
After that, the linker plugs the address of each defined symbol into the spots from which it is referenced.

This concludes the linking process. Note that some symbols may remain unreferenced. In this case the linker can optionally remove the definitions from the final result.
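
As a rough illustration of the two tables and the check described above, here is a minimal C++ sketch. It is a toy model, not a real object-file format or a real linker: the `ObjectFile` struct and the `check_symbols` function are hypothetical names invented for this example, and real toolchains store this information in implementation-specific ways.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model of one object file (hypothetical, not a real format).
struct ObjectFile {
    std::string name;
    // First table: symbols this file defines, with the address of each.
    std::map<std::string, int> defined;
    // Second table: symbols this file needs, with the places where
    // the address of the definition has to be plugged in.
    std::map<std::string, std::vector<int>> needed;
};

// Checks the supplied definitions against the required ones.
// Returns a list of error messages; an empty list means linking can proceed.
std::vector<std::string> check_symbols(const std::vector<ObjectFile>& objects) {
    // Count how many object files define each symbol.
    std::map<std::string, int> definition_count;
    for (const auto& obj : objects)
        for (const auto& entry : obj.defined)
            ++definition_count[entry.first];

    std::vector<std::string> errors;
    // More than one definition of the same symbol is an error.
    for (const auto& entry : definition_count)
        if (entry.second > 1)
            errors.push_back("multiple definitions of " + entry.first);
    // A needed symbol that nobody defines is an error too.
    for (const auto& obj : objects)
        for (const auto& entry : obj.needed)
            if (definition_count.count(entry.first) == 0)
                errors.push_back("undefined symbol " + entry.first);
    return errors;
}
```

In this model, main.o would define localfunction and main and list myfunction as needed. `check_symbols` reports nothing when myfunction.o is supplied as well, and reports an undefined symbol when it is left out.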

Sergey Kalinichenko

Thank you. I am currently trying to grasp it and I will probably ask you some questions later on... But for now: How did you learn this? – Aug 27 '17 at 18:52
Regarding your second point - I do not really get that; how will myfunction and localfunction look in the symbol table for main.cpp? – Aug 27 '17 at 20:15
And I still do not get how the symbol table handles symbols that have no definitions in the current translation unit and tells the linker that it needs help, sorry. Can you please elaborate? – Aug 27 '17 at 20:37
@FacPam Let's say main starts at 300 and references myfunction at 308. Let's say it calls myfunction once more from the address 334. Then the second table (the addresses I need) would contain `myfunction`/308,334. The linker would see this, look up myfunction in the first section of the myfunction.o object file, and fill in its address (say, 582) at the addresses 308 and 334. – Sergey Kalinichenko Aug 28 '17 at 01:22
Okay, I think I got that. If we called myfunction from, say, address 350, would a reference/note be left in the symbol table telling the linker that, at this address, it has to find the definition in another file and plug that definition's address into the address we called myfunction from? – Aug 28 '17 at 05:57
Did I get it right? Also, can you confirm that C++ handles dependencies on other translation units through linkers _and_ symbol tables, and that it is in the symbol tables that the compiler leaves "notes" for the linker about which definitions need to be found in other translation units? Thank you so much for your time! – Aug 28 '17 at 06:00
@FacPam yes and yes. – Sergey Kalinichenko Aug 28 '17 at 09:42
Hi again! I was wondering what it would look like if the definition were available in the current translation unit. Would that definition then just have a known address, and would that address be connected to the name? I think it would be a little overkill to create a whole new question for this... :) Thanks again! – Sep 10 '17 at 15:22
Oh hold on, I think that that was the case for "localfunction", which you already have answered :) – Sep 10 '17 at 16:23
Or maybe not: In your first point, you say "This is localfunction/233 from your example", is 233 the address of the definition or the address where the definition has to be "plugged in" like as it was with names that did not have a symbol inside the current translation unit? – Sep 10 '17 at 16:27
If the "233" would be where the address needs to be plugged in, how is it then connecting the symbol with the definition that is available in the translation unit? Like it is for "localfunction" or maybe a variable that you just defined? I also just saw that I forgot to accept your answer... – Sep 10 '17 at 16:29
Sorry for the spam, but would you advise that I open a new question about what happens ("overall") when the definition is present in the current translation unit? – Sep 10 '17 at 20:00
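
The numbers from the comment exchange above (main at 300, calls to myfunction at 308 and 334, myfunction defined at 582) can be put into a small C++ sketch of the "plug in the address" step. This is only a toy model under stated assumptions: `Image` and `patch_references` are hypothetical names made up for this illustration, and a slot holding 0 stands for a spot left unfilled. In the answer's description, localfunction/233 sits in the first table, so 233 is the address of the definition itself and not a spot waiting to be filled.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy "image" of the program: address -> value stored at that address.
// A value of 0 stands for a slot that was left unfilled (an assumption
// of this sketch, not how real object files mark such slots).
using Image = std::map<int, int>;

// Plugs the address of each defined symbol into every spot that references it.
// `defined` plays the role of the first table (symbol -> address of definition),
// `needed` the role of the second table (symbol -> spots to fill in).
// Returns false if a needed symbol has no definition.
bool patch_references(const std::map<std::string, int>& defined,
                      const std::map<std::string, std::vector<int>>& needed,
                      Image& image) {
    for (const auto& entry : needed) {
        auto definition = defined.find(entry.first);
        if (definition == defined.end())
            return false;  // undefined symbol: nothing to plug in
        for (int site : entry.second)
            image[site] = definition->second;  // fill the spot with the address
    }
    return true;
}
```

With `defined` holding localfunction/233, main/300 and myfunction/582, and `needed` holding myfunction/308,334, the call fills slots 308 and 334 with 582 and leaves everything else alone.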
