I'm trying to make compilations of the GHC Haskell compiler 100% reproducible (byte-identical).
The object files are already byte-identical, but the final linked binary isn't.
GHC delegates the final linking to gcc, like this:
/usr/bin/gcc -fno-stack-protector -DTABLES_NEXT_TO_CODE -o Main Main.o [..some more files..] /tmp/ghc21220_0/ghc21220_5.o /tmp/ghc21220_0/ghc21220_7.o [...] '-Wl,--hash-size=31' -Wl,--reduce-memory-overheads
Interestingly, the file name of the temporary file ghc21220_7.o appears in the linked binary.
It seems that I am able to remove it with the strip tool.
Why does the file name appear there, and what is its purpose?
Is there a flag to tell gcc (or maybe ld?) not to include these file names?
Update: If I run objdump --syms on the binary, I see
0000000000000000 l df *ABS* 0000000000000000 ghc21220_5.c
0000000000000000 l df *ABS* 0000000000000000 ghc21220_7.c
According to this, d means debug and f means file. My question remains: why and how exactly do the file names of the .c files make it into the final binary, and can I suppress this at compile time (as opposed to running strip later)?
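
For reference, here is a minimal sketch of how the file symbols can be picked out of the objdump --syms output, so that two builds can be compared. It is only an inspection aid, not a fix. The function name file_symbols and the regular expression are my own; the sketch assumes the usual column layout (address, flag characters, section, size, name) and treats any entry whose flags contain a lowercase f as a file symbol.

```python
import re

# One symbol-table line as printed by `objdump --syms`:
#   address, flag characters, section, size, name.
# Assumption: flag characters come from the set objdump documents
# (l g u ! w C W I i d D F f O) and may be split by whitespace.
SYM_LINE = re.compile(
    r"^(?P<addr>[0-9a-fA-F]+)\s+"
    r"(?P<flags>(?:[lgu!wCWIidDFfO]+\s+)+)"
    r"(?P<section>\S+)\s+"
    r"(?P<size>[0-9a-fA-F]+)\s+"
    r"(?P<name>.+)$"
)


def file_symbols(objdump_output):
    """Return the names of all file symbols (flag 'f') in objdump --syms text."""
    names = []
    for line in objdump_output.splitlines():
        match = SYM_LINE.match(line.strip())
        # Lowercase 'f' marks a file symbol; uppercase 'F' marks a function.
        if match and "f" in match.group("flags"):
            names.append(match.group("name"))
    return names


if __name__ == "__main__":
    sample = (
        "0000000000000000 l df *ABS* 0000000000000000 ghc21220_5.c\n"
        "0000000000000000 l df *ABS* 0000000000000000 ghc21220_7.c\n"
    )
    for name in file_symbols(sample):
        print(name)
```

To use it on a real binary, the text can be obtained with subprocess.run(["objdump", "--syms", "Main"], capture_output=True, text=True).stdout and passed to file_symbols.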