Summary:
Cross-referenced functions between a STATIC and a SHARED lib cause all objects of the STATIC lib (even unused ones!) to be included in the final binary!
You don't understand what I mean, I suppose? :-p
Sit down and read the full story below! Names have been changed to protect the innocent. The example aims at simplicity and reproducibility.
Teaser: there's an SSCCE available! (Short, Self Contained, Correct (Compilable), Example: http://www.sscce.org/)
At the beginning, I had:
- a binary (main) calling a function (func1a()) stored in a STATIC lib (libsub.a). main also has an internal function (mainsub()).
- a STATIC lib (libsub.a) containing SEVERAL objects, each with several functions used by other sources.
Compiling main results in a binary containing ONLY a copy of the object(s) of the STATIC lib that hold the referenced functions.
In the example below, main will only contain the functions of object shared1.o (because main calls func1a()) and NOT the functions of shared2.o (because nothing references them).
OK!
  main.c                 libsub.a
+-------------+        +------------+
| main        |        | shared1.o  |
|  func1a()   | <----> |  func1a()  |
|  mainsub()  |        |  func1b()  |
+-------------+        |    ----    |
                       | shared2.o  |
                       |  func2a()  |
                       |  func2b()  |
                       +------------+
As an improvement, I wanted to allow 'external' people to override functions called in main with their own code, without having to recompile MY binary.
I provide neither the source nor my static lib anyway.
To do so, I intended to provide a "ready to fill" function skeleton source. (Is that what is called a USER EXIT?!)
Using a SHARED / DYNAMIC lib could do that, IMHO.
The functions that can be overridden are either internal to main (mainsub()) or shared functions (func1a() ...); their replacements would be stored in a shared lib (.so) that is added/referenced at link time.
New sources were created, prefixed with 'c', that contain the 'Client' version of the 'standard' functions.
How the override is switched on or off is out of scope. Just take it as given that if UE is true, the overriding version is called.
- cmain.c is a new source containing Client_mainsub(), which can be called 'in replacement' of mainsub().
- cshared1.c is a new source containing Client_func1a(), which can be called 'in replacement' of func1a(). Indeed, all functions in shared1.c can have their replacement in cshared1.c.
- cshared2.c is a new source containing Client_func2a(), which can be called 'in replacement' of func2a().
The overview becomes :
main.c                        libsub.a                       clibsub.so
+-----------------------+     +------------------------+     +--------------------+
| main                  |     | shared1.o              |     | cshared1.o         |
|  func1a() {}          |     |  func1a()              |     |  Client_func1a()   |
|  mainsub()            | <-> |   { if UE              | <-> |   {do ur stuff }   |
|   { if UE             |     |     Client_func1a()    |     |                    |
|     Client_mainsub()  |     |     return }           |     | cshared2.o         |
|     return }          |     |  func1b()              |     |  Client_func2a()   |
+-----------------------+     |        -------         |    >|   {do ur stuff }   |
            ^                 | shared2.o              |   / +--------------------+
cmain.c     v                 |  func2a()              |  /
+--------------------+        |   { if UE              | /
| cmain              |        |     Client_func2a()    |<
|  Client_mainsub()  |        |     return }           |
|   {do ur stuff }   |        |  func2b()              |
+--------------------+        +------------------------+
Here again, as main calls neither func2a() nor func2b(), the (STATIC) object shared2.o is not included in the binary, and no reference to the (SHARED) Client_func2a() exists either.
OK!
Finally, simply overriding functions was not enough (or too much!). I wanted external people to be able to call my function (or not) ... but ALSO to be able to do some stuff right BEFORE and/or right AFTER my function.
So instead of having func2a() stupidly replaced by Client_func2a(), we would have, roughly, in pseudo code:
shared2.c                    | cshared2.c
(assume UE=true)
                             |
func2a() {                   | Client_func2a() {
  if UE {                    |
    Client_func2a()     ==>      do (or not) some stuff PRE call
                             |
                             |   if (DOIT) {       // activate or not standard call
                             |     UE=false
                             |     func2a()        // do standard stuff
                             |     UE=true
                             |   } else
                             |     { do ur bespoke stuff }
                             |
                             |   do (or not) some stuff POST call
                             | }
                        <==
  } else
    { do standard stuff }
}
Remember that cshared2.c is provided to other people, who can (or not) do their own stuff in the provided skeleton.
(Note: setting UE to false and back to true in Client_func2a() avoids an infinite loop in the func2a() call! ;-) )
Now comes my problem.
In that case, the resulting binary now includes the shared2.o object even though main makes NO call to any function of shared2.c or cshared2.c!!!!!
After some searching, this looks to be caused by the cross calls/references:
shared2.o contains func2a(), which may call Client_func2a()
cshared2.o contains Client_func2a(), which may call func2a()
So why does the main binary contain shared2.o?
>dump -Tv main
main:
***Loader Section***
***Loader Symbol Table Information***
[Index] Value Scn IMEX Sclass Type IMPid Name
[0] 0x00000000 undef IMP RW EXTref libc.a(shr_64.o) errno
[1] 0x00000000 undef IMP DS EXTref libc.a(shr_64.o) __mod_init
[2] 0x00000000 undef IMP DS EXTref libc.a(shr_64.o) exit
[3] 0x00000000 undef IMP DS EXTref libc.a(shr_64.o) printf
[4] 0x00000000 undef IMP RW EXTref libc.a(shr_64.o) __n_pthreads
[5] 0x00000000 undef IMP RW EXTref libc.a(shr_64.o) __crt0v
[6] 0x00000000 undef IMP RW EXTref libc.a(shr_64.o) __malloc_user_defined_name
[7] 0x00000000 undef IMP DS EXTref libcmain.so Client_mainsub1
[8] 0x00000000 undef IMP DS EXTref libcshared.so Client_func1b
[9] 0x00000000 undef IMP DS EXTref libcshared.so Client_func1a
[10] 0x00000000 undef IMP DS EXTref libcshared.so Client_func2b <<< but why ??? ok bcoz func2b() is referenced ...
[11] 0x00000000 undef IMP DS EXTref libcshared.so Client_func2a <<< but why ??? ok bcoz func2a() is referenced ...
[12] 0x110000b50 .data ENTpt DS SECdef [noIMid] __start
[13] 0x110000b78 .data EXP DS SECdef [noIMid] func1a
[14] 0x110000b90 .data EXP DS SECdef [noIMid] func1b
[15] 0x110000ba8 .data EXP DS SECdef [noIMid] func2b <<< but why this ? Not a single call is made in main ???
[16] 0x110000bc0 .data EXP DS SECdef [noIMid] func2a <<< but why this ? Not a single call is made in main ???
Note that simply commenting out the calls to func2a() (and func2b()) in cshared2.c solves the link issue (it breaks the cross reference)... but that's not possible, as I would like to keep a shared lib!?
This behavior happens on AIX 7.1 with IBM XL C/C++ 12.1, but it looks to be the same on Linux (Red Hat 5 + GCC 5.4, with some small changes in the compilation parameters).
IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72)
Version: 12.01.0000.0000
Driver Version: 12.01(C/C++) Level: 120315
C Front End Version: 12.01(C/C++) Level: 120322
High-Level Optimizer Version: 12.01(C/C++) and 14.01(Fortran) Level: 120315
Low-Level Optimizer Version: 12.01(C/C++) and 14.01(Fortran) Level: 120321
So I figure this is surely a misunderstanding on my side. Can anyone explain?
As promised, here is the SSCCE. You can reproduce my problem by recreating/downloading the following small files and running go.sh (see the comments inside the script).
Edit1: added the code to the question itself, rather than to an external site, as suggested
main.c
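#include <stdio.h>
#include "inc.h"
extern void func1a (void), func1b (void);   /* from libsub.a (shared1.o) */
extern void Client_mainsub1 (void);         /* from libcmain.so */
void mainsub (void);

/* User-exit switch: non-zero means "call the client version". */
int UEXIT(char* file, char* func)
{
    printf(" UEXIT file=<%s> func=<%s>\n",file,func);
    return 1; /* always true for testing */
}

int main (void){
    printf(">>> main\n");
    func1a ();
    mainsub ();
    printf("<<< main\n");
    return 0;
}

void mainsub (void) {
    printf(">>> mainsub\n");
    if(UEXIT("main","mainsub")) {
        Client_mainsub1();
        return;
    }
    printf("<<< mainsub\n");
}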
cmain.c
#include <stdio.h>
#include "inc.h"
void Client_mainsub1 () {
    printf(">>>>>> Client_mainsub1\n");
    printf("<<<<<< Client_mainsub1\n");
    return;
}
inc.h
extern int UEXIT(char * fileName, char * functionName);
shared1.c
#include <stdio.h>
#include "inc.h"
/* Client versions, provided by libcshared.so */
extern void Client_func1a (void), Client_func1b (void);

void func1a (void){
    printf(">>>>> func1a\n");
    if(UEXIT("main","func1a")) {
        Client_func1a();
        return;
    }
    printf("<<<<< func1a\n");
}

void func1b (void){
    printf(">>>>> func1b\n");
    if(UEXIT("main","func1b")){
        Client_func1b();
        return;
    }
    printf("<<<<< func1b\n");
}
shared2.c
#include <stdio.h>
#include "inc.h"
/* Client versions, provided by libcshared.so */
extern void Client_func2a (void), Client_func2b (void);

void func2a (void){
    printf(">>>>> func2a\n");
    if(UEXIT("main","func2a")) {
        Client_func2a();
        return;
    }
    printf("<<<<< func2a\n");
}

void func2b (void){
    printf(">>>>> func2b\n");
    if(UEXIT("main","func2b")){
        Client_func2b();
        return;
    }
    printf("<<<<< func2b\n");
}
cshared1.c
#include <stdio.h>
#include "inc.h"
/* Standard versions, defined in libsub.a (shared1.o) */
extern void func1a (void), func1b (void);

void Client_func1a (void) {
    int standardFunctionCall = 0;
    printf("\t>>>> Client_func1a\n");
    if (standardFunctionCall) {
        func1a();
    }
    printf("\t<<< Client_func1a\n");
    return;
}

void Client_func1b (void) {
    int standardFunctionCall = 0;
    printf("\t>>>> Client_func1b\n");
    if (standardFunctionCall) {
        func1b();
    }
    printf("\t<<< Client_func1b\n");
    return;
}
cshared2.c
#include <stdio.h>
#include "inc.h"
/* Standard versions, defined in libsub.a (shared2.o) */
extern void func2a (void), func2b (void);

void Client_func2a (void) {
    int standardFunctionCall = 0;
    printf("\t>>>> Client_func2a\n");
    if (standardFunctionCall) {
        func2a(); /* !!!!!! comment this to avoid crossed link with shared2.c !!!!! */
    }
    printf("\t<<< Client_func2a\n");
    return;
}

void Client_func2b (void) {
    int standardFunctionCall = 0;
    printf("\t>>>> Client_func2b\n");
    if (standardFunctionCall) {
        func2b(); /* !!!!!! ALSO comment this to avoid crossed link with shared2.c !!!!! */
    }
    printf("\t<<< Client_func2b\n");
    return;
}
go.sh
#!/bin/bash
## usage :
## . ./go.sh
## so that the redefinition of LIBPATH is propagated to calling ENV ...
## otherwise : "Dependent module libcshared.so could not be loaded."
# default OBJECT_MODE to 64 bit (avoids explicitly setting -X64 options...)
export OBJECT_MODE=64
export LIBPATH=.:$LIBPATH
# Compile client functions for target binary
cc -q64 -c -o cmain.o cmain.c
# (1) Shared lib for internal function
cc -G -q64 -o libcmain.so cmain.o
# Compile common functions
cc -c shared2.c shared1.c
# Compile client overrides of the common functions
cc -c cshared2.c cshared1.c
# (2) Build libsub.a for common functions (STATIC)
ar -rv libsub.a shared1.o shared2.o
# (3) Build libcshared.so for client overrides of the common functions (SHARED)
cc -G -q64 -o libcshared.so cshared1.o cshared2.o
# Finally build the binary using (1) (2) (3) above
# main only calls func1a(), so it should only include object shared1
# But in practice shared2 is also included if cshared2 references a possible call to func2a() in shared2 !!!!????
# Check this with "nm main |grep shared2" or "nm main |grep func2" or "dump -Tv main |grep func2"
cc -q64 -o main main.c -bstatic libsub.a -bshared libcmain.so libcshared.so
# result is the same without specifying -bstatic or -bshared
#cc -q64 -o main2 main.c libsub.a libcmain.so libcshared.so
#If I split libcshared.so into libcshared1.so and libcshared2.so it is also the same :
#cc -G -q64 -o libcshared1.so cshared1.o
#cc -G -q64 -o libcshared2.so cshared2.o
#cc -q64 -o main4 main.c -bstatic libsub.a -bshared libcmain.so libcshared1.so libcshared2.so
#If I do not include libcshared2.so, the binary of course works well, without reference to cshared2 nor shared2.
# So why does the linker choose to add the STATIC shared2.o only if libcshared2.so is listed?
# Is there a way to avoid adding this unused code?
#cc -q64 -o main4 main.c -bstatic libsub.a -bshared libcmain.so libcshared1.so
Edit2 : added RedHat version of go.sh script as requested
gored.sh
## usage :
## . ./gored.sh
## so that the redefinition of LD_LIBRARY_PATH is propagated to calling ENV ...
## otherwise : "Dependent module libcshared.so could not be loaded."
export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
# Compile client functions for target binary
gcc -fPIC -c cmain.c
# (1) Shared lib for internal function
gcc -shared -o libcmain.so cmain.o
# Compile common functions
gcc -c shared2.c shared1.c
# Compile client overrides of the common functions
gcc -fPIC -c cshared2.c cshared1.c
# (2) Build libsub.a for common functions (STATIC)
ar -rv libsub.a shared1.o shared2.o
# (3) Build libcshared.so for client overrides of the common functions (SHARED)
gcc -shared -o libcshared.so cshared1.o cshared2.o
# Finally build the binary using (1) (2) (3) above
# main only calls func1a(), so it should only include object shared1
# But in practice shared2 is also included if cshared2 references a possible call to func2a() in shared2 !!!!????
# Check this with "nm main |grep shared2" or "nm main |grep func2" or "dump -Tv main |grep func2"
gcc -o main main.c libcmain.so libcshared.so libsub.a
#If I split libcshared.so into libcshared1.so and libcshared2.so it is also the same :
gcc -shared -o libcshared1.so cshared1.o
gcc -shared -o libcshared2.so cshared2.o
cc -o main2 main.c libcmain.so libcshared1.so libcshared2.so libsub.a
#If I do not include libcshared2.so, the binary of course works well, without reference to cshared2 nor shared2.
# So why does the linker choose to add the STATIC shared2.o only if libcshared2.so is listed?
# Is there a way to avoid adding this unused code?
cc -o main3 main.c libcmain.so libcshared1.so libsub.a
Or here are all the above files (without gored.sh) in a single .tar.bz2 (6KB).
Just copy/paste it into a new file (e.g. poc.uue). Then type
uudecode poc.uue
and you should get poc.tar.bz2.
Decompress it, untar it, go into the poc folder and run
. ./go.sh
then
dump -Tv main
or, if under RedHat,
nm main
Example of the result after gored.sh:
poc>nm main |grep func2
                 U Client_func2a
                 U Client_func2b
0000000000400924 T func2a
000000000040095d T func2b
poc>nm main2 |grep func2
                 U Client_func2a
                 U Client_func2b
0000000000400934 T func2a
000000000040096d T func2b
poc>nm main3 |grep func2
poc>
Edit3: ASCII ART! :-)
Here's the 'visual' final state, with the unused objects/references that I think the linker is wrong to include. Or that it is at least not smart enough to detect as unused.
Maybe that's normal, or maybe there's an option to avoid having unused static code in the final binary. This doesn't look like a complex situation, as the code surrounded by the 'UNUSED !?' frame is linked with nothing else, is it?
main.c                       libsub.a                                clibsub.so
+-----------------------+    +-------------------------+             +-----------------------------+
| main                  |    | +---------------------+ |             | +-------------------------+ |
|  func1a(); <-------------\ | |shared1.o            | |             | | cshared1.o              | |
|  mainsub()            |   \------>func1a() {  <-------------+     /-----> Client_func1a() {    | |
|   { if UE {           |    | |      if UE {        | |      |    / | |      PRE-stuff          | |
|     Client_mainsub()  |    | |        Client_func1a() <-----C---/  | |      if (DOIT) {        | |
|     return  ^         |    | |        return       | |      |      | |        UE=false         | |
|     }       |         |    | |      } else {       | |      +----------------> func1a()        | |
|   }         |         |    | |        do std stuff | |             | |        UE=true          | |
+-------------|---------+    | |      }              | |             | |      } else {           | |
              |              | |                     | |             | |        do bespoke stuff | |
              |              | |    func1b() {       | |             | |      }                  | |
              |              | |      same as above  | |             | |      POST-stuff         | |
              |              | |    }                | |             | |    }                    | |
              |              | +---------------------+ |             | |    Client_func1b() {}   | |
              |              |                         |             | +-------------------------+ |
              |           ***|*******U*N*U*S*E*D**?!***|*****U*N*U*S*E*D**?!*******U*N*U*S*E*D**?!****
              |           *  | +---------------------+ |             | +-------------------------+ | *
              |           U  | |shared2.o            | |             | | cshared2.o              | | U
              |           *  | |    func2a() {  <-------------+     /-----> Client_func2a() {    | | *
              |           N  | |      if UE {        | |      |    / | |      PRE-stuff          | | N
cmain.so      |           *  | |        Client_func2a() <-----C---/  | |      if (DOIT) {        | | *
+-------------|------+    U  | |        return       | |      |      | |        UE=false         | | U
| cmain.o     v      |    *  | |      } else {       | |      +----------------> func2a()        | | *
|  Client_mainsub()  |    S  | |        do std stuff | |             | |        UE=true          | | S
|   {do ur stuff }   |    *  | |      }              | |             | |      } else {           | | *
+--------------------+    E  | |                     | |             | |        do bespoke stuff | | E
                          *  | |    func2b() {       | |             | |      }                  | | *
                          D  | |      same as above  | |             | |      POST-stuff         | | D
                          *  | |    }                | |             | |    Client_func2b() {}   | | *
                          *  | +---------------------+ |             | +-------------------------+ | *
                          ?  +-------------------------+             +-----------------------------+ ?
                          !                                                                          !
                          *********U*N*U*S*E*D**?!*************U*N*U*S*E*D**?!******U*N*U*S*E*D**?!***
Any constructive answer that puts me on the right track is welcome.
Thanks.