I have a C++ program generated by wrap.py, a tool that produces wrappers for MPI programs. It redirects every normal MPI call to the corresponding PMPI call so that the call can be intercepted, e.g. for performance analysis. Please download the generated code here. I use OTF2 to trace the MPI program.
To explain the code:
// test4.cpp
__attribute__((constructor)) void init(void)
{
    if(!is_init)
    {
        archive = OTF2_Archive_Open( "./",
                                     "ArchiveTest",
                                     OTF2_FILEMODE_WRITE,
                                     1024 * 1024 /* event chunk size */,
                                     4 * 1024 * 1024 /* def chunk size */,
                                     OTF2_SUBSTRATE_POSIX,
                                     OTF2_COMPRESSION_NONE );
        is_init = true;
    }
}
__attribute__((destructor)) void fini(void)
{
    if(is_init)
    {
        OTF2_Archive_Close( archive );
        is_init = false;
    }
}
I am going to compile the code into a .so file, so that the constructor is called when the library is loaded and the destructor is called when the library is unloaded.
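The load/unload behaviour described above can be illustrated with a minimal sketch that does not depend on OTF2. The names demo_init, demo_fini and init_calls are hypothetical and exist only for this illustration; the constructor/destructor attributes are GCC/Clang extensions.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-ins for the state kept in test4.cpp.
static bool is_init = false;
static int init_calls = 0;  // illustration only: counts how often the constructor body ran

// Runs when the shared object is loaded (or before main() if linked into an executable).
__attribute__((constructor)) static void demo_init(void)
{
    if (!is_init) {
        ++init_calls;
        is_init = true;
        std::fprintf(stderr, "demo_init: library loaded\n");
    }
}

// Runs when the shared object is unloaded or the process exits normally.
__attribute__((destructor)) static void demo_fini(void)
{
    if (is_init) {
        assert(init_calls == 1);  // sanity check: initialisation happened exactly once
        is_init = false;
        std::fprintf(stderr, "demo_fini: library unloaded\n");
    }
}
```

Built with -fPIC -shared and loaded via LD_PRELOAD, a library like this should print the first message before the target program's main() starts and the second one when the process exits normally.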
Following the official OTF2 documentation here, I compile the program:
mpic++ -fpic -c `otf2-config --cflags` -o test4.o test4.cpp
mpic++ -shared -o libtest4.so `otf2-config --ldflags` `otf2-config --libs` test4.o
Expanding the otf2-config calls in the commands above gives:
mpic++ -fpic -c -I/usr/include -o test4.o test4.cpp
mpic++ -shared -o libtest4.so -L/usr/lib -lotf2 -lm test4.o
The MPI program being intercepted is from here.
Running the interception:
$ mpirun -n 2 -x LD_PRELOAD=./libtest4.so ./send_recv
./send_recv: symbol lookup error: ./libtest4.so: undefined symbol: OTF2_Archive_Open
./send_recv: symbol lookup error: ./libtest4.so: undefined symbol: OTF2_Archive_Open
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[20246,1],0]
Exit code: 127
--------------------------------------------------------------------------
So it looks like mixing C and C++ causes the problem: the symbols for the C functions OTF2_Archive_Open and OTF2_Archive_Close apparently cannot be resolved.
I added two declarations to tell the compiler that these are C functions (download the modified program here):
_EXTERN_C_ OTF2_Archive* OTF2_Archive_Open ( const char * archivePath,
                                             const char * archiveName,
                                             const OTF2_FileMode fileMode,
                                             const uint64_t chunkSizeEvents,
                                             const uint64_t chunkSizeDefs,
                                             const OTF2_FileSubstrate fileSubstrate,
                                             const OTF2_Compression compression
                                           );
_EXTERN_C_ OTF2_ErrorCode OTF2_Archive_Close ( OTF2_Archive * archive );
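For reference, here is a minimal sketch of what such a declaration does. It assumes that _EXTERN_C_ expands to extern "C" when compiled as C++, which is my understanding of the macro in wrap.py-generated code; demo_add is a hypothetical function used only for illustration.

```cpp
#include <cassert>

// Assumption: the generated code defines _EXTERN_C_ roughly like this.
#ifndef _EXTERN_C_
#ifdef __cplusplus
#define _EXTERN_C_ extern "C"
#else
#define _EXTERN_C_
#endif
#endif

// Hypothetical function declared with C linkage, so its symbol name is the
// plain "demo_add" rather than a C++-mangled name.
_EXTERN_C_ int demo_add(int a, int b);

// The definition inherits C linkage from the earlier declaration.
int demo_add(int a, int b)
{
    return a + b;
}
```

Running nm on the resulting object file should list the symbol as plain demo_add. Note that the symbol in the error message above, OTF2_Archive_Open, is also already unmangled, so name mangling may not be what is going wrong here.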
But the problem remains. Any advice?
UPDATE1: OTF2 provides a .a file (static library), not a .so file.
$ nm /usr/lib/libotf2.a| grep -i OTF2_Archive_Open
U otf2_archive_open
0000000000000000 T OTF2_Archive_Open
U otf2_archive_open_def_files
00000000000032e0 T OTF2_Archive_OpenDefFiles
U otf2_archive_open_evt_files
00000000000030e0 T OTF2_Archive_OpenEvtFiles
U otf2_archive_open_snap_files
00000000000034e0 T OTF2_Archive_OpenSnapFiles
U OTF2_Archive_Open
0000000000001180 T otf2_archive_open
0000000000005a40 T otf2_archive_open_def_files
U OTF2_Archive_OpenDefFiles
0000000000005880 T otf2_archive_open_evt_files
U OTF2_Archive_OpenEvtFiles
0000000000005c00 T otf2_archive_open_snap_files
U OTF2_Archive_OpenSnapFiles
$ ldd ./libtest4.so
linux-vdso.so.1 => (0x00007ffe3a6ce000)
libmpi_cxx.so.1 => /usr/lib/libmpi_cxx.so.1 (0x00007f4757d67000)
libmpi.so.12 => /usr/lib/libmpi.so.12 (0x00007f4757a91000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f475770e000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f47574f8000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f475712e000)
libibverbs.so.1 => /usr/lib/libibverbs.so.1 (0x00007f4756f1e000)
libopen-rte.so.12 => /usr/lib/libopen-rte.so.12 (0x00007f4756ca4000)
libopen-pal.so.13 => /usr/lib/libopen-pal.so.13 (0x00007f4756a07000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f47567e9000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f47564e0000)
/lib64/ld-linux-x86-64.so.2 (0x00005620bef03000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f47562dc000)
libhwloc.so.5 => /usr/lib/x86_64-linux-gnu/libhwloc.so.5 (0x00007f47560a1000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f4755e99000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f4755c96000)
libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007f4755a8a000)
libltdl.so.7 => /usr/lib/x86_64-linux-gnu/libltdl.so.7 (0x00007f4755880000)
$ nm ./libtest4.so | grep -i OTF2_Archive_Open
U OTF2_Archive_Open
The weird thing is that I don't see any libotf2.a in the output of ldd. But if you try out the standard OTF2 MPI writer example from their website, it works, and the output of ldd for that example doesn't contain libotf2.a either.
You can find the example here.