I have been Googling this one for a good long while now and I'm not seeing anything quite like it, so here goes. I am trying to create a small statically-linked binary which can be easily distributed across machines on my home network. This is a pretty small project so I'm trying to keep things simple.

I am running into substantial difficulties when I statically link the pthread library on the ARM 32-bit architecture. Frustratingly, the very same code works just fine on all versions of x86. Here is my test.cpp program:

#include <iostream>
#include <thread>

void threader( int num ) {

        std::cout << "Child Thread Starting" << std::endl;

        try {
                throw 20;
        } catch (int e) {
                std::cout << "Child Thread Success" << std::endl;
        }

        int x = 0;
        do {
                x++;
        } while (true);
}

int main(int argc, char *argv[]) {

        std::cout << "Main Thread Starting" << std::endl;

        // Deliberately never joined or deleted: both threads spin forever.
        new std::thread(&threader, 0);

        try {
                throw 20;
        } catch (int e) {
                std::cout << "Main Thread Success" << std::endl;
        }

        int y = 0;
        do {
                y++;
        } while (true);
}

The idea is that the main thread starts a child thread, then tests to see if the main thread can throw an exception, then spins. Meanwhile, the child thread also tests to see if it can throw an exception, then spins. The original code throws a boost-brand exception, leading to a crash. This minimal example has identical behavior.
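For scripted runs, a terminating variant of the same check may be easier to work with. The following is only a sketch: it assumes that joining the child instead of spinning does not change the failure, which has not been verified on the ARM machine. The names throw_and_catch and run_throw_test are hypothetical and are not part of the original test.cpp.

```cpp
#include <atomic>
#include <functional>
#include <iostream>
#include <thread>

// Hypothetical helper (not in the original test.cpp): throws an int,
// catches it, and bumps the shared counter if the catch was reached.
static void throw_and_catch(const char* who, std::atomic<int>& caught) {
    try {
        throw 20;
    } catch (int) {
        std::cout << who << " Thread Success" << std::endl;
        ++caught;
    }
}

// Runs the throw/catch check in a child thread and in the calling thread,
// then joins instead of spinning. Returns the number of successful catches;
// 2 means both threads were able to throw and catch.
int run_throw_test() {
    std::atomic<int> caught{0};
    std::thread child(throw_and_catch, "Child", std::ref(caught));
    throw_and_catch("Main", caught);
    child.join();
    return caught.load();
}
```

Calling run_throw_test() from main and turning "result == 2" into the exit status would give a pass/fail value that can be checked on each machine.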

A successful result on x86 is as follows:

# g++ -m32 -std=c++11 -c -g test.cpp
# g++ -static -m32 *.o -o test -lrt -pthread
# ./test
Main Thread Starting 
Child Thread Starting 
Main Thread Success 
Child Thread Success 

However, on ARM, I get a variety of errors depending on how exactly I link things. For reference, I started with the simple:

$ g++ -std=c++11 -c -g test.cpp             
$ g++ -static *.o -o test -lrt -pthread
$ ./test
Main Thread Starting
Child Thread Starting
Main Thread Success
Segmentation fault

The segmentation fault is not very helpful:

(gdb) bt
#0  0x00012be8 in __cxa_throw ()
#1  0x000108d0 in threader (num=0) at test.cpp:11
#2  0x00011a26 in std::_Bind_simple<void (*(int))(int)>::_M_invoke<0u>(std::_Index_tuple<0u>) (this=0xda50c) at /usr/include/c++/4.8/functional:1732
#3  0x00011950 in std::_Bind_simple<void (*(int))(int)>::operator()() (this=0xda50c) at /usr/include/c++/4.8/functional:1720
#4  0x0001190a in std::thread::_Impl<std::_Bind_simple<void (*(int))(int)> >::_M_run() (this=0xda500) at /usr/include/c++/4.8/thread:115
#5  0x0001befc in execute_native_thread_routine ()
#6  0x0004bcc2 in start_thread (arg=0x0) at pthread_create.c:335
#7  0x00071b4c in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

When a plain throw is causing segmentation faults, you've got problems. When I remove the "-static" flag, everything executes perfectly, as in the x86 case.

After extensive googling I found that this problem is apparently quite common. Other answers include:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52590

when g++ static link pthread, cause Segmentation fault, why?

And many other similar ones. The key recommendation appears to be to link with the "-Wl,--whole-archive -lpthread -Wl,--no-whole-archive" phrase. Ok, let's give it a go:

$ g++ -std=c++11 -c -g test.cpp  
$ g++ -static *.o -o test -lrt -pthread -Wl,--whole-archive -lpthread -Wl,--no-whole-archive
$ ./test
Main Thread Starting
Child Thread Starting
terminate called after throwing an instance of 'Segmentation fault

Well, it gets points for being different. GDB:

(gdb) r
Starting program: ./test 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Main Thread Starting
[New Thread 0x76ffc2d0 (LWP 30844)]
Child Thread Starting
terminate called after throwing an instance of 'int'
terminate called recursively

Thread 1 "test" received signal SIGABRT, Aborted.
0x00055266 in __libc_do_syscall ()
(gdb) bt
#0  0x00055266 in __libc_do_syscall ()
#1  0x0001ab66 in raise (sig=6) at ../sysdeps/unix/sysv/linux/pt-raise.c:35
#2  0x0005a85a in abort ()
#3  0x0004bd3c in __gnu_cxx::__verbose_terminate_handler() ()
#4  0x00026d34 in __cxxabiv1::__terminate(void (*)()) ()
#5  0x00026d50 in std::terminate() ()
#6  0x0001ca3c in __cxa_rethrow ()
#7  0x0004bd1c in __gnu_cxx::__verbose_terminate_handler() ()
#8  0x00026d34 in __cxxabiv1::__terminate(void (*)()) ()
#9  0x00026d50 in std::terminate() ()
#10 0x0001c9fc in __cxa_throw ()
#11 0x0001098e in main (argc=1, argv=0x7efff704) at test.cpp:28
(gdb) frame 11
#11 0x0001098e in main (argc=1, argv=0x7efff704) at test.cpp:28
28          throw 20;
(gdb)

Two things to note: one, it is now the parent thread which is crashing, and two, it is apparently an 'int' which is segfaulting. Ok... that's special.

There is another answer on SO which seems to note something similar to this:

GCC: --whole-archive recipe for static linking to pthread stopped working in recent gcc versions

But this solution recommends jiggering around the order and frequency of the -lrt and -lpthread flags. I have tried... a great many combinations of these. It always results in the above behavior.

I will also note that the example program in the above issue runs perfectly fine on the problem ARM32 system. However, it immediately breaks if I add the "throw 20" test block to the thread function.

For the record, I have also tried many combinations with boost::thread and have also tried compiling with clang, all to the same results.

At this point I am at a complete loss and throw myself on the mercies of the internet. Does anyone have any idea what is going on here, or how I can investigate more?
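One possible way to investigate further is sketched below. The explanation usually given for the --whole-archive workaround is that libstdc++ reaches pthread functions through weak references, and a static link does not pull archive members in for weak references, so they can end up as null addresses. Whether that is what happens on this ARM system is an assumption. The function name count_missing_pthread_symbols and the choice of symbols are hypothetical; the sketch relies on GCC's "#pragma weak" and on GCC allowing a function pointer to be cast to a data pointer.

```cpp
#include <pthread.h>
#include <cstdio>

// Turn these references into weak ones, so that an unresolved symbol
// shows up as a null address instead of a link error.
#pragma weak pthread_create
#pragma weak pthread_once
#pragma weak pthread_key_create
#pragma weak pthread_mutex_lock
#pragma weak pthread_cond_broadcast

// Hypothetical diagnostic: prints which of the symbols above resolved
// in this binary and returns how many did not.
int count_missing_pthread_symbols() {
    struct Entry { const char* name; const void* addr; };
    const Entry entries[] = {
        {"pthread_create",         reinterpret_cast<const void*>(&pthread_create)},
        {"pthread_once",           reinterpret_cast<const void*>(&pthread_once)},
        {"pthread_key_create",     reinterpret_cast<const void*>(&pthread_key_create)},
        {"pthread_mutex_lock",     reinterpret_cast<const void*>(&pthread_mutex_lock)},
        {"pthread_cond_broadcast", reinterpret_cast<const void*>(&pthread_cond_broadcast)},
    };
    int missing = 0;
    for (const Entry& e : entries) {
        std::printf("%-24s %s\n", e.name, e.addr ? "present" : "MISSING (null)");
        if (!e.addr) ++missing;
    }
    return missing;
}
```

Building this with each of the link lines above and calling count_missing_pthread_symbols() from main would show whether a given combination of flags actually changes which symbols end up in the static binary.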

gf3dev
  • probably not related but note that `do { } while(true);` is strictly undefined behaviour – Alan Birtles Oct 27 '20 at 10:08
  • Hey Alan. I have augmented the empty do { } while loop to do some simple counting as it spins. Unfortunately, this has resulted in no change. – gf3dev Oct 27 '20 at 10:12
  • I suspect you should also compile with -pthread as well as link. Using -pthread sets other options, like defining _REENTRANT. – janm Oct 27 '20 at 10:21
  • Hey Janm, would you mind giving me some more words on what you propose? Is this a matter of setting compile flags properly, or do you recommend including some header files? I've tried a few obvious options with no result. – gf3dev Oct 27 '20 at 10:31
  • You need to pass `-pthread` on the compiler command in addition to the linker command – Alan Birtles Oct 27 '20 at 10:43
  • I am sad to report that adding -pthread to the compilation step does not change the results at all. – gf3dev Oct 27 '20 at 10:58
  • Static linking is a bad idea anyway, it is poorly supported and most people recommend against it. – n. m. could be an AI Oct 27 '20 at 12:20
  • @gf3dev In your examples, you compile "test.c" but link against "*.o". Why? For a simple test, create a single source file that shows your problem, and then compile and link on one command with -pthread, something like `g++ -static -pthread -o test test.cpp -lrt -Wl,--whole-archive -lpthread -Wl,--no-whole-archive` – janm Oct 27 '20 at 16:12
  • @gf3dev All of your object files should be compiled with the same "-pthread" option. – janm Oct 27 '20 at 16:12
  • @janm I was breaking up the compilation step (.cpp -> .o) as it made testing fractionally faster. I have tried what you suggested and combined both steps into one command, but I'm afraid that it has had no effect on the issue. " terminate called after throwing an instance of 'int' " Also, the single source file you requested is at the top of this help request, if you'd like to try it out if you happen to have an ARM32 machine laying around. – gf3dev Oct 27 '20 at 21:03
