
I've been struggling with a weird problem for the last few days. We build some libraries with GCC 4.8 that link some of their dependencies statically, e.g. log4cplus or boost. For these libraries we have created Python bindings using boost-python.

Every time such a library used TLS (as log4cplus does in its static initialization, or as libstdc++ does when throwing an exception, so not only during the initialization phase), the whole thing crashed with a segfault, and every time the address of the thread-local variable was 0.

I tried everything: recompiling, ensuring -fPIC is used, ensuring -ftls-model=global-dynamic is used, etc. No success. Then today I found out that the cause of these crashes was the way we linked in OpenMP: we used "-lgomp" instead of just "-fopenmp". Since I changed this, everything works fine - no crashes, no nothing. Fine!

But I'd really like to know what the cause of the problem was. So what's the difference between these two ways of linking in OpenMP?

We have a CentOS 5 machine on which GCC 4.8 is installed in /opt/local/gcc48, and we are sure that both the libgomp and the libstdc++ from /opt/local/gcc48 were used (checked with LD_DEBUG).

Any ideas? Haven't found anything on Google - or I used the wrong keywords :)

duselbaer
  • -pthread or -lpthread had been there – duselbaer Apr 08 '14 at 13:24
  • Compile with `-v` and compare the output... – Marc Glisse Apr 08 '14 at 13:25
  • Adding -v as a linker option shows that -fopenmp implicitly adds a -lgomp at the end. Everything else stays the same. Without -fopenmp I have "-lstdc++ -lm -lgcc_s -lpthread -lc -lgcc_s" and with -fopenmp it becomes "-lstdc++ -lm -lgomp -lgcc_s -lpthread -lc -lgcc_s". I still don't see the reason for the crashes because all of these libraries are linked dynamically :( – duselbaer Apr 09 '14 at 05:15
  • Then it could be the order of the -l flags that matters. Maybe it is important that -lgomp is before -lpthread or some other permutation. You could try playing with LD_PRELOAD to see if loading the dependencies in a different order makes a difference. – Marc Glisse Apr 09 '14 at 07:50
  • LD_PRELOADing libgomp.so indeed works - so something interesting seems to happen when loading OpenMP regarding TLS... let's see if we can find out what exactly goes on there... – duselbaer Apr 09 '14 at 08:14

1 Answer


OpenMP is an intermediary between your code and its execution. Each #pragma omp directive is converted to calls to the corresponding OpenMP library functions, and that's all there is to it. The multithreaded execution (launching threads, joining and synchronizing them, etc.) is always handled by the operating system (OS). All OpenMP does is wrap these low-level, OS-dependent threading calls for us portably in a short and sweet interface.

The -fopenmp flag is a high-level one that does more than link in GCC's OpenMP implementation (gomp). The gomp library requires further libraries to access the threading functionality of the OS. On POSIX-compliant OSes, OpenMP is usually based on pthread, which needs to be linked. It may also need the realtime extension library (librt) on some OSes but not on others. With dynamic linking, everything should be discovered automatically, but if you specified -static, I think you've run into the situation described by Jakub Jelinek here. Nowadays, though, pthread (and rt if needed) should be linked automatically when -static is used.

Aside from linking dependencies, the -fopenmp flag also activates the processing of OpenMP pragmas. You can see throughout the GCC code (as here and here) that without the -fopenmp flag (which isn't triggered by merely linking the gomp library), the pragmas won't be converted to the appropriate OpenMP function calls. I just tried with some example code, and both -lgomp and -fopenmp produce a working executable that links against the same libraries. The only difference in my simple example is that the -fopenmp version has a symbol that the -lgomp version doesn't have: GOMP_parallel@@GOMP_4.0+ (code here), which is the function that initializes the parallel section and performs the forks requested by the #pragma omp parallel in my example code. Thus, the -lgomp version did not translate the pragma into a call to GCC's OpenMP implementation. Both produced a working executable, but only the -fopenmp flag produced a parallel executable in this case.

To wrap up, -fopenmp is needed for GCC to process the OpenMP pragmas. Without it, your parallel sections won't fork any threads, which could wreak havoc depending on the assumptions your code makes.

Soravux
  • With your example, did you build (compile) the source with `-fopenmp` and then use your `.o` when linking with `-fopenmp` / `-lgomp`? When you use `gcc -fopenmp example.c`, it enables the omp pragmas during compilation and adds the library when linking; but a single compile+link command of the form `gcc -lgomp example.c` does not pass the OpenMP-enabling option to the compilation, and the omp pragmas are ignored. – osgx Nov 02 '17 at 16:35
  • I may be wrong, but I believe we are saying the same thing? I wrote "Only `-fopenmp` will produce a parallel executable" while you wrote something like "`-lgomp` will not produce a parallel executable", or did I miss something? (As I wrote, pragmas won't be converted to a function call without `-fopenmp`) In any case, I agree with what you said, and I believe the answer says the same thing. Maybe my answer's wording could be better, though... – Soravux Nov 03 '17 at 05:50