Use OpenMP only when an argument is passed to the program

Question

Is there a good way to use OpenMP to parallelize a for-loop, only if an -omp argument is passed to the program?

This does not seem possible: #pragma omp parallel for is a directive that is handled at compile time, whereas whether the argument was passed is only known at run time.

At the moment I am using a very ugly solution to achieve this, which leads to an enormous duplication of code.

if(ompDefined) {
#pragma omp parallel for
  for(...)
    ...
}
else {
  for(...)
    ...
}

This https://stackoverflow.com/questions/4085595/conditional-pragma-omp may be of interest. — High Performance Mark, May 20 '18 at 09:02
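One run-time approach that avoids duplicating the loop is the `if` clause of the OpenMP `parallel` construct. The following is a minimal sketch, not a drop-in fix: `sum_mod10` and `use_omp` are hypothetical names, and `use_omp` is assumed to be set by whatever code parses the `-omp` argument.

```c
/* Sums i % 10 for i in [0, n). The loop body is written only once; the
 * OpenMP `if` clause decides at run time whether a team of threads is used.
 * use_omp is a hypothetical flag, e.g. set when "-omp" was passed. */
double sum_mod10(int n, int use_omp) {
  double sum = 0;
  (void)use_omp; /* silences unused-parameter warnings in a non-OpenMP build */
  /* The pragma is ignored when the file is compiled without -fopenmp. */
  #pragma omp parallel for reduction(+:sum) if(use_omp)
  for (int i = 0; i < n; i++) sum += i % 10;
  return sum;
}
```

When `use_omp` is zero the region runs on a single thread, but the code still goes through the OpenMP runtime, so timings may differ from a build made without `-fopenmp`.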

Z boson · Accepted Answer · 2018-05-25T08:32:14.213

I think what you are looking for can be solved using a CPU dispatcher technique.

For benchmarking OpenMP code vs. non-OpenMP code you can create different object files from the same source code like this

//foo.c
#ifdef _OPENMP
double foo_omp() {
#else
double foo() {
#endif
  double sum = 0;
  #pragma omp parallel for reduction(+:sum)
  for(int i=0; i<1000000000; i++) sum += i%10;
  return sum;
}

Compile like this

gcc -O3 -c foo.c
gcc -O3 -fopenmp -c foo.c -o foo_omp.o

This creates two object files foo.o and foo_omp.o. Then you can call one of these functions like this

//bar.c
#include <stdio.h>

double foo();
double foo_omp();
double (*fp)();

int main(int argc, char *argv[]) {
  if(argc>1) {
    fp = foo_omp;
  }
  else {
    fp = foo;
  }
  double sum = fp();
  printf("sum %e\n", sum);
}

Compile and link like this

gcc -O3 -fopenmp bar.c foo.o foo_omp.o

Then I time the code like this

time ./a.out -omp
time ./a.out

and the first case takes about 0.4 s and the second case about 1.2 s on my system with 4 cores/8 hardware threads.
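The example above switches on `argc>1`, so any argument at all selects the OpenMP version. A slightly stricter sketch compares each argument against the literal `-omp`. The names `select_foo`, `foo_serial` and `foo_parallel` are hypothetical; the two stand-in functions only take the place of `foo` and `foo_omp` so that the block compiles on its own.

```c
#include <string.h>

/* Stand-ins for the foo() and foo_omp() variants built from foo.c. */
static double foo_serial(void)   { return 1.0; }
static double foo_parallel(void) { return 2.0; }

typedef double (*foo_fn)(void);

/* Returns the parallel variant only if some argument is exactly "-omp";
 * otherwise returns the serial variant. */
foo_fn select_foo(int argc, char *argv[]) {
  for (int i = 1; i < argc; i++) {
    if (strcmp(argv[i], "-omp") == 0) return foo_parallel;
  }
  return foo_serial;
}
```

In `main` this would be used as `fp = select_foo(argc, argv);` followed by `fp()`.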

Here is a solution which only needs a single source file

#include <stdio.h>

typedef double foo_type();

foo_type foo, foo_omp, *fp;

#ifdef _OPENMP
#define FUNCNAME foo_omp
#else
#define FUNCNAME foo
#endif

double FUNCNAME () {
  double sum = 0;
  #pragma omp parallel for reduction(+:sum)
  for(int i=0; i<1000000000; i++) sum += i%10;
  return sum;
}

#ifdef _OPENMP
int main(int argc, char *argv[]) {
  if(argc>1) {
    fp = foo_omp;
  }
  else {
    fp = foo;
  }
  double sum = fp();
  printf("sum %e\n", sum);
}
#endif

Compile like this

gcc -O3 -c foo.c
gcc -O3 -fopenmp foo.c foo.o

andypea · Answer 2 · 2018-05-20T21:50:13.807


You can set the number of threads at run-time by calling omp_set_num_threads:

#ifdef _OPENMP
#include <omp.h>
#endif

int main() 
{
    int threads = 1;

    #ifdef _OPENMP
    omp_set_num_threads(threads);
    #endif

    #pragma omp parallel for
    for(...) 
    {
        ...
    }
}

This isn't quite the same as disabling OpenMP, but it will stop it running calculations in parallel. I've found it's always a good idea to set this using a command line switch (you can implement this using GNU getopt or Boost.ProgramOptions). This allows you to easily run single-threaded and multi-threaded tests on the same code.

As Vladimir F pointed out in the comments, you can also set the number of threads by setting the environment variable OMP_NUM_THREADS before executing your program:

gcc -Wall -Werror -pedantic -O3 -fopenmp -o test test.c 
export OMP_NUM_THREADS=1
./test
unset OMP_NUM_THREADS

Finally, you can disable OpenMP at compile-time by not providing GCC with the -fopenmp option. However, you will need to put preprocessor guards around any lines in your code that require OpenMP to be enabled (see above). If you want to use some functions included in the OpenMP library without actually enabling the OpenMP pragmas you can simply link against the OpenMP library by replacing the -fopenmp option with -lgomp.
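As a rough illustration of such guards, the sketch below wraps two OpenMP library calls so that the same source builds with or without `-fopenmp`. The wrapper names `max_threads` and `wall_seconds` are hypothetical, and the serial fallback timer assumes a C11 library that provides `timespec_get`.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Maximum number of threads a parallel region may use; 1 in a serial build. */
int max_threads(void) {
#ifdef _OPENMP
  return omp_get_max_threads();
#else
  return 1;
#endif
}

/* Wall-clock time in seconds; uses C11 timespec_get when OpenMP is absent. */
double wall_seconds(void) {
#ifdef _OPENMP
  return omp_get_wtime();
#else
  struct timespec ts;
  timespec_get(&ts, TIME_UTC);
  return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
#endif
}
```

The rest of the program then calls only the wrappers, so no other line needs its own guard.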

edited May 20 '18 at 21:50

answered May 20 '18 at 01:47


Thanks, this is a way better solution than I currently have. Nonetheless, I am doing some benchmarks, and there is quite a noticeable speed difference between running the code using one thread with OpenMP and running the code without OpenMP. Unfortunately this will give me slightly wrong results. – freeDom- May 20 '18 at 02:01

Can't you just not provide the OpenMP switch to the compiler? It's been a while since I used OpenMP, but I think GCC only enables OpenMP if you pass it the -fopenmp option. – andypea May 20 '18 at 02:10
Unfortunately GCC is throwing a bunch of errors, if you don't provide -fopenmp for every use of #pragma omp. – freeDom- May 20 '18 at 02:23
Are you sure they are errors and not just warnings? – andypea May 20 '18 at 02:28
Yes, at least with my Version of GCC and omp it's not compiling then. – freeDom- May 20 '18 at 02:32
btw: Thanks for telling me about getopt! For now I was doing all this stuff myself, including all the error handling... – freeDom- May 20 '18 at 02:35

I think you need to guard the header and `omp_set_num_threads(threads);` with `_OPENMP` preprocessor macro. – jww May 20 '18 at 03:09
Ah yes, it's working with guards everywhere! Thanks a lot! – freeDom- May 20 '18 at 03:21
The only tricky case I run into is the question of how to use omp_get_wtime when OpenMP is disabled by removing -fopenmp. The omp timer is usually the best portable one for Windows. I've seen the MPI timer used in apps without MPI. – tim18 May 20 '18 at 12:15

Much simpler option is to replace `-fopenmp` with `-lgomp` when you don't want the OpenMP pragmas to be used, but you still want the OpenMP library functions to be recognised. This method is a way of having an equivalent of stub OpenMP compiler switch. – Gilles May 20 '18 at 12:51

OMP_NUM_THREADS is simpler, does not need anything in your code at all. – Vladimir F Героям слава May 20 '18 at 17:14

Annoth · Answer 3 · 2018-05-20T17:10:53.310

One solution would be to use the preprocessor to ignore the pragma statement if you do not pass an additional flag to the compiler.

For example in your code you might have:

#ifdef MP_ENABLED
#pragma omp parallel for
#endif
for(...)
  ...

and then when you compile you can pass a flag to the compiler to define the MP_ENABLED macro. In the case of GCC (and Clang) you would pass -DMP_ENABLED.

You might then compile with GCC as

gcc SOME_SOURCE.c -I SOME_INCLUDE.h -fopenmp -DMP_ENABLED -o SOME_OUTPUT

Then, when you want to disable the parallelism, you can make a minor tweak to the compile command by dropping -DMP_ENABLED:

gcc SOME_SOURCE.c -I SOME_INCLUDE.h -fopenmp -o SOME_OUTPUT

This causes the macro to be undefined which leads to the preprocessor ignoring the pragma.

You could also use a similar solution using ifndef instead depending on whether you consider the parallel behavior the default or not.

Edit: As noted in the comments, compiling with OpenMP enabled (for example with -fopenmp) defines the macro _OPENMP, which you could use in place of your own user-defined macro. That looks to be a superior solution, but the difference in effort is reasonably small.
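A minimal sketch of the `_OPENMP` variant might look like this. The function names `openmp_version` and `count_up` are hypothetical and the loop body is a placeholder for the real work.

```c
/* Returns the OpenMP version date (yyyymm) reported by the compiler,
 * or 0 when the file was compiled without OpenMP support. */
long openmp_version(void) {
#ifdef _OPENMP
  return _OPENMP;
#else
  return 0;
#endif
}

/* A loop whose pragma is guarded by _OPENMP instead of a user macro. */
long count_up(int n) {
  long total = 0;
#ifdef _OPENMP
  #pragma omp parallel for reduction(+:total)
#endif
  for (int i = 0; i < n; i++) total += 1;
  return total;
}
```

With this guard, adding or dropping `-fopenmp` is the only change needed on the compile line, and `openmp_version()` can be printed to confirm which build is running.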

I tried something similar, but didn't think of using the compiler/Makefile to pass the flag... This seems like a good option to me! — freeDom-, May 20 '18 at 02:57
Why would you add a requirement for the user to `-DMP_ENABLED` when `-fopenmp` defines `_OPENMP` under conforming compilers? — jww, May 20 '18 at 03:07
Good point about _OPENMP. Not sure if OP is using a conforming compiler. — Annoth, May 20 '18 at 15:03
