
The question is inspired by OpenMP with BLAS

The motivation is that I want the Fortran source code to adapt to whichever compiler options select a serial or parallel BLAS. In the Makefile I may specify -mkl=parallel for MKL, or USE_OPENMP=1 for -lopenblas, and I may run make ifort, make gfortran, and so on to switch between the libraries. But,

a) If I use -mkl=parallel in the Makefile, I need to add call mkl_set_num_threads(numthreads) to the source code,

b) If I use OpenBLAS with USE_OPENMP=1, I may need openblas_set_num_threads(num_threads) in the source code https://rdrr.io/github/wrathematics/openblasctl/man/openblas_set_num_threads.html#:~:text=threads%20to%20use.-,Details,t%20simply%20call%20R%27s%20Sys.

c) For the time being, if only -lblas is linked and/or -mkl=sequential is used, I have to thread dgemm manually (a kind of block decomposition), regardless of OMP_NUM_THREADS. That is fine, but if the source code also contains the lines for a) and b), I need some kind of if to steer the code down this path.

The manual dgemm threading in c) works more or less universally. Once I want to exploit the parallel BLAS that a library provides, things seem to get complicated, because I don't know how to switch code paths in the source depending on the compiler options.

Addition: setting OMP_NUM_THREADS in an environment file such as .bashrc is not preferable. (Sorry, I should have mentioned this point earlier.) The source code reads an input file that specifies the number of cores to use, and calls omp_set_num_threads to set that number, rather than taking it from the environment file.

Addition 2: in my tests with MKL, setting OMP_NUM_THREADS cannot replace call mkl_set_num_threads. That is, I have to call mkl_set_num_threads explicitly for it to work with the -mkl=parallel flag.

AlphaF20
  • Is there a reason you can't just set the environment variable `OMP_NUM_THREADS`? If not you'll probably need some form of preprocessing. – Ian Bush Dec 16 '21 at 08:23
  • [this question](https://stackoverflow.com/questions/55219012/determine-variable-from-makefile-in-fortran) has some information on passing variables from a makefile to the Fortran preprocessor. – veryreverie Dec 16 '21 at 08:25
  • Agree with @IanBush: most modern BLAS libraries obey the OpenMP environment variables. It is in general not necessary to set the number of threads in the source. – Victor Eijkhout Dec 16 '21 at 12:32
  • About the only gotcha with the `OMP_NUM_THREADS` approach I've had to deal with is the stupid default most implementations have of using all the cores when the variable is *not* set. In that case you can use `get_environment_variable` to check for `OMP_NUM_THREADS` and, if it isn't there, use `omp_set_num_threads` to set the default number of threads to a sensible value, i.e. 1 - all of which is portable. – Ian Bush Dec 16 '21 at 14:00
  • I will check using only `omp_set_num_threads`, without `mkl_set_num_threads` / `openblas_set_num_threads`. I think I tried this at some stage but don't remember the result. The thing is, for the time being, if I only use `lblas`, seems `lblas` does not support parallel BLAS, and I need to write `DGEMM` by specifying threads and the associated matrix dimensions; for `lopenblas`/`mkl`, I can use the automatically parallel `DGEMM`. Thus, I still need some `ifndef` / `ifdef`. – AlphaF20 Dec 16 '21 at 14:35
  • I really have no idea what you mean. Is it that you have your own openmp threaded BLAS if MKL or OPENBLAS are not available? If that is the case `OMP_NUM_THREADS` will be respected - use *that* as the primary way to set the number of threads you use, only use `omp_set_num_threads` if you are paranoid about the case when `OMP_NUM_THREADS` is not set i.e. most codes never bother about it. In summary if you use the environment variable it is almost always possible to write code that does *not* need preprocessing, irrespective of the BLAS implementation. – Ian Bush Dec 16 '21 at 14:45
  • I agree with Ian, I do not see any need for preprocessing or alternative submodules and for calling all those set_ routines. Not from your description anyway. It may be necessary in some strange circumstances, but normally it is pointless. But your "seems lblas does not support parallel BLAS, and I need to write DGEMM by specifying threads and the associated matrix dimensions" is not very clear. `OMP_NUM_THREADS`, `MKL_NUM_THREADS` and `OPENBLAS_NUM_THREADS` (check the manual for the exact names) should do it in normal circumstances. – Vladimir F Героям слава Dec 17 '21 at 07:50
  • Reading the number of threads from a configuration file is your design decision, but quite a strange one. It is quite inflexible. With environment variables you are not limited to .bashrc; you can change the value freely for every executed command. Just use `export`, or indeed just for a single execution use `OMP_NUM_THREADS=2 ./mycommand`. With config files you are setting traps for your users, or at least making it inconvenient for them. – Vladimir F Героям слава Dec 17 '21 at 07:54
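
The fallback Ian Bush describes in the comments could look roughly like the following. This is a minimal sketch, assuming an OpenMP-enabled build (for example -fopenmp with gfortran or -qopenmp with ifort); the program name default_threads is only illustrative.

```fortran
program default_threads
  use omp_lib, only: omp_set_num_threads, omp_get_max_threads
  implicit none
  integer :: stat

  ! Only the status is requested: it is 0 when the variable exists
  ! and 1 when it does not exist.
  call get_environment_variable("OMP_NUM_THREADS", status=stat)

  ! If the user did not set the variable, fall back to one thread
  ! instead of the usual default of all cores.
  if (stat /= 0) call omp_set_num_threads(1)

  print *, "threads available to OpenMP:", omp_get_max_threads()
end program default_threads
```

Running it as `OMP_NUM_THREADS=2 ./a.out` should leave the user's choice untouched, while running it without the variable should select a single thread.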

1 Answer


There are at least two approaches to this.

Preprocessor variables

As explained in e.g. this question and this question, among others, you can pass variables from a Makefile directly to an appropriate preprocessor.

For example, in the branches of the Makefile where you set -mkl=parallel you could also set -DMKL_PARALLEL. Then, in your source code you could have a block which looks something like

#ifdef MKL_PARALLEL
  call mkl_set_num_threads(numthreads)
#endif

Provided you compile your code with an appropriate preprocessor, this allows you to pass arbitrary information from your Makefile to your source code.
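
A slightly fuller sketch covering the three cases from the question might look like this. MKL_PARALLEL is the macro used above; OPENBLAS_PARALLEL, the routine name set_blas_threads, and the library_threaded flag are hypothetical names chosen for illustration, and the Makefile is assumed to add -DOPENBLAS_PARALLEL in the branch that links a threaded OpenBLAS. The OpenBLAS branch assumes the library exports the C function openblas_set_num_threads(int).

```fortran
! Save with a capital suffix such as .F90 so that gfortran and ifort run
! the preprocessor, or pass -cpp (gfortran) or -fpp (ifort) explicitly.
subroutine set_blas_threads(numthreads, library_threaded)
  use, intrinsic :: iso_c_binding, only: c_int
  implicit none
  integer, intent(in)  :: numthreads
  logical, intent(out) :: library_threaded

#if defined(MKL_PARALLEL)
  ! Threaded MKL: hand the thread count to the MKL service routine.
  call mkl_set_num_threads(numthreads)
  library_threaded = .true.
#elif defined(OPENBLAS_PARALLEL)
  ! Threaded OpenBLAS: call its C function through an explicit interface.
  interface
    subroutine openblas_set_num_threads(n) bind(C, name="openblas_set_num_threads")
      import :: c_int
      integer(c_int), value :: n
    end subroutine openblas_set_num_threads
  end interface
  call openblas_set_num_threads(int(numthreads, c_int))
  library_threaded = .true.
#else
  ! Reference BLAS or sequential MKL: nothing to configure in the library.
  ! The caller is told to use its own block decomposition of dgemm.
  library_threaded = .false.
#endif
end subroutine set_blas_threads
```

The caller can then test library_threaded once, right after reading the thread count from the input file, and choose between a single library dgemm call and the hand-written threaded version.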

Separate files

Instead of using a preprocessor, you can have multiple copies of the same file, each with a different set of options, and only compile the correct file for the project.

A slightly nicer way of doing this is to have one module file, which is always the same regardless of options, and multiple submodules, each of which contains one set of options. This reduces the room for error arising from multiple files, and reduces compilation time if you need to change the options.
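
As a rough sketch of the module and submodule layout, assuming three separate source files and the hypothetical names blas_threads and set_blas_threads: the Makefile always compiles the module and adds exactly one of the submodule files, depending on the selected BLAS.

```fortran
! --- blas_threads.f90: always compiled, identical for every build ---
module blas_threads
  implicit none
  interface
    module subroutine set_blas_threads(numthreads)
      integer, intent(in) :: numthreads
    end subroutine set_blas_threads
  end interface
end module blas_threads

! --- blas_threads_mkl.f90: compiled only in the -mkl=parallel branch ---
submodule (blas_threads) blas_threads_mkl
  implicit none
contains
  module subroutine set_blas_threads(numthreads)
    integer, intent(in) :: numthreads
    call mkl_set_num_threads(numthreads)
  end subroutine set_blas_threads
end submodule blas_threads_mkl

! --- blas_threads_serial.f90: compiled when only a serial BLAS is linked ---
submodule (blas_threads) blas_threads_serial
  implicit none
contains
  module subroutine set_blas_threads(numthreads)
    integer, intent(in) :: numthreads
    ! Nothing to configure: the serial BLAS ignores thread counts.
  end subroutine set_blas_threads
end submodule blas_threads_serial
```

Code that does `use blas_threads` depends only on the module file, so switching between the two submodules should not force the rest of the program to be recompiled. Compiling both submodule files into the same executable would define set_blas_threads twice, so the Makefile has to pick one.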

veryreverie