Your question is a generalised version of this one. There are at least three possible solutions.
With most MPI implementations it is possible to start multiple executables, each with its own environment (context), as part of the same MPI job. This is called the MPMD (Multiple Programs Multiple Data) or MIMD (Multiple Instructions Multiple Data) model. The syntax usually involves : (colon) as a separator:
$ mpiexec <global parameters>
-n n1 <local parameters> executable_1 <args1> :
-n n2 <local parameters> executable_2 <args2> :
...
-n nk <local parameters> executable_k <argsk>
It launches n1 ranks running executable_1 with command-line arguments <args1>, n2 ranks running executable_2 with command-line arguments <args2>, and so on. In total n1 + n2 + ... + nk processes are started and ranks are assigned linearly:
Ranks (from .. to) | Executable
====================|=============
0 .. n1-1 | executable_1
n1 .. n1+n2-1 | executable_2
n1+n2 .. n1+n2+n3-1 | executable_3
... | ...
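To make the linear rank assignment concrete, here is a minimal sketch of a helper that prints the rank range of each block. The function name rank_ranges is hypothetical and not part of any MPI distribution; it only reproduces the arithmetic of the table above.

```shell
#!/bin/bash
# Hypothetical helper: given the -n values of an MPMD launch in order,
# print the "first..last" rank range assigned to each block.
rank_ranges() {
    local first=0 n
    for n in "$@"; do
        # A block of n ranks occupies first .. first+n-1
        echo "$first..$((first + n - 1))"
        first=$((first + n))
    done
}
```

For instance, calling rank_ranges 5 4 6 prints one range per block for a launch with three blocks of 5, 4 and 6 ranks.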
As a narrower case, the same executable could be specified k times in order to get k different contexts with the same executable. <local parameters> could include setting the values of specific environment variables, e.g. in your case that could be OMP_NUM_THREADS. The exact method to specify the environment differs from one implementation to another. With Open MPI, one would do:
mpiexec --hostfile all_hosts \
-n 5 -x OMP_NUM_THREADS=2 myprog : \
-n 4 -x OMP_NUM_THREADS=4 myprog : \
-n 6 -x OMP_NUM_THREADS=1 myprog
That will start 15 MPI ranks on the hosts specified in all_hosts (a global parameter), with the first five using two OpenMP threads, the next four using four OpenMP threads, and the last six running sequentially. With MPICH-based implementations the command would be slightly different:
mpiexec --hostfile all_hosts \
-n 5 -env OMP_NUM_THREADS 2 myprog : \
-n 4 -env OMP_NUM_THREADS 4 myprog : \
-n 6 -env OMP_NUM_THREADS 1 myprog
Although widely supported, the previous method is a bit inflexible. What if one would like, e.g., every rank except each 10th one to run sequentially? Then the command line becomes:
mpiexec ... \
-n 9 -x OMP_NUM_THREADS=1 myprog : \
-n 1 -x OMP_NUM_THREADS=N myprog : \
-n 9 -x OMP_NUM_THREADS=1 myprog : \
-n 1 -x OMP_NUM_THREADS=N myprog : \
...
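If one still wants to stay with the MPMD syntax, the repetitive argument list can be generated in a loop. The following is only a sketch: the function name mpmd_args is hypothetical, it emits the Open MPI form (-x), and it assumes the program name contains no whitespace because the output is meant to be word-split by the shell.

```shell
#!/bin/bash
# Hypothetical helper: print Open MPI MPMD arguments for <blocks> groups of
# 9 sequential ranks followed by 1 rank with <threads> OpenMP threads.
# Assumption: <prog> contains no whitespace (the output is word-split).
mpmd_args() {
    local blocks=$1 threads=$2 prog=$3
    local sep="" i=0
    while [ "$i" -lt "$blocks" ]; do
        printf '%s-n 9 -x OMP_NUM_THREADS=1 %s : -n 1 -x OMP_NUM_THREADS=%s %s' \
            "$sep" "$prog" "$threads" "$prog"
        sep=" : "   # colon separator between consecutive groups
        i=$((i + 1))
    done
    printf '\n'
}
```

It would be used with an intentionally unquoted command substitution, e.g. mpiexec --hostfile all_hosts $(mpmd_args 3 4 myprog), which requests 30 ranks in total.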
A more convenient solution would be to provide a wrapper that sets OMP_NUM_THREADS based on the process rank. For example, such a wrapper for Open MPI looks like:
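#!/bin/bash
# Every 10th rank (9, 19, 29, ...) gets N threads; all others run sequentially
if [ $(( ($OMPI_COMM_WORLD_RANK + 1) % 10 )) -eq 0 ]; then
    export OMP_NUM_THREADS=N
else
    export OMP_NUM_THREADS=1
fi
# "$@" keeps each argument as a separate word ("$*" would merge them into one)
exec "$@"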
and is used simply as:
mpiexec -n M ... mywrapper.sh myprog <args>
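The wrapper above relies on OMPI_COMM_WORLD_RANK, which only Open MPI sets. A possible variant that also works with other launchers is sketched below. It assumes the launcher exports one of OMPI_COMM_WORLD_RANK, PMIX_RANK, PMI_RANK or SLURM_PROCID, which should be verified for the MPI library and scheduler in use. The function names and the WRAPPER_THREADS variable (standing in for N, defaulting to 4) are hypothetical.

```shell
#!/bin/bash
# Print the rank of this process, trying several launcher-specific variables.
# Assumption: at least one of them is exported by the launcher in use.
detect_rank() {
    local var val
    for var in OMPI_COMM_WORLD_RANK PMIX_RANK PMI_RANK SLURM_PROCID; do
        eval "val=\${$var:-}"   # indirect lookup of the variable named in $var
        if [ -n "$val" ]; then
            echo "$val"
            return 0
        fi
    done
    return 1
}

# Print the thread count for rank $1: every 10th rank (9, 19, ...) gets
# $2 threads, all other ranks run sequentially.
threads_for_rank() {
    local rank=$1 many=$2
    if [ $(( (rank + 1) % 10 )) -eq 0 ]; then
        echo "$many"
    else
        echo 1
    fi
}

# When invoked with a command, set OMP_NUM_THREADS and replace the shell.
if [ "$#" -gt 0 ]; then
    rank=$(detect_rank) || { echo "cannot determine MPI rank" >&2; exit 1; }
    # WRAPPER_THREADS is a hypothetical knob standing in for N
    OMP_NUM_THREADS=$(threads_for_rank "$rank" "${WRAPPER_THREADS:-4}")
    export OMP_NUM_THREADS
    exec "$@"
fi
```

Keeping the rank-to-threads mapping in its own function makes it easy to swap in a different policy without touching the launcher detection.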
The third and least flexible option is to simply call omp_set_num_threads() from within the program after MPI initialisation but before any parallel regions and set a different number of threads based on the rank:
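use mpi
use omp_lib

integer :: provided, rank, ierr

! Request funneled thread support: only the main thread makes MPI calls
call MPI_INIT_THREAD(MPI_THREAD_FUNNELED, provided, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

! Ranks 0, 10, 20, ... get N threads; all other ranks run sequentially
if (mod(rank, 10) == 0) then
    call omp_set_num_threads(N)
else
    call omp_set_num_threads(1)
end if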
No matter which solution is chosen, process and thread binding becomes a bit tricky, since the ranks no longer all need the same number of cores, and should probably be switched off altogether.