Why can MPI (by which I will mean OpenMPI throughout this post) programs not be executed like any other, and instead have to be executed using mpirun?
In other words, why does MPI not simply provide headers/packages/... that you can import and then let you be master in your own house, by letting you use MPI when and where you want in your source code, and allowing you to compile your own executables with parallel processing built in?
I'm really a novice, but for example, I feel like the -np argument passed to mpirun could easily be fixed in the source code, or could be prompted for by the program itself, or could be read in from a configuration file, or could simply be configured to use all available cores, whose number will be determined by a surrounding scheduler script anyway, or ....
(Of course, you can argue that there is a certain convenience in having mpirun do this automatically in some sense, but that hardly justifies, in my view, taking away the coder's ability to write their own executable.)
For example, I really have little experience, but in Python you can do multiprocessing by simply calling functions of the multiprocessing module and then running your script like any other. Of course, MPI provides more than Python's multiprocessing, but if MPI for example has to start a background service, then I still don't understand why it can't do so automatically upon calls of MPI functions in the source.
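To illustrate what I mean by "running your script like any other", here is a minimal sketch using only the standard multiprocessing module. The function names square and parallel_squares are my own placeholders, and I am assuming the default pool size (one worker per available core) is what you want:

```python
import multiprocessing


def square(x):
    """Work performed inside each worker process."""
    return x * x


def parallel_squares(values, processes=None):
    # processes=None lets the pool choose one worker per available core.
    with multiprocessing.Pool(processes=processes) as pool:
        # map() distributes the inputs and returns results in input order.
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # on platforms that start workers by re-importing the script.
    print(parallel_squares(range(8)))
```

Saved as, say, squares.py, this is started with a plain python squares.py; the worker processes are created from inside the program, with no external launcher involved.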
For another possibly stupid example, CUDA programs do not require cudarun. And for a good reason, since if they did, and if you used both CUDA and MPI in parts of your program, you would now have to execute cudarun mpirun ./foo (or possibly mpirun cudarun ./foo) and if every package worked like this, you would soon have to have a degree in computer science to simply execute a program.
All of this is maybe not super important, as you can simply ship each of your MPI executables with a corresponding wrapper script, but this is kind of annoying and I would still be interested in why this design choice was made.
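For concreteness, the kind of wrapper I have in mind could look roughly like the sketch below. The names build_command and main are hypothetical, ./foo is just the placeholder executable from above, and I am assuming that "use all available cores" is a sensible default for -np:

```python
import os
import subprocess


def build_command(executable, num_procs=None):
    """Return the mpirun invocation as an argument list."""
    if num_procs is None:
        # Assumption: default to every core the OS reports.
        num_procs = os.cpu_count() or 1
    return ["mpirun", "-np", str(num_procs), executable]


def main(argv):
    # argv[1] is the MPI executable; ./foo is only a placeholder default.
    executable = argv[1] if len(argv) > 1 else "./foo"
    # Hand off to mpirun and pass its exit status back to the caller.
    return subprocess.call(build_command(executable))
```

A real wrapper would end with sys.exit(main(sys.argv)); I left that out so the helper can be imported and inspected without actually launching anything.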