Why can MPI (by which I mean Open MPI throughout this post) programs not be executed like any other program, and instead have to be executed using mpirun?

In other words, why does MPI not simply provide headers/packages/... that you can import, and then let you be master in your own house: use MPI when and where you want in your source code, and compile your own executables with parallel processing built in?

I'm really a novice, but it seems to me that the -np argument passed to mpirun could easily be fixed in the source code, prompted for by the program itself, read in from a configuration file, simply set to use all available cores (whose number will be determined by a surrounding scheduler script anyway), or ... (Of course, you can argue that there is a certain convenience in having mpirun do this automatically in some sense, but that hardly justifies, in my view, taking away the coder's ability to write his own executable.)

For example, I really have little experience, but in Python you can do multiprocessing by simply calling functions of the multiprocessing module and then running your script like any other. Of course, MPI provides more than Python's multiprocessing, but if MPI has to start a background service, for example, I still don't understand why it can't do so automatically when MPI functions are called in the source.
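
To illustrate, here is a trivial sketch of what I mean (the worker function is just a placeholder I made up): you run this with a plain `python script.py`, and the script itself decides how many processes to use.

```python
import multiprocessing

def square(x):          # placeholder work function, purely for illustration
    return x * x

if __name__ == "__main__":
    # the script itself chooses the number of workers;
    # no external launcher like mpirun is involved
    with multiprocessing.Pool(processes=4) as pool:
        print(pool.map(square, range(10)))
```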

For another possibly stupid example: CUDA programs do not require a cudarun, and for good reason. If they did, and you used both CUDA and MPI in parts of your program, you would now have to execute cudarun mpirun ./foo (or possibly mpirun cudarun ./foo), and if every package worked like this, you would soon need a degree in computer science just to execute a program.

All of this is maybe not super important, as you can simply ship each of your MPI executables with a corresponding wrapper script, but this is kind of annoying, and I would still be interested in why this design choice was made.

Bananach
  • So let's say you want to run on two nodes `A` and `B`, and you simply run `a.out -np 2 --host A,B` on node `A`: how should the second MPI task be spawned on node `B`? – Gilles Gouaillardet Oct 24 '17 at 08:24
  • You are running `a.out` (with your command line arguments) on the same machine/node as you would run `mpirun a.out`, so they should have the same possibilities to create tasks and processes, right? It might require some messy work within `a.out` to do this, but this work is currently done in `mpirun` anyway, so why can it not be encapsulated in a call to some `MPI` library function within `a.out`? – Bananach Oct 24 '17 at 08:33
  • @GillesGouaillardet (that comment was for you, let me notify you about it) – Bananach Oct 24 '17 at 09:01
  • When and how should `a.out` on node `A` spawn `a.out` on node `B`? – Gilles Gouaillardet Oct 24 '17 at 13:57

1 Answer

You can spin up processes however you like; you'll just need some channel to send port information between processes, and a command line arg works. I've had to spin up processes manually, but it's far easier and less painful to use a preconstructed communicator. If you have a good reason, though, you can do it.

I have a question into which I edited a minimal complete example. The key calls are MPI_Open_port, MPI_Comm_accept, MPI_Comm_connect, and MPI_Intercomm_merge. Connecting processes have to be merged in one at a time. If you want to go this route, be sure you have a good idea of the difference between an inter- and an intracommunicator. Here's the example for you: Trying to start another process and join it via MPI but getting access violation
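
To make the sequence concrete, here's a rough sketch using the mpi4py bindings (the calls map one-to-one onto the C functions named above). Both processes are started as plain `python` processes, and the port string travels over the command line, as suggested above. Whether two independently launched processes can actually reach each other this way depends on the MPI implementation and how its runtime is configured, so treat this as an illustration of the call sequence rather than a guaranteed recipe.

```python
import sys
from mpi4py import MPI   # importing this initializes MPI, even without mpirun

if len(sys.argv) == 1:
    # "server" side: open a port and wait for the other process to connect
    port = MPI.Open_port()                   # C: MPI_Open_port
    print("start the second process with: python %s '%s'" % (sys.argv[0], port))
    intercomm = MPI.COMM_SELF.Accept(port)   # C: MPI_Comm_accept (blocks)
else:
    # "client" side: the port string arrives as a command line argument
    intercomm = MPI.COMM_SELF.Connect(sys.argv[1])   # C: MPI_Comm_connect

# turn the intercommunicator into an ordinary intracommunicator;
# with more processes you would repeat accept/connect/merge, folding in
# one connecting process at a time
comm = intercomm.Merge(high=(len(sys.argv) > 1))     # C: MPI_Intercomm_merge
print("rank %d of %d" % (comm.Get_rank(), comm.Get_size()))
```

Note the asymmetry: one side accepts while the other connects, and the `high` flag of the merge determines the rank ordering in the merged communicator.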

Carbon