If I would like to parallelize my programme effectively over more than 17
threads, do I have to recode my programme as an MPI programme?
Yes, you would need to write some MPI code in order to exploit the nodes at your disposal.
OpenMP targets shared-memory architectures and you need a message passing library in order to communicate between the nodes.
Parallelizing for a distributed architecture is different (you cannot simply parallelize a for loop as in OpenMP): each node has its own memory, and no node can see the state of the others in order to synchronize the work. You have to code that exchange explicitly yourself.
I would like to ask how difficult it is to convert an OpenMP programme
into an MPI programme?
The MPI parallelization can be quite straightforward, depending on your application and the way you wrote your code. You should detail your algorithms so that can be judged. The broad categories are:
- Embarrassingly parallel problem with a static work load: every MPI process has the same amount of work and does the same job, with no or very few interactions with the other processes. If your application falls into this category, the parallelization is straightforward and can be done with collective MPI routines. Still, you will need to write some code and understand how MPI works.
- More complex parallel problem / dynamic work load: your problem needs synchronization and communication between processes, and/or the amount of work is unknown in advance so you need a load-balancing strategy. This is what HPC folks do for a living :)
I hope your problem falls into the first category!
In the end, the fun starts here: to get a good speedup you will need to find compromises and experiment, since you will end up with a hybrid OpenMP/MPI parallelization.