
I am working for a simulation software vendor. We are now starting to implement distributed computing with MPI for our software. I don't really understand how we should distribute our MPI-capable software product.

So, MPI is an interface specification, meaning the actual MPI implementation should be replaceable, right? Whoever runs the cluster can provide a very specialized MPI implementation for the hardware/communication layer they use. This makes sense to me.

On the other hand, when I run `ldd mympiapp` I see

    libmpi.so.12 => /home/mpiuser/mpich-3.2-install/lib/libmpi.so.12 (0x00007fae34684000)

It seems that after building, my application is linked against the specific MPI version I built with. We already ship our application in different versions for different OSes. Should we now also add combinations for different MPI implementations? Or should we distribute the MPI shared libraries together with our application? What is expected from the side of users/cluster providers?

I read a lot of web resources, but most of what I find is written from the standpoint that whoever compiles the software also runs it.

thorink
  • [This question](https://stackoverflow.com/questions/38442254/how-to-write-an-mpi-wrapper-for-dynamic-loading) already answers part of your question in the question itself, and the answers contain specific hints and source code for an implementation. Please check out all three answers. If it doesn't answer your question, please clarify. – Zulan Nov 14 '17 at 12:09
  • MPI is an interface specification, but different MPI implementations are not binary compatible. You cannot build against Open MPI and then use MPICH at run time. You cannot even build against Open MPI 1.x and run with 2.x. You either have to distribute a particular prebuilt MPI library with your code, together with a large number of its dependencies, e.g., APIs for various network devices and resource managers, or provide an abstraction layer to be built by the end user. I wouldn't go with the former option. – Hristo Iliev Nov 14 '17 at 15:38
  • @Zulan Thanks for pointing me to [this question](https://stackoverflow.com/questions/38442254/how-to-write-an-mpi-wrapper-for-dynamic-loading)! – thorink Nov 17 '17 at 10:37

2 Answers


There's a reason MPI implementations come with `mpicc`.

High-performance software differs from ordinary software in that performance is absolutely critical. Compiling a single binary for distribution is generally not acceptable, because hardware abstractions are leaky where performance is concerned.

Vendors of large-scale high-performance software typically either distribute a collection of different binaries for various hardware/software combinations, send engineers on-site to compile and tune the software for the customer's system, or, in some cases I've heard of with smaller companies, give the source code to the customer (under very strict contracts).

Three reasons why it needs to be compiled specifically for the customer's system:

  1. So that the correct MPI and OpenMP implementations for the hardware are used,

  2. So that a platform specific compiler can be used to generate the most efficient instructions possible,

  3. So that compile-time algorithm parameters can be tuned for the hardware (processors, memory, and interconnect). The communication pattern your code uses should depend on the interconnect, block sizes should depend on processor cache size, etc.

This tight coupling between the hardware and the compiled binary generally results in long sales cycles for commercial MPI software.
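
As a small illustration of point 1 (and of why the `ldd` output in the question points at one specific MPICH install): the MPI implementation is baked in at build time, and you can ask the linked library to identify itself. A minimal check, using a hypothetical file name `check_mpi.c` and whichever `mpicc` the target system provides:

    /* check_mpi.c -- print which MPI implementation this binary was built against.
     * Build and run with the target system's own wrapper and launcher, e.g.:
     *   mpicc check_mpi.c -o check_mpi
     *   mpirun -n 2 ./check_mpi
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        char version[MPI_MAX_LIBRARY_VERSION_STRING];
        int len, rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Get_library_version(version, &len);  /* e.g. "MPICH Version: 3.2" or "Open MPI v2.x" */
        if (rank == 0)
            printf("Linked against: %s\n", version);
        MPI_Finalize();
        return 0;
    }

Rebuilding the same source with a different implementation's `mpicc` yields a binary linked against a different, ABI-incompatible `libmpi`, which is exactly why a single shipped binary cannot cover them all.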

dlasalle

This issue is similar to that of any other software you want to ship in binary form.

If you want to support multiple platforms and multiple OSes, you have to provide binary packages. That way (where applicable) you can enforce some requirements (e.g., as with RPM dependencies).

You can also ship your binary packed together with libraries compiled for a given platform (and make sure your binary is linked against these bundled libraries, e.g., by setting an rpath).

There is no simple solution here, as you want to support different platforms, different OSes, and (most probably) different compilers. An alternative is to distribute the MPI-dependent part of your code as source and to provide the code you want to "hide" as a shared library. But this is very much case dependent.
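
A minimal sketch of that split, with hypothetical names: the prebuilt, closed part of the product (say `libsolver_core.so`, installed with an rpath pointing at its bundled dependencies) only ever calls a tiny communication API, and the MPI-facing side of that API is shipped as source so the customer can compile it with their own `mpicc`:

    /* solver_comm.h (shipped as source) -- the only communication API the
     * prebuilt core library ever sees; it exposes no MPI types or symbols. */
    void solver_comm_init(int *argc, char ***argv);
    int  solver_comm_rank(void);
    int  solver_comm_size(void);
    void solver_comm_allreduce_sum(const double *in, double *out, int count);
    void solver_comm_finalize(void);

    /* solver_comm_mpi.c (shipped as source) -- thin MPI implementation of the
     * API above; the customer builds it against whatever MPI their cluster uses,
     * for example:
     *   mpicc -shared -fPIC solver_comm_mpi.c -o libsolver_comm.so
     * and links it next to the prebuilt libsolver_core.so. */
    #include <mpi.h>

    void solver_comm_init(int *argc, char ***argv) { MPI_Init(argc, argv); }

    int solver_comm_rank(void) {
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        return rank;
    }

    int solver_comm_size(void) {
        int size;
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        return size;
    }

    void solver_comm_allreduce_sum(const double *in, double *out, int count) {
        MPI_Allreduce(in, out, count, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    }

    void solver_comm_finalize(void) { MPI_Finalize(); }

The prebuilt core never includes `<mpi.h>`, so the same binary works with any MPI implementation; only this small shim has to be rebuilt per system. This is essentially the wrapper approach described in the question Zulan linked in the comments.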

Oo.oO
  • MPI is **very different** from other software dependencies. Distributing MPI as part of your software is not a viable option because 1) MPI isn't just a library; it also comes with launchers and interacts with resource managers, and 2) all of these, and MPI itself, are usually highly configured and tuned for a specific system. On the other hand, wrapping MPI is a viable and established practice, as described in detail in the duplicate question I linked. – Zulan Nov 14 '17 at 19:04
  • Some ISVs are happy to distribute the Intel MPI runtime with their software. Intel MPI (but not only it) can pick the fastest interconnect at runtime in order to provide good performance out of the box. – Gilles Gouaillardet Nov 15 '17 at 00:44