My cluster uses MVAPICH2 over InfiniBand FDR, and I am considering using RDMA for my simulations. I am aware of the MPI_Put and MPI_Get calls for explicitly invoking RDMA operations, but I would like to know whether they are the only way to use RDMA within MPI.

My current implementation uses channel semantics (send/receive) for communication, along with MPI_Reduce and MPI_Gatherv. I know that MVAPICH2 has configuration parameters that can be used to enable RDMA. If an MPI program uses send/receive calls and RDMA is enabled, does MPI automatically convert from channel semantics to memory semantics (put/get), or is the explicit use of MPI_Put and MPI_Get the only way to use RDMA in MVAPICH2?

MPI_Send requires a corresponding MPI_Recv; whether the calls are blocking or non-blocking does not matter, since every send must be matched by a receive. RDMA has no such requirement: in MPI it is exposed as MPI_Put (write to remote memory) or MPI_Get (read from remote memory). I am trying to find out whether enabling RDMA while still using sends and receives allows MVAPICH2 to automatically convert the send/receive calls into the appropriate RDMA operations.

alfC
Tokth
  • Welcome, can you add an example? – alfC Mar 10 '18 at 05:58
  • There really is no "example" of this. Either MVAPICH2 will use RDMA for the constructs I am using or it won't, and that is the question. – Tokth Mar 12 '18 at 02:42
  • An example makes it easier to test. – alfC Mar 12 '18 at 02:43
  • There is nothing to "test" here. Either MVAPICH2 will use RDMA in place of send/receive when RDMA is enabled, or it won't unless put/get is used explicitly. – Tokth Mar 13 '18 at 13:50
  • I am not an expert in MPI, but there is no reason to think that RDMA is different from `MPI_Send` and `MPI_Recv`, which may well be how `MPI_Put` and `MPI_Get` are implemented and run (in a dedicated thread) behind the scenes. Maybe Put/Get can exploit some special hardware in a special system, but that's it. I don't know MVAPICH, but Open MPI has the command `ompi_info --all`, from which you can extract a lot of information on how the system is configured. – alfC Mar 13 '18 at 20:42
  • This was a very good question since OpenSHMEM is the only thing that explicitly does gather operations using RDMA. Thanks for asking it. – hoodaticus Aug 16 '18 at 18:07

1 Answer

If MVAPICH2 has been built with the correct options, it will use RDMA for all MPI operations including MPI_Send and MPI_Recv on supported hardware, which includes InfiniBand. So, you do not need to use MPI_Put/Get to take advantage of RDMA-capable hardware. In fact, using MPI_Send/Recv might be faster because they are often better optimized.

MPI libraries use various designs to translate MPI_Send/Recv operations to RDMA semantics. The details can be found in the literature.
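One widely described design splits point-to-point traffic by message size: small messages go through an eager path, where the payload is copied into a pre-registered buffer and shipped immediately, while large messages go through a rendezvous path, where a short handshake is followed by an RDMA transfer directly between the user buffers. The C sketch below is only a toy model of that decision, not MVAPICH2 code. The names `choose_protocol`, `wire_steps`, and `EXAMPLE_EAGER_THRESHOLD`, and the 16 KiB cutoff, are assumptions made up for illustration; real libraries pick and tune their own thresholds.

```c
#include <stddef.h>

/* Toy model only: this is NOT taken from MVAPICH2 or any other MPI library. */
typedef enum { PROTO_EAGER, PROTO_RENDEZVOUS } protocol_t;

/* Hypothetical cutoff chosen for illustration; real libraries make this tunable. */
#define EXAMPLE_EAGER_THRESHOLD ((size_t)16 * 1024)

/* Pick the transfer protocol for a send of nbytes.
 * Messages up to and including the threshold are sent eagerly;
 * anything larger uses the rendezvous handshake. */
protocol_t choose_protocol(size_t nbytes, size_t eager_threshold)
{
    return (nbytes <= eager_threshold) ? PROTO_EAGER : PROTO_RENDEZVOUS;
}

/* Number of network steps in this simplified model.
 * Eager:      1 step  (payload travels with the envelope).
 * Rendezvous: 4 steps (request-to-send, clear-to-send carrying the
 *             receiver's registered buffer address, RDMA write of the
 *             payload, finish notification). This models a write-based
 *             variant; read-based variants have a different step count. */
int wire_steps(protocol_t proto)
{
    switch (proto) {
    case PROTO_EAGER:
        return 1;
    case PROTO_RENDEZVOUS:
        return 4;
    }
    return -1; /* unreachable for valid enum values */
}
```

In this model, `choose_protocol(1024, EXAMPLE_EAGER_THRESHOLD)` selects the eager path and `choose_protocol(1 << 20, EXAMPLE_EAGER_THRESHOLD)` selects rendezvous. The application code calls MPI_Send and MPI_Recv either way; the choice of path, and whether RDMA is used for it, is made inside the library.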

Sourav