My application performs one-to-one, one-sided communication: every machine actively communicates with all the other machines.
I am observing a performance bottleneck in network bandwidth, and I am considering moving some of the communication to collective calls if that would reduce bandwidth usage.
What if I use MPI collectives instead of one-sided communication calls? Can that reduce the total network bandwidth utilization? I assume it depends on the MPI implementation (I am using Intel MPI over Mellanox InfiniBand).
If InfiniBand's RDMA supports bandwidth-efficient broadcast or multicast functionality, I would expect MPI collectives to benefit from it directly.
The following is part of my current use of one-sided communication, which could be changed to MPI_Bcast by defining sub-groups.
In each process:
    for i in [1, ..., k]:
        MPI_Rget(buf[i], my_rank + i)
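To make the sub-group idea concrete, here is a rough sketch in plain C (no MPI calls) of which ranks would belong to the broadcast group rooted at a given rank. The helper names get_target and bcast_group are hypothetical, and the wrap-around modulo the number of processes is my assumption; the loop above only says my_rank + i.

```c
#include <assert.h>

/* Rank that process my_rank reads from at step i (1 <= i <= k).
 * ASSUMPTION: ranks wrap around modulo nprocs. */
static int get_target(int my_rank, int i, int nprocs) {
    return (my_rank + i) % nprocs;
}

/* Fill members[] with the sub-group in which `root` would act as the
 * broadcast root: root itself, followed by the k ranks that currently
 * read from it (root - 1, ..., root - k, wrapped modulo nprocs).
 * members must have room for k + 1 entries. Returns the group size. */
static int bcast_group(int root, int k, int nprocs, int *members) {
    assert(nprocs > 0 && k >= 0 && k < nprocs);
    members[0] = root;
    for (int i = 1; i <= k; i++) {
        /* Add nprocs before the final modulo so the result is never negative. */
        members[i] = ((root - i) % nprocs + nprocs) % nprocs;
    }
    return k + 1;
}
```

With 4 processes and k = 2, the group rooted at rank 0 would be {0, 3, 2}, because ranks 3 and 2 are the ones whose gets target rank 0. Each such list could then be turned into a communicator and used for one MPI_Bcast per root, replacing the k individual gets that target that root.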
Thanks