RDMA scatter/gather is a convenient way to consolidate data transfers. For example, the verbs API allows data at multiple local locations to be written to a remote buffer with a SINGLE RDMA write operation; likewise, data in a remote buffer can be read into multiple local locations with a SINGLE RDMA read operation.
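To make the shape of this concrete, here is a minimal sketch in plain C that models the semantics only. It does not call libibverbs: `model_sge`, `model_wr`, `model_rdma_write`, and `model_rdma_read` are hypothetical names made up for illustration, and the "remote" buffer is just ordinary memory. As far as I understand, the real work request (`struct ibv_send_wr`) has the same shape: a list of local elements (`sg_list`, `num_sge`) but only one remote range (`wr.rdma.remote_addr`, `wr.rdma.rkey`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one local scatter/gather element.
 * (The real struct ibv_sge carries addr, length, and lkey.) */
struct model_sge {
    void  *addr;
    size_t length;
};

/* Hypothetical stand-in for a one-sided work request:
 * many local segments, but exactly ONE contiguous remote range. */
struct model_wr {
    struct model_sge *sg_list;     /* local segments */
    int               num_sge;     /* number of local segments */
    unsigned char    *remote_addr; /* models the single remote address */
};

/* Models an RDMA write: gather the local segments, in order, into the
 * single contiguous remote range. Returns the number of bytes moved. */
static size_t model_rdma_write(const struct model_wr *wr)
{
    size_t off = 0;
    assert(wr != NULL && wr->num_sge >= 0);
    for (int i = 0; i < wr->num_sge; i++) {
        memcpy(wr->remote_addr + off, wr->sg_list[i].addr,
               wr->sg_list[i].length);
        off += wr->sg_list[i].length;
    }
    return off;
}

/* Models an RDMA read: scatter the single contiguous remote range, in
 * order, into the local segments. Returns the number of bytes moved. */
static size_t model_rdma_read(const struct model_wr *wr)
{
    size_t off = 0;
    assert(wr != NULL && wr->num_sge >= 0);
    for (int i = 0; i < wr->num_sge; i++) {
        memcpy(wr->sg_list[i].addr, wr->remote_addr + off,
               wr->sg_list[i].length);
        off += wr->sg_list[i].length;
    }
    return off;
}
```

In this model, the list always describes the initiator's memory; the remote side is a single address, so there is nowhere in the request to name a second remote location.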
However, I cannot initiate a single RDMA operation that writes to multiple locations on the remote side (or reads from multiple locations on the remote side). This feature is appealing to us because it would use the wide RDMA lanes efficiently for multiple small writes. I also checked the Intel qsm APIs and the Cray gni APIs. It seems none of them supports such a feature, which we could call "writer-controlled remote scatter". Is there a deep reason this is not supported?