I am trying to get more details on the RDMA read and write semantics (especially data placement semantics) and I would like to confirm my understanding with the experts here.
- RDMA Read:
Is the data guaranteed to be visible in the local buffer once the RDMA Read completion is seen in the completion queue? Is the behavior the same if I am using GPUDirect RDMA (GDR) and the local address maps to GPU memory, i.e. is the data immediately visible to the GPU once the RDMA Read completion is seen in the completion queue? If it is not immediately visible, what operation ensures that it is?
- RDMA Write with Immediate (or) RDMA Write + Send:
Can the remote host rely on the data being present in its memory once it has seen the immediate data (or the Send) complete on its receive queue? Does that expectation or behavior change if the Write targets GPU memory (using GDR)?