
For example, I can use the CUDA atomic operations atomicAdd(ptr, val), atomicCAS(ptr, old, new), ... on a GPU's own global memory (GPU RAM) with CUDA 6.5.

But can I use these atomic operations on a remote GPU's global memory over GPUDirect 2.0 P2P?
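The scenario can be sketched as follows. This is only a minimal illustration, assuming a machine with two P2P-capable GPUs; the device IDs 0 and 1, the kernel name `incrementCounter`, and the `CHECK` macro are placeholders chosen for this sketch. A kernel running on one GPU calls atomicAdd on a counter allocated in the other GPU's global memory. Whether the result stays consistent when other processors touch the same location is exactly what is being asked.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread atomically increments the counter. The pointer may refer to
// memory on the local GPU or, once peer access is enabled, on a peer GPU.
__global__ void incrementCounter(int *counter)
{
    atomicAdd(counter, 1);
}

// Hypothetical helper: report a failed CUDA runtime call and leave main().
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s failed: %s\n", #call,                  \
                    cudaGetErrorString(err_));                         \
            return 1;                                                  \
        }                                                              \
    } while (0)

int main()
{
    const int localDev  = 0;  // assumption: GPU that runs the kernel
    const int remoteDev = 1;  // assumption: GPU that owns the counter

    int deviceCount = 0;
    CHECK(cudaGetDeviceCount(&deviceCount));
    if (deviceCount < 2) {
        printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, localDev, remoteDev));
    if (!canAccess) {
        printf("GPU %d cannot access GPU %d over P2P.\n", localDev, remoteDev);
        return 0;
    }

    // Allocate and zero the counter in the remote GPU's global memory.
    int *remoteCounter = NULL;
    CHECK(cudaSetDevice(remoteDev));
    CHECK(cudaMalloc(&remoteCounter, sizeof(int)));
    CHECK(cudaMemset(remoteCounter, 0, sizeof(int)));
    CHECK(cudaDeviceSynchronize());

    // Let kernels on the local GPU dereference pointers into remote memory.
    CHECK(cudaSetDevice(localDev));
    CHECK(cudaDeviceEnablePeerAccess(remoteDev, 0));

    // Only the local GPU touches the counter in this sketch.
    incrementCounter<<<4, 256>>>(remoteCounter);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    int result = 0;
    CHECK(cudaMemcpy(&result, remoteCounter, sizeof(int), cudaMemcpyDefault));
    printf("counter = %d after launching %d threads\n", result, 4 * 256);

    CHECK(cudaSetDevice(remoteDev));
    CHECK(cudaFree(remoteCounter));
    return 0;
}
```

Here only one GPU ever touches the counter. The open question is what happens when a second GPU or the host modifies the same location at the same time.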

einpoklum
Alex
  • If the GPU that performs the atomic operation is the only processor that accesses the memory location, atomic operations on the remote location are *seen* correctly by that GPU. If other processors also access the location, they are not: there is no guarantee that values stay consistent across multiple processors. – Farzad Jan 18 '15 at 20:16
  • @Farzad Thank you! Is this because P2P works over a non-cache-coherent bus, PCI Express? – Alex Jan 19 '15 at 09:56
  • The main issue here is that, AFAIK, the GPU keeps atomics coherent at its L2 cache, so any transaction from GPU threads to the outside world (GPU global memory, host memory, peer memory, ...) is seen as atomic by that GPU. If the location is modified by anything else, such as another GPU, the L2 cache is not aware of it, so the atomic operation is disrupted. – Farzad Jan 19 '15 at 10:11
  • @Farzad Thank you very much. I was misled by the phrase "Data cached in L2 of the target GPU" on page 10 of "Peer-to-Peer Communication Between GPUs": http://on-demand.gputechconf.com/gtc-express/2011/presentations/cuda_webinars_GPUDirect_uva.pdf – Alex Jan 19 '15 at 10:22

1 Answer


No. GPU atomics are atomic only with respect to the GPU performing the operation. They do not work on host memory or on nonlocal device memory.

I'm sure addressing these limitations on future platforms is a roadmap item for NVIDIA, especially with NVLink.

ArchaeaSoftware
  • There is also optional support for some atomic operations on the PCIe 3 bus; with modern bridges they can be device-to-device. http://www.csit-sun.pub.ro/~cpop/Documentatie_SMP/Standarde_magistrale/PCIexpress/PCIe3_Accelerator-Features_WP.pdf - "Atomic Read-Modify-Write Transactions: ... IO->IO" – osgx Aug 19 '16 at 08:39