Modifying device memory on CUDA only once

Question

I'm new to CUDA. I allocated device memory for a float variable, then accumulated many computed values into it in a kernel function. Now I want to perform a single math operation on this variable. Do I have to copy it back to the host to do that?

If you want to perform some maths on the host, then yes, you'll need the relevant data on the host. — Oliver Charlesworth, Dec 25 '13 at 23:27
I want to modify it once and then perform another parallel computation on the device. I don't like the idea of cudaMemcpy'ing from device to host and back many times. — Tomasz Posłuszny, Dec 25 '13 at 23:34
Not exactly sure what you're asking, but you can run any number of kernels on the same values in device memory without copying them back to the host. — Roger Dahl, Dec 26 '13 at 00:01
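The point in the last comment can be illustrated with a minimal sketch. It assumes the one-off operation is a simple scaling; the kernel names (`accumulate`, `scale_once`, `fill`) and the helper `run_pipeline` are hypothetical, and error checking is omitted for brevity. The scalar stays in device memory throughout, and the single operation runs in a one-thread kernel between the two parallel kernels.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical first stage: every thread adds 1.0f to the scalar.
__global__ void accumulate(float *sum, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(sum, 1.0f);
}

// The one-off math operation, executed by a single device thread.
__global__ void scale_once(float *sum, float factor) {
    *sum *= factor;
}

// Hypothetical second stage: every thread reads the modified scalar.
__global__ void fill(float *out, const float *sum, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = *sum;
}

float run_pipeline() {
    const int n = 1024;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *d_sum = nullptr, *d_out = nullptr;
    cudaMalloc(&d_sum, sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemset(d_sum, 0, sizeof(float));  // all-zero bytes represent 0.0f

    // Kernels launched on the same stream run in order,
    // so no host synchronization is needed between them.
    accumulate<<<blocks, threads>>>(d_sum, n);
    scale_once<<<1, 1>>>(d_sum, 0.5f);
    fill<<<blocks, threads>>>(d_out, d_sum, n);

    // Copy a single element back only to inspect the result.
    float h_last = 0.0f;
    cudaMemcpy(&h_last, d_out + (n - 1), sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_sum);
    cudaFree(d_out);
    return h_last;
}

int main() {
    printf("last element: %f\n", run_pipeline());
    return 0;
}
```

Launching a kernel with one thread is wasteful in isolation, but for a single scalar it is typically cheaper than a round trip through `cudaMemcpy`.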

score 2 · Accepted Answer · edited May 23 '17 at 10:32

If you have only a single variable or a small amount of data, you might want to consider zero-copy memory: variables that live in pinned (page-locked) host memory and are also accessible from the device.

When the device accesses these variables, transactions are generated across PCIe to supply the values to the device and to write updates back to the host.

So this isn't really eliminating the copies, as you can see. But it may be of interest for your application, if only a small amount of data is involved.
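A minimal sketch of the idea, under stated assumptions: the kernels `write_total` and `fill` and the helper `run_zero_copy` are hypothetical stand-ins for the real computation, and error checking is omitted. The scalar lives in mapped pinned host memory, the device writes it, the host modifies it once without any `cudaMemcpy`, and a second kernel reads the updated value.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical first stage: one thread computes a total and stores it
// in the mapped (zero-copy) variable.
__global__ void write_total(float *total, int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) acc += 1.0f;
    *total = acc;
}

// Hypothetical second stage: every thread reads the scalar that the
// host modified in between.
__global__ void fill(float *out, const float *total, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = *total;
}

float run_zero_copy() {
    const int n = 1024;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Older CUDA versions require this before the context is created;
    // the return value is ignored here.
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Pinned host allocation that is mapped into the device address space.
    float *h_total = nullptr, *d_total = nullptr;
    cudaHostAlloc((void **)&h_total, sizeof(float), cudaHostAllocMapped);
    cudaHostGetDevicePointer((void **)&d_total, h_total, 0);
    *h_total = 0.0f;

    float *d_out = nullptr;
    cudaMalloc(&d_out, n * sizeof(float));

    write_total<<<1, 1>>>(d_total, n);
    cudaDeviceSynchronize();   // make the device's write visible to the host

    *h_total *= 0.5f;          // the one-off operation, done on the host

    fill<<<blocks, threads>>>(d_out, d_total, n);

    // Copy a single element back only to inspect the result.
    float h_last = 0.0f;
    cudaMemcpy(&h_last, d_out + (n - 1), sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    cudaFreeHost(h_total);
    return h_last;
}

int main() {
    printf("last element: %f\n", run_zero_copy());
    return 0;
}
```

The `cudaDeviceSynchronize()` call matters: without it, the host could read the variable before the first kernel has finished writing it.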

The simple Zero Copy CUDA sample outlines the method.

My answer here also gives a simple example of using zero copy memory for a single variable on both the device and the host.
