I need a clarification about the code given in the answer to the topic: CUDA cudaMemcpy Struct of Arrays.

Both dev_s and dev_arr1 (likewise dev_arr2 and dev_arr3) are allocated on the device. Why is the flag cudaMemcpyHostToDevice used? It should be cudaMemcpyDeviceToDevice.

The code is reported below.

// NOTE: Binding pointers with dev_s
cudaMemcpy(&(dev_s->arr1), &dev_arr1, sizeof(dev_s->arr1), cudaMemcpyHostToDevice);
cudaMemcpy(&(dev_s->arr2), &dev_arr2, sizeof(dev_s->arr2), cudaMemcpyHostToDevice);
cudaMemcpy(&(dev_s->arr3), &dev_arr3, sizeof(dev_s->arr3), cudaMemcpyHostToDevice);
talonmies
horus
  • Is it C++? Please tag accordingly – Vadim Kotov Apr 05 '18 at 09:15
  • Yes, but it is the same for C. See https://devblogs.nvidia.com/unified-memory-in-cuda-6/. The function launch contains this line: cudaMemcpy(&(d_elem->name), &d_name, sizeof(char*), cudaMemcpyHostToDevice). But both d_elem and d_name are allocated on the device with cudaMalloc. – horus Apr 05 '18 at 10:12

1 Answer

Both dev_s and dev_arr1 (likewise dev_arr2 and dev_arr3) are allocated on the device.

Correct.

Why is the flag cudaMemcpyHostToDevice used? It should be cudaMemcpyDeviceToDevice.

Incorrect.

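That code is copying the pointer values of dev_arr1, dev_arr2, and dev_arr3 from the host to the device. The addresses themselves are addresses in GPU memory, but the address values are stored in host memory, not device memory: dev_arr1 is an ordinary host variable, so the source argument &dev_arr1 is a host address, while the destination &(dev_s->arr1) is a device address.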
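To make the distinction concrete, here is a minimal sketch of the same binding step with a single array. The struct name StructOfArrays, the kernel readFirst, and the helper bindAndVerify are hypothetical names made up for this illustration; they are not from the original code. It assumes a CUDA-capable device is available.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical struct, reduced to one member for illustration.
struct StructOfArrays {
    int *arr1;
};

// Reads through the pointer stored inside the device-side struct.
__global__ void readFirst(const StructOfArrays *s, int *out) {
    *out = s->arr1[0];
}

// Returns true if the kernel sees the value reachable through dev_s->arr1.
bool bindAndVerify() {
    // These three variables live in HOST memory; the values they will hold
    // are addresses in DEVICE memory.
    StructOfArrays *dev_s = nullptr;
    int *dev_arr1 = nullptr;
    int *dev_out = nullptr;

    const int value = 42;
    int result = 0;
    bool ok = true;

    ok = ok && cudaMalloc((void **)&dev_s, sizeof(StructOfArrays)) == cudaSuccess;
    ok = ok && cudaMalloc((void **)&dev_arr1, sizeof(int)) == cudaSuccess;
    ok = ok && cudaMalloc((void **)&dev_out, sizeof(int)) == cudaSuccess;

    // Array contents: host variable `value` -> device buffer.
    ok = ok && cudaMemcpy(dev_arr1, &value, sizeof(int),
                          cudaMemcpyHostToDevice) == cudaSuccess;

    // Binding step: the source &dev_arr1 is the address of a host variable,
    // the destination &(dev_s->arr1) is an address inside the device struct.
    // What gets copied is the pointer value itself, so the direction is
    // host to device.
    ok = ok && cudaMemcpy(&(dev_s->arr1), &dev_arr1, sizeof(dev_s->arr1),
                          cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        readFirst<<<1, 1>>>(dev_s, dev_out);
        ok = cudaGetLastError() == cudaSuccess;
    }

    // Blocking copy back to the host; it completes after the kernel finishes.
    ok = ok && cudaMemcpy(&result, dev_out, sizeof(int),
                          cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(dev_out);
    cudaFree(dev_arr1);
    cudaFree(dev_s);

    return ok && result == value;
}

int main() {
    const bool ok = bindAndVerify();
    std::printf("%s\n", ok ? "binding verified" : "binding failed");
    return ok ? 0 : 1;
}
```

A cudaMemcpyDeviceToDevice copy would be the right choice only if the pointer value were already stored somewhere in device memory, for example inside another device-side struct.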

talonmies