I have a problem using a pointer-to-pointer in CUDA. The code snippet is below.

char** d_ppcPtr, *d_pcPtr, *h_pcPtr;
cudaMalloc(&d_ppcPtr, sizeof(char*) * 10);

h_pcPtr = (char*)malloc(sizeof(char) * 100);
for(int i = 0; i < 10; i ++)
{
      cudaMalloc(&d_pcPtr, sizeof(char) * 100);
      cudaMemset(d_pcPtr, 1, sizeof(char) * 100);
      cudaMemcpy(&d_ppcPtr[i], &d_pcPtr, sizeof(char*), cudaMemcpyHostToDevice);
      cudaMemcpy(h_pcPtr, d_ppcPtr[i], sizeof(char) * 100, cudaMemcpyDeviceToHost); //crash here
      cudaFree(d_ppcPtr[i]); //crash also here
}
cudaFree(d_ppcPtr);

How can I fix the two crashes above? Thanks in advance.

Wang Wang
  • What kind of crash? Segfault? Error messages? – skrrgwasme Sep 30 '14 at 02:49
  • Access violation reading the address d_ppcPtr[i]. – Wang Wang Sep 30 '14 at 02:53
  • Have you allocated the memory space to which `d_ppcPtr[i]` is supposed to point? – Vitality Sep 30 '14 at 06:46
  • After cudaMalloc(&d_ppcPtr, sizeof(char*) * 10); the slot d_ppcPtr[i] exists, and its value is assigned by the line cudaMemcpy(&d_ppcPtr[i], &d_pcPtr, sizeof(char*), cudaMemcpyHostToDevice); Apart from watching it and freeing it, this pointer works well. – Wang Wang Sep 30 '14 at 06:51
  • Please refer to this link: http://stackoverflow.com/questions/15113960/cuda-allocating-array-of-pointers-to-images-and-the-images – Wang Wang Sep 30 '14 at 06:59
  • The pointers used as arguments of `cudaMemcpy` should physically reside on the host, while `d_ppcPtr[i]` resides on the device. – Vitality Sep 30 '14 at 07:24
  • @JackOLantern, have you seen the link I posted? There, Robert Crovella's answer does the same as my code, and d_ppcPtr[i] works fine in device code but not in host code. That is what I want to understand. – Wang Wang Sep 30 '14 at 13:12
  • There is a difference between your code and that by Robert Crovella in the answer to the post you have linked to. There, `temp[i]` is a pointer that resides on the host and that points to a device memory space. Here, `d_ppcPtr[i]` resides on the device. – Vitality Sep 30 '14 at 14:03
  • What corresponds to temp[i] here? It's just d_pcPtr, not d_ppcPtr[i]. – Wang Wang Sep 30 '14 at 14:15
  • I can't enter the chat room because of my low reputation. Sorry, Jack. – Wang Wang Sep 30 '14 at 14:25

1 Answer

The following modification will "fix" your code (fully worked example, including host and device verification):

$ cat t583.cu
#include <stdio.h>
#include <stdlib.h>   // malloc
#include <string.h>   // memset

__global__ void testkernel(char **data, unsigned n){
  for (int i = 0; i < 100; i++) if (data[n][i] != 1) printf("kernel error\n");
}

int main(){
  char** d_ppcPtr, *d_pcPtr, *h_pcPtr;
  cudaMalloc(&d_ppcPtr, sizeof(char*) * 10);

  h_pcPtr = (char*)malloc(sizeof(char) * 100);
  for(int i = 0; i < 10; i ++)
  {
      cudaMalloc(&d_pcPtr, sizeof(char) * 100);
      cudaMemset(d_pcPtr, 1, sizeof(char) * 100);
      cudaMemcpy(&d_ppcPtr[i], &d_pcPtr, sizeof(char*), cudaMemcpyHostToDevice);
      memset(h_pcPtr, 0, sizeof(char)*100);
      testkernel<<<1,1>>>(d_ppcPtr, i);
      cudaMemcpy(h_pcPtr, d_pcPtr, sizeof(char) * 100, cudaMemcpyDeviceToHost);
      cudaFree(d_pcPtr);
      for (int i = 0; i < 100; i++) if (h_pcPtr[i] != 1) printf("Error!");
  }
  cudaFree(d_ppcPtr);
}
$ nvcc -arch=sm_20 -o t583 t583.cu
$ cuda-memcheck ./t583
========= CUDA-MEMCHECK
========= ERROR SUMMARY: 0 errors

Note that conceptually there is no difference between my code and yours. The pointer value you are trying to read from d_ppcPtr[i] is stored in device memory, so dereferencing d_ppcPtr[i] in host code crashes. That same pointer value is already held in d_pcPtr, which is a host variable, so you can pass d_pcPtr to cudaMemcpy and cudaFree instead.
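
If the buffers need to outlive the loop iteration that allocates them, one option is to keep a host-side copy of every device pointer. The following is only a minimal sketch under that assumption, not part of the tested example above: the names `h_mirror`, `d_table`, `N_BUFS` and `BUF_BYTES` are made up for illustration, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N_BUFS    10   // hypothetical: number of device buffers
#define BUF_BYTES 100  // hypothetical: size of each buffer in bytes

int main() {
  char **d_table = NULL;      // device array that will hold the device pointers
  char  *h_mirror[N_BUFS];    // host array holding the same device pointers
  char  *h_buf = (char *)malloc(BUF_BYTES);

  cudaMalloc(&d_table, sizeof(char *) * N_BUFS);

  // Allocate each buffer; the pointer values land in host memory (h_mirror).
  for (int i = 0; i < N_BUFS; i++) {
    cudaMalloc(&h_mirror[i], BUF_BYTES);
    cudaMemset(h_mirror[i], 1, BUF_BYTES);
  }

  // Copy the whole pointer table to the device in one transfer,
  // so that kernels can index d_table[i].
  cudaMemcpy(d_table, h_mirror, sizeof(char *) * N_BUFS, cudaMemcpyHostToDevice);

  // Host code only ever reads the pointers from h_mirror, never from d_table[i].
  int mismatches = 0;
  for (int i = 0; i < N_BUFS; i++) {
    cudaMemcpy(h_buf, h_mirror[i], BUF_BYTES, cudaMemcpyDeviceToHost);
    for (int j = 0; j < BUF_BYTES; j++)
      if (h_buf[j] != 1) mismatches++;
    cudaFree(h_mirror[i]);
  }

  cudaFree(d_table);
  free(h_buf);
  printf("mismatches: %d\n", mismatches);
  return 0;
}
```

The design choice here is that the device table is used only by device code, while the host keeps its own mirror for `cudaMemcpy` and `cudaFree`.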

Robert Crovella
  • Thank you so much, Robert Crovella. Explicit answer! – Wang Wang Oct 01 '14 at 07:50
  • Shouldn't `cudaMemcpy(&d_ppcPtr[i], &d_pcPtr, sizeof(char*), cudaMemcpyHostToDevice);` use `cudaMemcpyDeviceToDevice`? Both d_ppcPtr and d_pcPtr were `cudaMalloc`ed, which means they're both on the device. Or am I missing something? – johnny_be Dec 01 '17 at 00:37
  • Yes, you are missing something. `d_pcPtr` was indeed `cudaMalloc`ed, but the transfer here is not from `d_pcPtr` but from `&d_pcPtr`, which I guarantee you is a host pointer. If the transfer direction were wrong, `cuda-memcheck` would also have reported an error. If this is not clear, let's not try to sort it out here in the comments. Feel free to experiment with it yourself. This is a deep-copy operation, which you may not be grasping. – Robert Crovella Dec 01 '17 at 01:15