I am referring to this post: "loading an array of structs with arrays onto cuda".
In the code from that post, cpuPointArray (an array of structs) is allocated on the CPU with malloc, whereas its struct members (float* c and float* d) are allocated with cudaMalloc.
Can someone please explain why this is done?
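To make the setup concrete, here is a condensed sketch of the allocation pattern described above, as I understand it. It is not the linked code verbatim: the struct name `point`, the `CHECK` macro, the size constants and the function `demo_roundtrip` are names I made up for illustration, and I am assuming the struct has only the two members c and d.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

#define N_POINTS 16  /* number of structs, as in the loop from the post */
#define N_VALS   16  /* floats per member array */

/* Assumed layout: only the members c and d are named in the post. */
typedef struct {
    float *c;
    float *d;
} point;

/* Hypothetical helper: print the CUDA error and bail out (leaks on failure). */
#define CHECK(call)                                               \
    do {                                                          \
        cudaError_t e_ = (call);                                  \
        if (e_ != cudaSuccess) {                                  \
            fprintf(stderr, "%s\n", cudaGetErrorString(e_));      \
            return 1;                                             \
        }                                                         \
    } while (0)

/* Returns 0 if data written to a member array is read back unchanged. */
int demo_roundtrip(void)
{
    size_t bytes = N_VALS * sizeof(float);
    float src[N_VALS], dst[N_VALS];
    point *gpuPointArray = NULL;  /* device copy of the struct array */

    /* The struct array itself lives in host memory (malloc). */
    point *cpuPointArray = (point *)malloc(N_POINTS * sizeof(point));
    if (cpuPointArray == NULL) return 1;

    for (int i = 0; i < N_VALS; i++) src[i] = (float)i;

    /* cudaMalloc stores a device address in each host-resident struct. */
    for (int k = 0; k < N_POINTS; k++) {
        CHECK(cudaMalloc((void **)&cpuPointArray[k].c, bytes));
        CHECK(cudaMalloc((void **)&cpuPointArray[k].d, bytes));
        CHECK(cudaMemcpy(cpuPointArray[k].c, src, bytes, cudaMemcpyHostToDevice));
        CHECK(cudaMemcpy(cpuPointArray[k].d, src, bytes, cudaMemcpyHostToDevice));
    }

    /* Copy the structs (that is, the addresses they hold) to the device. */
    CHECK(cudaMalloc((void **)&gpuPointArray, N_POINTS * sizeof(point)));
    CHECK(cudaMemcpy(gpuPointArray, cpuPointArray, N_POINTS * sizeof(point),
                     cudaMemcpyHostToDevice));

    /* Copy one member array back through the host-side struct,
       the same way the loop from the post does. */
    CHECK(cudaMemcpy(dst, cpuPointArray[3].c, bytes, cudaMemcpyDeviceToHost));

    int mismatch = 0;
    for (int i = 0; i < N_VALS; i++)
        if (dst[i] != src[i]) mismatch = 1;

    /* Release device allocations, then the host array. */
    for (int k = 0; k < N_POINTS; k++) {
        cudaFree(cpuPointArray[k].c);
        cudaFree(cpuPointArray[k].d);
    }
    cudaFree(gpuPointArray);
    free(cpuPointArray);
    return mismatch;
}
```

Calling `demo_roundtrip()` from `main` and compiling the file with nvcc should return 0 on a machine with a working CUDA device, assuming I have reproduced the pattern correctly.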
Also, I don't understand why, in the following loop (from the same post), cpuPointArray is used as the source when copying the data from device to host. When I replaced cpuPointArray with gpuPointArray, I got a segmentation fault. Using cuda-gdb, I found that gpuPointArray.c is NULL. Can someone please explain why it is NULL?
for (int k=0; k<16; k++){
    printf("creating memory on cpu for array c\n");
    outPointArray[k].c = (float*)malloc(16*sizeof(float));
    printf("creating memory on cpu for array d\n");
    outPointArray[k].d = (float*)malloc(16*sizeof(float));
    printf("copying memory values onto cpu array c\n");
    err = cudaMemcpy(outPointArray[k].c, cpuPointArray[k].c, 16*sizeof(float), cudaMemcpyDeviceToHost);
    checkerror(err, "copy array c from gpu to cpu");
    printf("copying memory values onto cpu array d\n");
    err = cudaMemcpy(outPointArray[k].d, cpuPointArray[k].d, 16*sizeof(float), cudaMemcpyDeviceToHost);
    checkerror(err, "copy array d from gpu to cpu");
    printf("bottom of loop %d\n", k);
}