How to properly malloc struct A with CUDA?
struct B
{
int* pointerToInt;
int arraySize;
};
struct A
{
B* pointerToB;
int arraySize;
};
If we were allocating in host memory, we could do something like this:
struct A* h_A;
h_A = (struct A*)malloc(sizeof(struct A));
h_A->arraySize = 10;
h_A->pointerToB = (struct B*)malloc(10 * sizeof(struct B));
for (int i = 0; i < 10; i++) {
    /* Take a pointer to the element; a by-value copy would discard the writes below. */
    struct B* h_B = &(h_A->pointerToB)[i];
    h_B->arraySize = i + 5;
    h_B->pointerToInt = (int*)malloc((i + 5) * sizeof(int));
}
If we try to do something similar with cudaMalloc:
struct A* d_A;
cudaMalloc(&d_A, sizeof(struct A));
d_A->arraySize = 10; /*** error ***/
cudaMalloc(&(d_A->pointerToB), 10 * sizeof(struct B)); /*** error ***/
...
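we'll get a segmentation fault, because d_A points to device memory and we are dereferencing it in host code. Memory returned by cudaMalloc cannot be read or written from host code with the dereferencing operators (-> or *); the host has to go through runtime calls such as cudaMemcpy instead.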
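A minimal sketch of how the host can still build the nested structure without ever dereferencing a device pointer: stage each struct in host memory, store device addresses in its pointer members, and copy the staged struct over with cudaMemcpy. The names allocDeviceA, sumInnerSizes, sumInnerSizesOnDevice and the CHECK macro are hypothetical helpers introduced here for illustration, not part of the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

struct B { int* pointerToInt; int arraySize; };
struct A { B* pointerToB; int arraySize; };

// Hypothetical helper macro: abort on any CUDA runtime error.
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Builds the nested structure in device memory and returns a device pointer.
// The host only stores and copies device addresses; it never dereferences them.
A* allocDeviceA(int n) {
    // Host staging array; each pointerToInt holds a *device* address.
    B* h_B = (B*)malloc(n * sizeof(B));
    if (h_B == NULL) exit(EXIT_FAILURE);
    for (int i = 0; i < n; i++) {
        h_B[i].arraySize = i + 5;
        CHECK(cudaMalloc(&h_B[i].pointerToInt, (i + 5) * sizeof(int)));
    }

    // Host staging copy of A; pointerToB holds a device address.
    A h_A;
    h_A.arraySize = n;
    CHECK(cudaMalloc(&h_A.pointerToB, n * sizeof(B)));
    CHECK(cudaMemcpy(h_A.pointerToB, h_B, n * sizeof(B), cudaMemcpyHostToDevice));
    free(h_B);

    // Finally place A itself in device memory.
    A* d_A;
    CHECK(cudaMalloc(&d_A, sizeof(A)));
    CHECK(cudaMemcpy(d_A, &h_A, sizeof(A), cudaMemcpyHostToDevice));
    return d_A;
}

// Device code may dereference freely: sums the inner array sizes.
__global__ void sumInnerSizes(const A* a, int* out) {
    int total = 0;
    for (int i = 0; i < a->arraySize; i++) total += a->pointerToB[i].arraySize;
    *out = total;
}

// Launches the kernel and copies the single int result back to the host.
int sumInnerSizesOnDevice(const A* d_A) {
    int* d_out;
    int h_out = 0;
    CHECK(cudaMalloc(&d_out, sizeof(int)));
    sumInnerSizes<<<1, 1>>>(d_A, d_out);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost));
    CHECK(cudaFree(d_out));
    return h_out;
}

int main() {
    A* d_A = allocDeviceA(10);
    printf("%d\n", sumInnerSizesOnDevice(d_A));  // 5 + 6 + ... + 14 = 95
    // Cleanup of the nested allocations is omitted to keep the sketch short.
    return 0;
}
```

Freeing this structure needs the same staging in reverse: copy A and the B array back to the host to recover the device addresses, then call cudaFree on each one.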
One possible solution is to allocate the device memory for struct B inside your device code. You can call malloc and free in device code to allocate and release device memory dynamically. See section B.20, "Dynamic Global Memory Allocation and Operations", of the CUDA Programming Guide.
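A rough sketch of the device-side approach, assuming a single thread builds the whole structure. The kernel names buildA and destroyA and the host wrapper buildAndCount are hypothetical names chosen for this example. Only the top-level struct A comes from cudaMalloc; everything beneath it comes from the device heap.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

struct B { int* pointerToInt; int arraySize; };
struct A { B* pointerToB; int arraySize; };

// Single-thread kernel: allocates the nested arrays from the device heap.
__global__ void buildA(A* a, int n) {
    a->arraySize = 0;
    a->pointerToB = (B*)malloc(n * sizeof(B));
    if (a->pointerToB == NULL) return;  // device heap exhausted
    a->arraySize = n;
    for (int i = 0; i < n; i++) {
        B* b = &a->pointerToB[i];
        b->pointerToInt = (int*)malloc((i + 5) * sizeof(int));
        b->arraySize = (b->pointerToInt != NULL) ? i + 5 : 0;
    }
}

// Memory obtained from device-side malloc is released with device-side free.
__global__ void destroyA(A* a) {
    for (int i = 0; i < a->arraySize; i++) free(a->pointerToB[i].pointerToInt);
    free(a->pointerToB);
    a->pointerToB = NULL;
    a->arraySize = 0;
}

// Builds A on the device, reads back its arraySize, then tears it down.
// Returns -1 on a CUDA error.
int buildAndCount(int n) {
    A* d_A;
    A h_A;
    if (cudaMalloc(&d_A, sizeof(A)) != cudaSuccess) return -1;
    buildA<<<1, 1>>>(d_A, n);
    // d_A itself came from cudaMalloc, so copying the top-level struct is valid.
    // The pointers inside it refer to the device heap and are only usable in kernels.
    cudaError_t err = cudaMemcpy(&h_A, d_A, sizeof(A), cudaMemcpyDeviceToHost);
    destroyA<<<1, 1>>>(d_A);
    cudaDeviceSynchronize();
    cudaFree(d_A);
    return (err == cudaSuccess) ? h_A.arraySize : -1;
}

int main() {
    printf("%d\n", buildAndCount(10));
    return 0;
}
```

Two caveats apply to this approach. Memory allocated with device-side malloc cannot be passed to cudaFree or cudaMemcpy, so its contents have to be read and released by kernels. The device heap also has a fixed size (8 MB by default), which can be raised with cudaDeviceSetLimit(cudaLimitMallocHeapSize, ...) before the first kernel that uses malloc is launched.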
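Flattening your 2D array into a 1D array might be a better solution, since one contiguous buffer can be allocated with a single cudaMalloc and copied with a single cudaMemcpy.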
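One way the flattening could look, as a sketch under assumptions: all inner arrays are stored back to back in one buffer, and a second array of offsets records where each row starts. The FlatA struct, the fillRows kernel and the fillAndSum wrapper are hypothetical names introduced for this example, and the row lengths (i + 5) mirror the host code in the question.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical flattened layout: row i occupies data[offsets[i] .. offsets[i + 1]).
struct FlatA {
    int* data;      // all rows back to back (device memory)
    int* offsets;   // numRows + 1 entries (device memory)
    int numRows;
};

// One thread per row: writes the row index into every element of that row.
__global__ void fillRows(FlatA a) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= a.numRows) return;
    for (int k = a.offsets[row]; k < a.offsets[row + 1]; k++) a.data[k] = row;
}

// Builds the flat arrays for rows of length i + 5, fills them on the device,
// and returns the sum of all elements, or -1 on a CUDA error.
// Assumes numRows > 0.
long long fillAndSum(int numRows) {
    // Prefix sums of the row lengths give each row's starting offset.
    std::vector<int> offsets(numRows + 1, 0);
    for (int i = 0; i < numRows; i++) offsets[i + 1] = offsets[i] + (i + 5);
    int total = offsets[numRows];

    FlatA a;
    a.numRows = numRows;
    a.data = NULL;
    a.offsets = NULL;
    if (cudaMalloc(&a.data, total * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&a.offsets, offsets.size() * sizeof(int)) != cudaSuccess) {
        cudaFree(a.data);
        return -1;
    }
    cudaMemcpy(a.offsets, offsets.data(), offsets.size() * sizeof(int),
               cudaMemcpyHostToDevice);

    int threads = 32;
    int blocks = (numRows + threads - 1) / threads;
    fillRows<<<blocks, threads>>>(a);  // the small struct is passed by value

    // One copy brings every row back to the host.
    std::vector<int> host(total);
    cudaError_t err = cudaMemcpy(host.data(), a.data, total * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(a.data);
    cudaFree(a.offsets);
    if (err != cudaSuccess) return -1;

    long long sum = 0;
    for (int k = 0; k < total; k++) sum += host[k];
    return sum;
}

int main() {
    printf("%lld\n", fillAndSum(10));
    return 0;
}
```

With this layout the number of allocations no longer grows with the number of rows, and because FlatA holds only two device pointers and a count, it can be passed to the kernel by value instead of being copied to the device separately.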