Malloc structure of array of structure of array in CUDA

Question

How do I properly allocate struct A in device memory with CUDA?

struct B
{
    int* pointerToInt;
    int arraySize;
};
struct A
{
    B* pointerToB;
    int arraySize;
};

[This answer](https://stackoverflow.com/a/31135377/1231073) describes a way to allocate `structs` on device. — sgarizvi, Feb 01 '18 at 06:46

nglee · Answer 1 · 2018-02-01T06:50:27.107

If we were allocating in host memory, we could do something like this:

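struct A* h_A;
h_A = (struct A*)malloc(sizeof(struct A));   /* cast is required when compiling as C++ (nvcc) */
h_A->arraySize = 10;
h_A->pointerToB = (struct B*)malloc(10 * sizeof(struct B));
for (int i = 0; i < 10; i++) {
    /* take a pointer to the element; a by-value copy would discard the writes below */
    struct B* h_B = &(h_A->pointerToB)[i];
    h_B->arraySize = i + 5;
    h_B->pointerToInt = (int*)malloc((i + 5) * sizeof(int));
}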

If we try to do the same thing with cudaMalloc:

struct A* d_A;
cudaMalloc(&d_A, sizeof(struct A));
d_A->arraySize = 10;                                       /*** error ***/
cudaMalloc(&(d_A->pointerToB), 10 * sizeof(struct B));     /*** error ***/
...

we'll get a segmentation fault, because d_A points to device memory and we are dereferencing it in host code. Device memory cannot be accessed from host code with the dereference operator.
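A common way around this from the host side is to fill in ordinary host copies of the structs whose pointer members hold device addresses, and then cudaMemcpy those copies to the device. The following is only a sketch of that idea, not the only way to do it: the helper names allocateNestedOnDevice, totalInts and countInts are made up for illustration, and error checking is left out.

```cuda
#include <cuda_runtime.h>

struct B { int* pointerToInt; int arraySize; };
struct A { B* pointerToB; int arraySize; };

// Hypothetical helper: builds the nested structure in device memory and
// returns a device pointer to A. Error checking is omitted for brevity.
A* allocateNestedOnDevice(int numB, int baseSize)
{
    // Host-side staging array; the pointer members will hold DEVICE addresses.
    B* h_B = new B[numB];
    for (int i = 0; i < numB; i++) {
        h_B[i].arraySize = i + baseSize;
        // &h_B[i].pointerToInt is a host address, so this is legal on the host.
        cudaMalloc(&h_B[i].pointerToInt, h_B[i].arraySize * sizeof(int));
    }

    // Host-side staging copy of A, pointing at a device array of B.
    A h_A;
    h_A.arraySize = numB;
    cudaMalloc(&h_A.pointerToB, numB * sizeof(B));
    cudaMemcpy(h_A.pointerToB, h_B, numB * sizeof(B), cudaMemcpyHostToDevice);

    // Finally copy the staged A itself to the device.
    A* d_A;
    cudaMalloc(&d_A, sizeof(A));
    cudaMemcpy(d_A, &h_A, sizeof(A), cudaMemcpyHostToDevice);

    delete[] h_B;
    return d_A;
}

// Device code may dereference every level of the structure.
__global__ void totalInts(const A* a, int* out)
{
    int total = 0;
    for (int i = 0; i < a->arraySize; i++)
        total += a->pointerToB[i].arraySize;
    *out = total;
}

// Hypothetical host wrapper: returns the summed sizes of all inner arrays.
int countInts(const A* d_A)
{
    int* d_out;
    int h_out = -1;
    cudaMalloc(&d_out, sizeof(int));
    totalInts<<<1, 1>>>(d_A, d_out);
    // cudaMemcpy waits for the kernel on the default stream to finish.
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

Note that the sketch never frees the nested allocations. To do so you would have to copy A and the array of B back to the host first, so that the device pointers are available for cudaFree.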

One possible solution is to allocate the device memory for struct B inside your device code. You can call malloc and free in device code to allocate and release device memory dynamically. See section B.20, "Dynamic Global Memory Allocation and Operations", of the CUDA Programming Guide.
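As a rough sketch of what in-kernel allocation could look like for the structs in the question (the kernel names buildA and destroyA and the host function buildOnDevice are my own, and error handling is minimal):

```cuda
#include <cuda_runtime.h>

struct B { int* pointerToInt; int arraySize; };
struct A { B* pointerToB; int arraySize; };

// Meant to be launched with a single thread: fills in *a using in-kernel malloc.
__global__ void buildA(A* a, int numB, int baseSize)
{
    a->arraySize = numB;
    a->pointerToB = (B*)malloc(numB * sizeof(B));
    if (a->pointerToB == NULL) {          // device heap exhausted
        a->arraySize = 0;
        return;
    }
    for (int i = 0; i < numB; i++) {
        B* b = &a->pointerToB[i];
        b->arraySize = i + baseSize;
        b->pointerToInt = (int*)malloc(b->arraySize * sizeof(int));
        if (b->pointerToInt == NULL)
            b->arraySize = 0;
    }
}

// Memory from in-kernel malloc has to be released with in-kernel free.
__global__ void destroyA(A* a)
{
    for (int i = 0; i < a->arraySize; i++)
        free(a->pointerToB[i].pointerToInt);
    free(a->pointerToB);
}

// Hypothetical host driver: returns the arraySize stored in A, or -1 on error.
int buildOnDevice(int numB, int baseSize)
{
    A* d_A;
    cudaMalloc(&d_A, sizeof(A));          // A itself is still allocated from the host
    buildA<<<1, 1>>>(d_A, numB, baseSize);

    // d_A came from cudaMalloc, so copying the top-level struct back is fine.
    // h_A.pointerToB is a device-heap address and must not be dereferenced here.
    A h_A;
    cudaError_t err = cudaMemcpy(&h_A, d_A, sizeof(A), cudaMemcpyDeviceToHost);

    destroyA<<<1, 1>>>(d_A);
    cudaDeviceSynchronize();
    cudaFree(d_A);
    return (err == cudaSuccess) ? h_A.arraySize : -1;
}
```

The device heap used by in-kernel malloc has a fixed size (8 MB by default). If you need more, it can be enlarged with cudaDeviceSetLimit(cudaLimitMallocHeapSize, ...) before the first kernel that uses malloc is launched. Pointers obtained this way cannot be passed to cudaFree.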

Flattening your 2D array into a 1D array might be a better solution.
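One way to flatten, sketched below under my own assumptions, is to keep all the ints in a single device buffer and store an offsets array that says where each sub-array starts. The FlatA struct and the functions allocateFlat, fill and fillAndReadLast are hypothetical names, not part of the question.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical flattened layout: sub-array i occupies
// data[offsets[i]] .. data[offsets[i + 1] - 1].
struct FlatA {
    int* data;       // device pointer, all ints back to back
    int* offsets;    // device pointer, numArrays + 1 entries
    int  numArrays;
};

FlatA allocateFlat(int numArrays, int baseSize)
{
    // Prefix sums of the sub-array sizes, computed on the host.
    std::vector<int> h_offsets(numArrays + 1, 0);
    for (int i = 0; i < numArrays; i++)
        h_offsets[i + 1] = h_offsets[i] + (i + baseSize);

    FlatA f;
    f.numArrays = numArrays;
    // Two cudaMalloc calls in total, regardless of the number of sub-arrays.
    cudaMalloc(&f.data, h_offsets[numArrays] * sizeof(int));
    cudaMalloc(&f.offsets, (numArrays + 1) * sizeof(int));
    cudaMemcpy(f.offsets, h_offsets.data(),
               (numArrays + 1) * sizeof(int), cudaMemcpyHostToDevice);
    return f;
}

// One thread per sub-array: writes the sub-array index into each of its elements.
__global__ void fill(FlatA f)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= f.numArrays) return;
    for (int k = f.offsets[i]; k < f.offsets[i + 1]; k++)
        f.data[k] = i;
}

// Hypothetical host driver: fills the buffer and returns its last element.
int fillAndReadLast(int numArrays, int baseSize)
{
    FlatA f = allocateFlat(numArrays, baseSize);
    int threads = 128;
    int blocks = (numArrays + threads - 1) / threads;
    fill<<<blocks, threads>>>(f);         // FlatA is passed by value

    int total = 0, last = -1;
    cudaMemcpy(&total, f.offsets + numArrays, sizeof(int), cudaMemcpyDeviceToHost);
    cudaMemcpy(&last, f.data + total - 1, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(f.data);
    cudaFree(f.offsets);
    return last;
}
```

Because the struct only contains two device pointers and a count, it can be passed to a kernel by value, and cleanup is two cudaFree calls.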
