So I have been stuck on this problem for a while. My struct looks like this:
typedef struct
{
    int _size;
    int _dim[DIMENSIONS];
    float *_data;
} matrix;
My problem is how to cudaMalloc and cudaMemcpy this struct to the device. This is how I'm doing it:
matrix * d_in;
matrix * d_out;
const int THREADS_BYTES = sizeof(int) + sizeof(int)*DIMENSIONS + sizeof(float)*h_A->_size;
cudaMalloc((void **) &d_in, THREADS_BYTES);
cudaMemcpy(d_in, h_A, THREADS_BYTES, cudaMemcpyHostToDevice);
EDIT: this is how I allocated h_A:
matrix A; // = (matrix*)malloc(sizeof(matrix));
A._dim[0] = 40;
A._dim[1] = 60;
A._size = A._dim[0]*A._dim[1];
A._data = (float*)malloc(A._size*sizeof(float));
matrix *h_A = &A;
Here h_A points to the host matrix allocated above. I call my kernel like this:
DeviceComp<<<gridSize, blockSize>>>(d_out, d_in);
However, inside the kernel I cannot read the contents of _data through the struct; only the _dim array and the _size variable are accessible.
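For reference, here is a minimal sketch of the two-step "deep copy" that is commonly used when a struct holds a pointer. A single cudaMemcpy of the struct copies only the pointer value, which is a host address the kernel cannot dereference. The sketch assumes DIMENSIONS is 2 (the post does not give it), uses a hypothetical kernel ScaleByTwo in place of DeviceComp, and introduces hypothetical helpers upload and deep_copy_roundtrip. Most error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define DIMENSIONS 2  // assumption: the real value is not shown in the post

typedef struct
{
    int _size;
    int _dim[DIMENSIONS];
    float *_data;
} matrix;

// Hypothetical kernel standing in for DeviceComp: out = 2 * in, element-wise.
__global__ void ScaleByTwo(matrix *out, const matrix *in)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < in->_size)
        out->_data[i] = 2.0f * in->_data[i];
}

// Hypothetical helper: copies the payload first, then a struct whose _data
// member holds the DEVICE address of that payload. Returns the device struct
// and stores the device payload pointer in *d_data for later copy-back/free.
static matrix *upload(const matrix *h, float **d_data)
{
    size_t bytes = (size_t)h->_size * sizeof(float);
    matrix staging = *h;   // host-side copy we are allowed to patch
    matrix *d = NULL;

    cudaMalloc((void **)d_data, bytes);
    cudaMemcpy(*d_data, h->_data, bytes, cudaMemcpyHostToDevice);
    staging._data = *d_data;   // replace host pointer with device pointer

    cudaMalloc((void **)&d, sizeof(matrix));
    cudaMemcpy(d, &staging, sizeof(matrix), cudaMemcpyHostToDevice);
    return d;
}

// Returns 1 if the kernel could read and write _data through the struct.
int deep_copy_roundtrip(void)
{
    matrix A;
    A._dim[0] = 40;
    A._dim[1] = 60;
    A._size = A._dim[0] * A._dim[1];
    A._data = (float *)malloc(A._size * sizeof(float));
    for (int i = 0; i < A._size; ++i)
        A._data[i] = (float)i;

    float *d_in_data = NULL, *d_out_data = NULL;
    matrix *d_in = upload(&A, &d_in_data);
    matrix *d_out = upload(&A, &d_out_data);  // same shape; kernel overwrites it

    int block = 256;
    int grid = (A._size + block - 1) / block;
    ScaleByTwo<<<grid, block>>>(d_out, d_in);

    // Copy back through the saved payload pointer, not through d_out->_data,
    // because d_out itself lives in device memory.
    float *result = (float *)malloc(A._size * sizeof(float));
    cudaError_t err = cudaMemcpy(result, d_out_data, A._size * sizeof(float),
                                 cudaMemcpyDeviceToHost);

    int ok = (err == cudaSuccess);
    for (int i = 0; ok && i < A._size; ++i)
        if (result[i] != 2.0f * A._data[i])
            ok = 0;

    cudaFree(d_in_data);  cudaFree(d_out_data);
    cudaFree(d_in);       cudaFree(d_out);
    free(result);         free(A._data);
    return ok;
}
```

Calling deep_copy_roundtrip() from main and checking for a return value of 1 exercises the whole path. The key design point is that the host keeps its own copy of each device payload pointer (d_in_data, d_out_data), since the pointer stored inside the device-side struct cannot be read directly from host code.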