CUDA class with multidimensional pointers

Question

I have been struggling with this class implementation now for quite a while and hope someone can help me with it.

class Material_Properties_Class_device 
{

public:
int max_variables;
Logical * table_prop;
Table_Class ** prop_table;
};

The implementation for the pointers looks like this:

Material_Properties_Class **d_material_prop = new Material_Properties_Class* [4];
Logical *table_prop;

for (int k = 1; k <= 3; k++ )
{ 
cutilSafeCall(cudaMalloc((void**)&(d_material_prop[k]),sizeof(Material_Properties_Class)));  
cutilSafeCall(cudaMemcpy(d_material_prop[k], material_prop[k], sizeof(Material_Properties_Class ), cudaMemcpyHostToDevice)); 
}

for( int i = 1; i <= 3; i++ )
{   
cutilSafeCall(cudaMalloc((void**)&(table_prop), sizeof(Logical)));
cudaMemcpy(&(d_material_prop[i]->table_prop), &(table_prop), sizeof(Logical*),cudaMemcpyHostToDevice);
cudaMemcpy(table_prop, material_prop[i]->table_prop, sizeof(Logical),cudaMemcpyHostToDevice);
}

cutilSafeCall(cudaMalloc((void ***)&material_prop_device, (4) * sizeof(Material_Properties_Class *)));  
cutilSafeCall(cudaMemcpy(material_prop_device, d_material_prop, (4) * sizeof(Material_Properties_Class *), cudaMemcpyHostToDevice));

This implementation works, but I can't get it working for the **prop_table member. I assume it must somehow follow the same principle, but I just can't get my head around it.

I have already tried

Table_Class_device **prop_table =  new Table_Class_device*[3];

and inserting another loop inside the second for loop:

for (int k = 1; k <= 3; k++ )
        { 
            cutilSafeCall(cudaMalloc((void**)&(prop_table[k]), sizeof(Table_Class))); 
            cutilSafeCall(cudaMemcpy( prop_table[k], material_prop[i]->prop_table[k], sizeof( Table_Class *), cudaMemcpyHostToDevice)); 
        }

Help would be much appreciated.

score 1 · Answer 1 · edited May 23 '17 at 12:11

This question comes up frequently. Multidimensional pointers are especially challenging.

If possible, it's recommended that you flatten multidimensional pointer usage (**) to single-dimensional pointer usage (*), and as you've seen, even that is somewhat cumbersome.

The single-dimensional case (*) is further described here, although you seem to have already figured it out.

If you really want to handle the 2 dimensional (**) case, look here.

An example implementation for 3 dimensional case (***) is here. ("madness!")

Working with 2 and 3 dimensions this way is quite difficult. Thus the recommendation to flatten.
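To make the flattening recommendation concrete, here is a minimal sketch, assuming a dense rows x cols table of doubles stored row-major in one contiguous buffer. The names `flat_index`, `scale_table`, and `scale_on_device` are hypothetical and only serve the illustration; they are not taken from the question or from any library.

```cuda
#include <cassert>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Row-major index into a flattened rows x cols array (hypothetical helper).
__host__ __device__ inline int flat_index(int row, int col, int cols)
{
    return row * cols + col;
}

// Hypothetical kernel: doubles every element of the flattened table.
__global__ void scale_table(double* table, int rows, int cols)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rows && col < cols)
        table[flat_index(row, col, cols)] *= 2.0;
}

// Copies the flattened host table to the device, runs the kernel, and copies
// the result back. Returns the first CUDA error seen, or cudaSuccess.
cudaError_t scale_on_device(std::vector<double>& host_table, int rows, int cols)
{
    // Assumption: the vector holds exactly rows * cols elements.
    assert(host_table.size() == static_cast<std::size_t>(rows) * cols);

    double* d_table = nullptr;
    std::size_t bytes = host_table.size() * sizeof(double);

    // A single allocation and a single copy replace the per-row bookkeeping.
    cudaError_t err = cudaMalloc((void**)&d_table, bytes);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(d_table, host_table.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(16, 16);
        dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
        scale_table<<<grid, block>>>(d_table, rows, cols);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(host_table.data(), d_table, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_table);
    return err;
}
```

With this layout there is only one device pointer to manage, so no pointer inside device memory ever has to be patched from the host.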

answered May 22 '13 at 15:52

Robert Crovella

Cheers for that, I know it is a pain but I can't flatten it as I have to integrate my code into an existing serial code which uses a lot of these structures for various access purposes. I will post the result for my simple class if I figure it out. – ThatQuantDude May 22 '13 at 16:41

score 1 · Accepted Answer · answered May 22 '13 at 16:55

Some magic. Maybe it'll help.

struct fading_coefficient
{
    double* frequency_array;
    double* temperature_array;
    int frequency_size;
    int temperature_size;
    double** fading_coefficients;
};

struct fading_coefficient* cuda_fading_coefficient;
    // fading_coefficient (the variable) points to a fully populated host struct.
    double* frequency_array = NULL;
    double* temperature_array = NULL;
    double** fading_coefficients = NULL;
    // Host-side staging array for the device row pointers: one entry per temperature row.
    double** fading_coefficients1 = (double **)malloc(fading_coefficient->temperature_size * sizeof(double *));

    // Copy the 1D arrays to the device, then release the host copies.
    cudaMalloc((void**)&frequency_array,fading_coefficient->frequency_size *sizeof(double));
    cudaMemcpy( frequency_array, fading_coefficient->frequency_array, fading_coefficient->frequency_size *sizeof(double), cudaMemcpyHostToDevice );
    free(fading_coefficient->frequency_array);

    cudaMalloc((void**)&temperature_array,fading_coefficient->temperature_size *sizeof(double));
    cudaMemcpy( temperature_array, fading_coefficient->temperature_array, fading_coefficient->temperature_size *sizeof(double), cudaMemcpyHostToDevice );
    free(fading_coefficient->temperature_array);

    // Device array that will hold the row pointers.
    cudaMalloc((void**)&fading_coefficients,fading_coefficient->temperature_size *sizeof(double*));

    for (int i = 0; i < fading_coefficient->temperature_size; i++)
    {
        // Allocate and fill each row on the device; its device address is kept in host memory for now.
        cudaMalloc((void**)&(fading_coefficients1[i]),fading_coefficient->frequency_size *sizeof(double));
        cudaMemcpy( fading_coefficients1[i], fading_coefficient->fading_coefficients[i], fading_coefficient->frequency_size *sizeof(double), cudaMemcpyHostToDevice );
        free(fading_coefficient->fading_coefficients[i]);
    }
    // Copy the table of device row pointers to the device, then drop the staging array.
    cudaMemcpy(fading_coefficients, fading_coefficients1, fading_coefficient->temperature_size *sizeof(double*), cudaMemcpyHostToDevice );
    free(fading_coefficients1);

    // Point the host struct's members at the device buffers...
    fading_coefficient->frequency_array = frequency_array;
    fading_coefficient->temperature_array = temperature_array;
    fading_coefficient->fading_coefficients = fading_coefficients;

    // ...so that a plain copy of the struct carries device pointers with it.
    cudaMalloc((void**)&cuda_fading_coefficient,sizeof(struct fading_coefficient));
    cudaMemcpy( cuda_fading_coefficient, fading_coefficient, sizeof(struct fading_coefficient), cudaMemcpyHostToDevice );
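The same staging idea can be applied to a class with a `Table_Class **` member like the one in the question. The following is only a sketch under stated assumptions: the real `Table_Class` is not shown in the question, so a hypothetical pointer-free stand-in is used; `Material_Properties_Sketch`, `copy_to_device`, `read_scale`, and `fetch_scale` are likewise invented names; indexing is zero-based; and error handling and cleanup are abbreviated. Each element is copied with `sizeof(Table_Class)`, while the pointer table is copied with `sizeof(Table_Class *)` per entry.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in: assumed to contain no pointers of its own.
struct Table_Class { int n; double scale; };

// Hypothetical reduced version of the class from the question.
struct Material_Properties_Sketch {
    int max_variables;
    Table_Class** prop_table;   // array of pointers, one per table
};

// Builds a device copy of `host`. Returns a device pointer, or nullptr on failure.
Material_Properties_Sketch* copy_to_device(const Material_Properties_Sketch& host, int n_tables)
{
    assert(n_tables > 0);

    // 1. Host-side staging array that holds DEVICE addresses of each table.
    std::vector<Table_Class*> staged(n_tables, nullptr);
    for (int k = 0; k < n_tables; ++k) {
        if (cudaMalloc((void**)&staged[k], sizeof(Table_Class)) != cudaSuccess) return nullptr;
        if (cudaMemcpy(staged[k], host.prop_table[k], sizeof(Table_Class),
                       cudaMemcpyHostToDevice) != cudaSuccess) return nullptr;
    }

    // 2. Device array of pointers, filled from the staging array.
    Table_Class** d_tables = nullptr;
    if (cudaMalloc((void**)&d_tables, n_tables * sizeof(Table_Class*)) != cudaSuccess) return nullptr;
    if (cudaMemcpy(d_tables, staged.data(), n_tables * sizeof(Table_Class*),
                   cudaMemcpyHostToDevice) != cudaSuccess) return nullptr;

    // 3. Shallow copy of the object with its pointer member patched to device memory.
    Material_Properties_Sketch patched = host;
    patched.prop_table = d_tables;

    Material_Properties_Sketch* d_obj = nullptr;
    if (cudaMalloc((void**)&d_obj, sizeof(Material_Properties_Sketch)) != cudaSuccess) return nullptr;
    if (cudaMemcpy(d_obj, &patched, sizeof(Material_Properties_Sketch),
                   cudaMemcpyHostToDevice) != cudaSuccess) return nullptr;
    return d_obj;
}

// Reads one field through the double indirection on the device.
__global__ void read_scale(const Material_Properties_Sketch* m, int k, double* out)
{
    *out = m->prop_table[k]->scale;
}

// Host helper that launches read_scale and returns the value it found.
double fetch_scale(const Material_Properties_Sketch* d_obj, int k)
{
    double* d_out = nullptr;
    double result = 0.0;
    cudaMalloc((void**)&d_out, sizeof(double));
    read_scale<<<1, 1>>>(d_obj, k, d_out);
    cudaMemcpy(&result, d_out, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return result;
}
```

Unlike the example above, this variant patches a temporary copy of the object rather than the original, so the host object keeps its host pointers. If `Table_Class` itself held pointers, each of them would presumably need the same staging step before the table is copied.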

Hi T_T, I implemented your example with an object which contains some data, but I couldn't get it working: fading_coefficients didn't contain any data. I still can't get my head around it. Why, for example, do you free the data and then assign it again in the third line from the bottom? Thanks in advance. – ThatQuantDude, May 23 '13 at 22:14
I free the host (RAM) memory at free(fading_coefficient->fading_coefficients[i]);. At fading_coefficient->fading_coefficients = fading_coefficients; I assign the pointer to video memory to fading_coefficient->fading_coefficients. – T_T, May 24 '13 at 04:54
Cheers, I got it working. I will have to change it now to support an array of struct objects. Thanks again, this has cost me a lot of time and nerves so far. – ThatQuantDude, May 24 '13 at 14:55
Here is my extension of T_T's example to a multidimensional array of structs containing ** arrays: http://pastebin.com/G2PKsJtu – ThatQuantDude, May 29 '13 at 12:39
free(fading_coefficient->frequency_array); Why do I need to do that? – erogol, Jun 06 '13 at 13:12
That code is there only to free host memory. In the example, the struct in video memory is assembled from fading_coefficient in RAM. If you create a separate struct for access from CUDA code, you do not need to free the memory here. – T_T, Jun 06 '13 at 13:54
