
I am trying to write my own implementation of a parallel reduction that finds the maximum element. This is my first CUDA C program, and I ran into some errors that I do not understand yet.

My code:

__global__
void reduce (box * d_in_data, int insize, box * d_out_data)
{
    /*
    for example:

    array size = 25200 (size variable, keep in d_in_data)
    gridDim = 252
    blockDim = 10
     */

    uint32_t tid = threadIdx.x;

    //__shared__ box sdata[insize / gridDim.x]; // 25200 / 252 = 100

    extern __shared__ box sdata[]; // 25200 / 252 = 100

    /*spt its 'size per thread' for filling sdata*/

    /* (25200 / 252) / 10 */ /* each thread should be copy 10 elements from d_in_data to sdata  */

    uint32_t spt = (insize / gridDim.x) / blockDim.x;

    box (* sh_ptr)[blockDim.x][spt] = (box (*) [blockDim.x][spt]) &sdata;

    for (int i = 0;i < spt;i++){

        (*sh_ptr)[tid][i] = d_in_data[blockIdx.x * (insize / gridDim.x) + (tid * blockDim.x + i)];

    }
    __syncthreads();
    box temp = (*sh_ptr)[tid][0];
    for (int i = 0;i < spt;i++){
        box i_box = (*sh_ptr)[tid][i];

        temp = temp.s < i_box.s ? i_box : temp;
    }

    (*sh_ptr)[tid][0] = temp;
    __syncthreads();
    box (* d_out_ptr)[gridDim.x][blockDim.x] =  (box (*)[gridDim.x][blockDim.x]) d_out_data;
    (*d_out_ptr)[blockIdx.x][tid]  = (*sh_ptr)[tid][0];
    
}

  1. __shared__ box sdata[insize / gridDim.x]; - Why "here cannot be used as a constant"? How do I set the size explicitly?

  2. box (* sh_ptr)[blockDim.x][spt] = (box (*) [blockDim.x][spt]) &sdata; - "error: expression must have a constant value", "note: attempt to access run-time storage", "note: the value of variable "spt"". How should this be written in my case?

  3. uint32_t spt = (insize / gridDim.x) / blockDim.x; - "here cannot be used as a constant"

  4. box (* d_out_ptr)[gridDim.x][blockDim.x] = (box (*)[gridDim.x][blockDim.x]) d_out_data; - "error: expression must have a constant value", "note: attempt to access run-time storage". The same problem as in item 2.

Please tell me what I did wrong in each of these cases.

1 Answer


Your errors have nothing to do with CUDA.

  • __shared__ box sdata[insize / gridDim.x]; - Why "here cannot be used as a constant"? How do I set the size explicitly?
    In C++ (and therefore in CUDA C++), the size of an array declared this way must be a compile-time constant. Here, insize / gridDim.x would have to be known at compile time, but insize is a function argument and gridDim.x is fixed only when the kernel is launched.

    extern __shared__ box sdata[];
    

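    looks completely OK to me. The size of such an array, in bytes, is passed as the third parameter of the <<<...>>> launch configuration. If you need the size inside the kernel (you certainly do), you can compute it at run time from the function arguments, gridDim and blockDim.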

  • box (* sh_ptr)[blockDim.x][spt] = (box (*) [blockDim.x][spt]) &sdata; - "error: expression must have a constant value", "note: attempt to access run-time storage", "note: the value of variable "spt"". How should this be written in my case?
    Here you try to define a pointer to a 2D array. In C++ this is possible only if the array dimensions are compile-time constants, but blockDim.x and spt are known only at run time. To work around this, compute the index into the shared memory (sdata) yourself: it is a 1D array, so treat it as one, e.g. sdata[tid * spt + i] instead of (*sh_ptr)[tid][i].

  • Item 3 seems to be a diagnostic note, not an error of its own: most likely the compiler is pointing at the definition of spt to explain error number 2.

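  • Error number 4 is of the same kind as error number 2: gridDim.x and blockDim.x are not compile-time constants, so index d_out_data as a 1D array as well.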
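
Putting the points above together, a minimal sketch of the kernel with manual 1D indexing could look like the following. Several things here are assumptions, not taken from the question: the definition of box (only a comparable member s is visible in the question), the helper names flat and launch_reduce, and the requirement that insize is an exact multiple of gridDim.x * blockDim.x. The input index is written as tid * spt + i, which coincides with the question's tid * blockDim.x + i only when spt equals blockDim.x, as in the 25200 / 252 / 10 example.

```cuda
#include <cstddef>
#include <cstdint>
#include <cuda_runtime.h>

// Assumption: the question does not show `box`; only a comparable member `s` is used.
struct box { float s; };

// Hypothetical helper: flat index of element (row, col) in a row-major
// array that has `ncols` columns.
__host__ __device__ inline uint32_t flat(uint32_t row, uint32_t col, uint32_t ncols)
{
    return row * ncols + col;
}

__global__ void reduce(const box *d_in_data, int insize, box *d_out_data)
{
    extern __shared__ box sdata[];                  // size is supplied at launch time

    const uint32_t tid       = threadIdx.x;
    const uint32_t per_block = insize / gridDim.x;      // elements handled by this block
    const uint32_t spt       = per_block / blockDim.x;  // elements handled by each thread

    // Replaces (*sh_ptr)[tid][i]: sdata is addressed as a plain 1D array.
    for (uint32_t i = 0; i < spt; i++)
        sdata[flat(tid, i, spt)] = d_in_data[blockIdx.x * per_block + flat(tid, i, spt)];
    __syncthreads();

    // Each thread finds the maximum of its own `spt` elements (assumes spt >= 1).
    box best = sdata[flat(tid, 0, spt)];
    for (uint32_t i = 1; i < spt; i++) {
        const box candidate = sdata[flat(tid, i, spt)];
        if (best.s < candidate.s) best = candidate;
    }

    // Replaces (*d_out_ptr)[blockIdx.x][tid].
    d_out_data[flat(blockIdx.x, tid, blockDim.x)] = best;
}

// Hypothetical host-side helper: the third launch parameter is the size of
// the dynamic shared memory in bytes (insize / blocks elements of type box).
inline cudaError_t launch_reduce(const box *d_in, int insize, box *d_out,
                                 int blocks, int threads)
{
    const size_t shared_bytes = static_cast<size_t>(insize / blocks) * sizeof(box);
    reduce<<<blocks, threads, shared_bytes>>>(d_in, insize, d_out);
    return cudaGetLastError();
}
```

With the numbers from the question, launch_reduce(d_in, 25200, d_out, 252, 10) would request 100 * sizeof(box) bytes of shared memory per block and leave 2520 partial maxima in d_out, which still have to be reduced further.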

For further reading, see, for example, the question "Pointer to Multidimensional Array in C?".
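
If the number of elements per thread can be fixed at compile time, the array-pointer syntax from the question becomes legal again. The sketch below is one possible variant, not the only way to do it. It assumes a definition of box that the question does not show, introduces the hypothetical names reduce_fixed and shared_bytes_for, and makes spt a template parameter SPT so that box (*)[SPT] is a valid type. The reinterpret_cast of the shared buffer to a pointer to rows is a common idiom, but treat it as an assumption to verify with your compiler.

```cuda
#include <cstddef>
#include <cstdint>
#include <cuda_runtime.h>

// Assumption: the question does not show `box`; only a comparable member `s` is used.
struct box { float s; };

// Hypothetical helper: bytes of dynamic shared memory needed per block.
template <uint32_t SPT>
constexpr size_t shared_bytes_for(uint32_t threads)
{
    return static_cast<size_t>(threads) * SPT * sizeof(box);
}

// SPT is a compile-time constant, so a pointer to rows of SPT boxes is allowed.
template <uint32_t SPT>
__global__ void reduce_fixed(const box *d_in_data, box *d_out_data)
{
    extern __shared__ box sdata[];
    box (*rows)[SPT] = reinterpret_cast<box (*)[SPT]>(sdata);  // rows[tid][i]

    const uint32_t tid  = threadIdx.x;
    const uint32_t base = (blockIdx.x * blockDim.x + tid) * SPT;  // first input element of this thread

    for (uint32_t i = 0; i < SPT; i++)
        rows[tid][i] = d_in_data[base + i];
    __syncthreads();

    // Per-thread maximum over its SPT elements.
    box best = rows[tid][0];
    for (uint32_t i = 1; i < SPT; i++)
        if (best.s < rows[tid][i].s) best = rows[tid][i];

    d_out_data[blockIdx.x * blockDim.x + tid] = best;
}
```

For the sizes in the question, a launch might read reduce_fixed<10><<<252, 10, shared_bytes_for<10>(10)>>>(d_in, d_out);. The trade-off is that every distinct SPT value needs its own template instantiation.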

zkoza