Global function not recognized by CUDA C

Question

I have a very complicated program, and I have simplified it to make my problem easy to understand. I have two source files and one header: time_analysis.cu, DSMC_kernel_float.cu and DSMC_kernel_float.h.

Here is time_analysis.cu:

#include <cstdlib>
#include <cstdio>
#include <algorithm>
#include <math.h>
#include <cutil.h>
#include <stdio.h>
#include <assert.h>
#include <memory.h>
#include <string.h>
#include <time.h>
#include <cuda_gl_interop.h>
#include <cutil_math.h>
#include "math_constants.h"
#include "vector_types.h"
#include "vector_functions.h"

typedef struct {
int seme;
} iniran;

typedef struct{
int jp1;
int jp2;
float kx;
float ky;
float kz;

} stato_struct;

stato_struct* coll_CPU=0;
stato_struct* coll2dev=0;
stato_struct* coll_GPU=0;

#include "DSMC_kernel_float.h"

//==============================================================
int main(void){
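int N_thread = 5;   // elements 0..4 are used below, so five are needed
int ind;

// Allocate both host buffers before touching them
coll_CPU=(stato_struct*) calloc(N_thread, sizeof(stato_struct));
coll2dev=(stato_struct*) malloc(N_thread*sizeof(stato_struct));

coll_CPU[0].jp1= 0;
coll_CPU[1].jp2= 1;
coll_CPU[2].kx= 2;
coll_CPU[3].ky= 3;
coll_CPU[4].kz= 4;

// Copy only the N_thread elements that were allocated
for(ind=0;ind<N_thread;ind++){
    coll2dev[ind]=coll_CPU[ind];
}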

CUDA_SAFE_CALL(cudaMalloc((void**)&coll_GPU, N_thread*sizeof(stato_struct)));
CUDA_SAFE_CALL(cudaMemcpy(coll_GPU,coll2dev,N_thread*sizeof(stato_struct), cudaMemcpyHostToDevice));

CollisioniGPU<<<4,N_thread>>>(coll_GPU);
CUT_CHECK_ERROR("Esecuzione kernel fallita");

CUDA_SAFE_CALL(cudaMemcpy(coll2dev, coll_GPU, N_thread*sizeof(stato_struct),cudaMemcpyDeviceToHost));

free(coll2dev);
CUDA_SAFE_CALL(cudaFree(coll_GPU));

free(coll_CPU);

return 0;
}

Here is DSMC_kernel_float.cu:

// Kernel della DSMC
#include "DSMC_kernel_float.h"

__global__ void CollisioniGPU(stato_struct *coll_GPU){

coll_GPU[0].vAx=1;  
coll_GPU[1].vAy=1;
coll_GPU[2].vAz=1;
coll_GPU[3].tetaAp=1;
coll_GPU[4].phiAp=1;
}

Here is DSMC_kernel_float.h:

__global__ void CollisioniGPU(stato_struct* coll_GPU);

However, when I type nvcc -I common/inc -rdc=true time_analysis.cu DSMC_kernel_float.cu in the terminal, I get a strange error message and I don't understand why:

DSMC_kernel_float.h(1): error: attribute "global" does not apply here

DSMC_kernel_float.h(1): error: incomplete type is not allowed

DSMC_kernel_float.h(1): error: identifier "stato_struct" is undefined

DSMC_kernel_float.h(1): error: identifier "coll_GPU" is undefined

DSMC_kernel_float.cu(4): error: variable "CollisioniGPU" has already been defined

DSMC_kernel_float.cu(4): error: attribute "global" does not apply here

DSMC_kernel_float.cu(4): error: incomplete type is not allowed

DSMC_kernel_float.cu(4): error: expected a ";"

At end of source: warning: parsing restarts here after previous syntax error

8 errors detected in the compilation of "/tmp/tmpxft_00003f1f_00000000-22_DSMC_kernel_float.cpp1.ii".

From what I have read on the internet, I believe the error is caused by the struct, but I don't understand how to fix it so that the program works properly. How is it possible that global does not apply here, when I have other examples where it seems to work just fine?

Note: common/inc is the folder provided by Nvidia in order to make Cuda compile correctly.

Ask yourself how the compiler would know the definition of `stato_struct` when it is compiling DSMC_kernel_float.cu? — talonmies, Dec 13 '14 at 08:54
do you mean i should use a header file where i define the struct? — Federico Gentile, Dec 13 '14 at 11:02

score 3 · Accepted Answer · edited May 23 '17 at 10:33

Regarding this statement:

Note: common/inc is the folder provided by Nvidia in order to make Cuda compile correctly.

That's a mischaracterization. The referenced files (cutil.h and cutil_math.h) and macros (e.g. CUT_CHECK_ERROR) were provided in fairly old CUDA releases (prior to CUDA 5.0) as part of the CUDA sample codes delivered at that time. They are not required "in order to make Cuda compile correctly." Furthermore, their use should be considered deprecated (refer to the CUDA 5.0 toolkit release notes). If you are actually using a toolkit that old, I would suggest upgrading to a newer one.

Regarding the compile issues, as @talonmies has pointed out, the compiler has no way of knowing what the definition of stato_struct is, when compiling any module that does not contain the definition (whether directly or included). This would be the case for your DSMC_kernel_float.cu module, which is where all your compile errors are coming from.

At first glance, it would seem that a sensible fix would be to move the typedef containing the stato_struct definition from your time_analysis.cu file into your header file (DSMC_kernel_float.h) and move the #include statement for that to the top of the time_analysis.cu file, along with your other includes.
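A minimal sketch of what that header could look like is shown here. The include-guard name DSMC_KERNEL_FLOAT_H and the `#ifdef __CUDACC__` wrapper are assumptions added for illustration and are not in the original code; the wrapper only matters if the header is ever included from a file that is not compiled by nvcc. The struct members are copied unchanged from time_analysis.cu.

```cpp
// DSMC_kernel_float.h (sketch)
#ifndef DSMC_KERNEL_FLOAT_H   // hypothetical include-guard name
#define DSMC_KERNEL_FLOAT_H

// Definition moved here from time_analysis.cu, so that every file
// including this header sees the complete type.
typedef struct {
    int jp1;
    int jp2;
    float kx;
    float ky;
    float kz;
} stato_struct;

// nvcc defines __CUDACC__; this optional guard (an assumption) hides the
// kernel declaration from a plain host compiler.
#ifdef __CUDACC__
__global__ void CollisioniGPU(stato_struct* coll_GPU);
#endif

#endif // DSMC_KERNEL_FLOAT_H
```

With something like this in place, time_analysis.cu would drop its own typedef of stato_struct and include the header before the global pointers that use the type.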

However, it appears that your DSMC_kernel_float.cu file believes that there are a variety of members of that stato_struct:

__global__ void CollisioniGPU(stato_struct *coll_GPU){

coll_GPU[0].vAx=1;  
coll_GPU[1].vAy=1;
coll_GPU[2].vAz=1;
coll_GPU[3].tetaAp=1;
coll_GPU[4].phiAp=1;
}

which are not part of your current definition of stato_struct:

typedef struct{
int jp1;
int jp2;
float kx;
float ky;
float kz;

} stato_struct;

So this is confusing code, and I don't think anyone else can sort that out for you. You will either need two separate struct definitions, with separate names, or else you will need to modify your stato_struct definition to include those members (.vAx, .vAy, .vAz, .tetaAp, .phiAp).
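As a sketch of the second option, the definition could be extended as follows. The type of the five added members is an assumption (float, chosen only because the kernel assigns them the value 1 and the existing kx, ky and kz are float); the original post does not say what they should be.

```cpp
// Sketch: stato_struct extended with the members the kernel writes to.
typedef struct {
    int jp1;
    int jp2;
    float kx;
    float ky;
    float kz;
    // Members referenced in CollisioniGPU; float is an assumed type.
    float vAx;
    float vAy;
    float vAz;
    float tetaAp;
    float phiAp;
} stato_struct;
```

Because the kernel writes to coll_GPU[4], the device allocation needs to hold at least five elements for that access to be in bounds.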

The (mis)handling of this struct definition and the resultant errors have nothing to do with CUDA. They arise from the rules of the C/C++ language: a type must be defined before it is used, and only its declared members can be accessed.
