
I have a problem in my code: both cudaMemcpy calls return cudaErrorInvalidValue in the following code:

const int N = 3;
double d_doc[N][4][2500];
double d_vec_res[N][2];
double d_req[4][100];
.
.
.    
void similarity(double doc[][4][2500], double req[4][100], double vec_res[][2])
{
    int r = cudaMalloc((void **)&d_req, 4*100*sizeof(double)); // r = cudaSuccess
    cudaMalloc((void **)&d_doc, N*2500*4*sizeof(double));
    cudaMalloc((void **)&d_vec_res, N*2*sizeof(double));
    int err = cudaMemcpy(d_req, req, 4*100*sizeof(double), cudaMemcpyHostToDevice); // err = cudaErrorInvalidValue
    int err2 = cudaMemcpy(d_doc, doc, N*4*2500*sizeof(double), cudaMemcpyHostToDevice); // err2 = cudaErrorInvalidValue
    sim<<<1, N>>>(d_doc, d_req, d_vec_res);
    cudaMemcpy(vec_res, d_vec_res, 2*N*sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_req);
    cudaFree(d_doc);
    cudaFree(d_vec_res);
}

Could you help me, please?

Fate7
  • There are two cudaMemcpy calls in that code. Should we guess which one is reporting the error? – talonmies Feb 17 '16 at 12:36
  • Both of them! Thank you for your comment. – Fate7 Feb 17 '16 at 12:37
  • Please extend your code to a [mcve]; in your case the focus should be on _complete_. It would also help to use [error checking](http://stackoverflow.com/a/14038590/5085250) on the relevant part of your question, i.e. the `cudaMemcpy` calls. This makes it easy to verify the behaviour. – havogt Feb 17 '16 at 13:02
  • @havogt thank you for your suggestions – Fate7 Feb 17 '16 at 13:31
  • @Fate7: Your edit to the code neither provides compilable code, nor makes any sense. If `d_doc` is a statically defined array, why are you passing it to cudaMalloc? – talonmies Feb 17 '16 at 13:46
  • Your code is clearly broken. As @talonmies has pointed out, the pointers for `d_doc`, `d_vec_res` and `d_req` are not correctly defined. Have you looked at any basic sample codes to see how this is done? Take a look at the vectorAdd sample code. And a solution probably can't be given or discussed without knowing how you intend to use those data structures (arrays) in device code. The solution might be as simple as defining `double *d_doc, *d_vec_res, *d_req;` instead. But if you are attempting multiple-subscripted access in your `sim` kernel, that won't work either. – Robert Crovella Feb 17 '16 at 14:20

1 Answer

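`d_doc`, `d_vec_res` and `d_req` are declared as statically sized host arrays, so passing their addresses to `cudaMalloc` does not turn them into device pointers, and the `cudaMemcpy` calls are then given host addresses as their destination. Just declare them as pointers:

double *d_doc, *d_vec_res, *d_req;

and in your kernel, access those arrays as linear (one-dimensional) arrays, not as multidimensional arrays.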
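
As a rough sketch of what that could look like, the code below reworks `similarity` with plain device pointers and flattened indexing. The body of `sim` is not shown in the question, so the kernel here is only a hypothetical stand-in (a per-document dot product against `req`); the names `ROWS`, `DOC_LEN`, `REQ_LEN` and the `CUDA_CHECK` macro are likewise assumptions introduced for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

const int N = 3;            // number of documents, as in the question
const int ROWS = 4;         // hypothetical names for the extents 4, 2500, 100
const int DOC_LEN = 2500;
const int REQ_LEN = 100;

// Hypothetical helper: abort with a readable message if a runtime call fails.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t e_ = (call);                                    \
        if (e_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,      \
                    cudaGetErrorString(e_));                        \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Placeholder kernel (assumption: the real `sim` does something else).
// Thread n sums doc[n][i][j] * req[i][j] over all i and j < REQ_LEN,
// using flattened indices instead of multiple subscripts.
__global__ void sim(const double *doc, const double *req, double *vec_res)
{
    int n = threadIdx.x;
    if (n >= N) return;
    double dot = 0.0;
    for (int i = 0; i < ROWS; ++i)
        for (int j = 0; j < REQ_LEN; ++j)
            dot += doc[(n * ROWS + i) * DOC_LEN + j]   // doc[n][i][j]
                 * req[i * REQ_LEN + j];               // req[i][j]
    vec_res[n * 2 + 0] = dot;   // vec_res[n][0]
    vec_res[n * 2 + 1] = n;     // vec_res[n][1]
}

void similarity(double doc[][4][2500], double req[4][100], double vec_res[][2])
{
    // Device pointers, not statically sized host arrays.
    double *d_doc = nullptr, *d_req = nullptr, *d_vec_res = nullptr;

    CUDA_CHECK(cudaMalloc((void **)&d_req, ROWS * REQ_LEN * sizeof(double)));
    CUDA_CHECK(cudaMalloc((void **)&d_doc, N * ROWS * DOC_LEN * sizeof(double)));
    CUDA_CHECK(cudaMalloc((void **)&d_vec_res, N * 2 * sizeof(double)));

    // Host arrays are contiguous, so they can be copied as flat buffers.
    CUDA_CHECK(cudaMemcpy(d_req, req, ROWS * REQ_LEN * sizeof(double),
                          cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_doc, doc, N * ROWS * DOC_LEN * sizeof(double),
                          cudaMemcpyHostToDevice));

    sim<<<1, N>>>(d_doc, d_req, d_vec_res);
    CUDA_CHECK(cudaGetLastError());   // reports launch errors

    // This blocking copy also waits for the kernel to finish.
    CUDA_CHECK(cudaMemcpy(vec_res, d_vec_res, N * 2 * sizeof(double),
                          cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaFree(d_req));
    CUDA_CHECK(cudaFree(d_doc));
    CUDA_CHECK(cudaFree(d_vec_res));
}
```

The host-side signature is unchanged, so existing callers can keep passing their multidimensional arrays; only the device side works on flat buffers. Checking every return value, as suggested in the comments, makes it clear which call fails first.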

talonmies
Sullivan Risk