
I'm trying to copy a 2D array from the CPU to the GPU. On the host side I pass the base pointer of the 2D array; P is the number of elements in one dimension.

 int *d_a;
 cudaMalloc(d_a,P*P*sizeof(int));

 copyKernelHostToDevice((int(*)[P])d_a,(int(*)[P])hAligned_a);
 copyKernelHostToDevice((int(*)[P])d_b,(int(*)[P])hAligned_b);

 inline void copyKernelHostToDevice(int (*A)[P],int (*B)[P]){
     for(int i=0;i<P;i++)
         cutilSafeCall(cudaMemcpyAsync(A[i],B[i],P*sizeof(int),cudaMemcpyHostToDevice));
 }

but the code above gives me a runtime error:

cudaSafeCall() Runtime API error 11: invalid argument.

Am I missing something? P is fairly large, around 2048.

username_4567
  • If you are getting an invalid argument error, it probably means that B[i] is not a valid device pointer. Can you edit your question to explain where B is allocated, and what CUDA version you are using? – talonmies Mar 28 '12 at 09:59
  • Which is the host pointer array and device pointer array among A and B? Have you allocated device memories for all the P pointers in the device pointer array using cudaMalloc? – Ashwin Nanjappa Mar 28 '12 at 10:03
  • I've added the host-side code above; d_a is the device pointer. So basically I'm allocating a 1D array on the GPU and using it as a 2D array by typecasting it. – username_4567 Mar 28 '12 at 11:06
  • @user997704: that isn't very helpful. Can you show where the pointers are defined and allocated? – talonmies Mar 28 '12 at 14:15
  • @user997704: Your `cudaMalloc` call is probably wrong. Can you confirm that is the code you are really using? – talonmies Mar 29 '12 at 14:45
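
For readers following the typecasting idea in the comments: a minimal host-only sketch in plain C, assuming a small hypothetical P and using memcpy as a stand-in for the CUDA copy. It only illustrates that indexing a flat buffer through an `int (*)[P]` view walks the same contiguous memory, so P row copies and one bulk copy should give the same result. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define P 4  /* hypothetical small size; the question uses about 2048 */

/* Row-by-row copy through a pointer-to-array view, shaped like the
   loop in the question (memcpy stands in for cudaMemcpyAsync). */
void copy_rows(int (*dst)[P], int (*src)[P])
{
    assert(dst != NULL && src != NULL);
    for (int i = 0; i < P; i++)
        memcpy(dst[i], src[i], P * sizeof(int)); /* dst[i] is (int *)dst + i * P */
}

/* The same P * P ints copied in a single call. */
void copy_flat(int *dst, const int *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, (size_t)P * P * sizeof(int));
}

/* Returns 1 if both strategies produce identical buffers, 0 otherwise. */
int copies_match(void)
{
    size_t bytes = (size_t)P * P * sizeof(int);
    int *src = malloc(bytes);
    int *a = malloc(bytes);
    int *b = malloc(bytes);
    int same = 0;

    if (src != NULL && a != NULL && b != NULL) {
        for (int i = 0; i < P * P; i++)
            src[i] = i;                      /* flat 1D fill */
        copy_rows((int (*)[P])a, (int (*)[P])src);
        copy_flat(b, src);
        same = (memcmp(a, b, bytes) == 0);
    }
    free(src);
    free(a);
    free(b);
    return same;
}
```

Calling `copies_match()` from a test or a small `main` checks the equivalence on the host. If the device buffer is one contiguous allocation of P*P ints, the loop in the question could presumably be replaced by a single copy of `P*P*sizeof(int)` bytes, but that is a side note and not the cause of the error.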

1 Answer


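It looks like d_a isn't a valid device pointer, because your cudaMalloc call looks incorrect. cudaMalloc takes the address of the pointer (a void **) so that it can write the device address into it; passing d_a itself hands it an uninitialized value and leaves d_a unchanged. It should be something like this: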

int *d_a;
cudaMalloc((void **)&d_a,P*P*sizeof(int));
talonmies
  • @user997704: Trust me, it does make a very big difference. – talonmies Mar 31 '12 at 08:26
  • Can you please elaborate? I'm curious to know... maybe some example or theory will do. – username_4567 Mar 31 '12 at 08:28
  • [This answer to a C programming question](http://stackoverflow.com/questions/2838038/c-programming-malloc-inside-another-function) explains what is wrong with your call to cudaMalloc (it is equivalent to calling malloc inside another function). – harrism Sep 17 '12 at 01:02
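
To make the "malloc inside another function" analogy from the last comment concrete, here is a minimal sketch in plain C. Both helper functions are hypothetical and exist only for illustration; the second mirrors the shape of `cudaMalloc(void **devPtr, size_t size)`, which needs the address of the pointer so it can store the allocation there.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical broken allocator: receives the pointer BY VALUE, so the
   assignment only changes the local copy. This is what passing d_a
   instead of &d_a amounts to. */
void alloc_by_value(int *p, size_t n)
{
    p = malloc(n * sizeof(int)); /* the caller's pointer is untouched */
    free(p);                     /* released here only to avoid a leak in this sketch */
}

/* Hypothetical working allocator: receives the ADDRESS of the pointer
   and writes the new allocation through it.
   Returns 0 on success, -1 on allocation failure. */
int alloc_by_address(int **p, size_t n)
{
    assert(p != NULL);
    *p = malloc(n * sizeof(int)); /* the caller's pointer is updated */
    return (*p != NULL) ? 0 : -1;
}
```

After `int *a = NULL; alloc_by_value(a, 8);` the caller's `a` is still NULL, whereas `alloc_by_address(&a, 8)` leaves `a` pointing at usable memory on success. The same reasoning would explain why the later copy reports an invalid argument: d_a was never set to a real device address.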