
I'm very new to C++, and even newer to CUDA, so I apologize if this question has obviously been answered somewhere else. I searched the existing answers as best I could, but the closest answer I could find to my question was this one.

However, that answer deals with passing a 2D array into CUDA memory, which is more complicated than what I'm trying to do (I think).

I know that in order to pass "standard" arrays into a CUDA kernel you can do this:

int array[size];
int *pointer;

cudaMalloc((void**) &pointer, size*sizeof(int)); 
cudaMemcpy(pointer, array, size*sizeof(int), cudaMemcpyHostToDevice);

Then in my kernel I receive it like this:

__global__ void kernel(int *array){
  int i = blockIdx.x; // block index, used as the element index
  array[i] = whatever; // Fill the array
}

However, I ran into a problem using only the code above. I need the int array to be 1920*1080*4 bytes long (image processing stuff), but when I make the array that size with the code above, the program crashes.

I found out from this answer that it is because I exceeded my stack size, so I learned to allocate space for the array like this:

int *differenceArray = (int*)malloc(sizeof(int)*1280*720);

But now I am confused about how to pass that into a CUDA kernel. If I try:

CUDA_CALL(cudaMalloc((void**) &differenceArray, 1280*720*sizeof(int)));
CUDA_CALL(cudaMemcpy(differenceArray, 1280 * 720*sizeof(int), cudaMemcpyHostToDevice));

I get this error:

error : argument of type "unsigned int" is incompatible with parameter of type "const void *"

Any help would be much appreciated! Thank you!

talonmies
YAHsaves

1 Answer

First of all, study how memcpy works. You use cudaMemcpy in a conceptually similar fashion: the first 3 parameters are basically identical.
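For readers who have not used memcpy before, here is a minimal host-only sketch of its argument order: destination pointer first, source pointer second, size in bytes third. The helper name `copy_ints` is hypothetical and exists only for this illustration; it is not part of the original question or of any CUDA API.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper for illustration: returns a heap-allocated copy of
// `count` ints read from `src`, or nullptr if allocation fails.
// The caller is responsible for calling std::free on the result.
int *copy_ints(const int *src, std::size_t count) {
    int *dst = static_cast<int *>(std::malloc(count * sizeof(int)));
    if (dst == nullptr) {
        return nullptr;
    }
    // memcpy(destination, source, size in BYTES), so the element count
    // must be multiplied by sizeof(int).
    std::memcpy(dst, src, count * sizeof(int));
    return dst;
}
```

The error in the question comes from leaving out the second (source) argument, so the byte count landed in the slot where a pointer was expected.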

You ran into stack trouble here:

int array[size];

So the right thing to do was not this:

int *differenceArray = (int*)malloc(sizeof(int)*1280*720);

but this:

int *array = (int*)malloc(sizeof(int)*1280*720);

(and of course delete the previous definition of array).

With that change the cudaMemcpy operation looks like this:

int *differenceArray;
CUDA_CALL(cudaMalloc((void**) &differenceArray, 1280*720*sizeof(int)));
CUDA_CALL(cudaMemcpy(differenceArray, array, 1280 * 720*sizeof(int), cudaMemcpyHostToDevice));
         //          (dev ptr)  <--- (host ptr)
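As an aside, the host buffer does not have to come from malloc. The sketch below uses std::vector, which also stores its elements on the heap and frees them automatically. The helper names `make_host_buffer` and `byte_size` are hypothetical, introduced only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a heap-backed host buffer of width*height ints.
// The elements are value-initialised, so every entry starts at 0.
std::vector<int> make_host_buffer(std::size_t width, std::size_t height) {
    return std::vector<int>(width * height);
}

// Size of the buffer in bytes, which is the unit a memcpy-style
// size argument expects (not the element count).
std::size_t byte_size(const std::vector<int> &buf) {
    return buf.size() * sizeof(int);
}
```

With a buffer built this way, `buf.data()` yields the `int*` host pointer that would take the place of `array` in the copy call above, and `byte_size(buf)` gives the matching size argument.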
Robert Crovella
  • Wow you are indeed correct. I watched a 5 min video on youtube about how memcpy works and everything makes much more sense. I appreciate your help thank you! – YAHsaves Jan 10 '18 at 03:45
  • @YAHsaves can you please take a look at https://stackoverflow.com/questions/72065248/can-i-send-an-array-through-cuda-kernel-launch?noredirect=1#comment127334840_72065248. This is similar to your question here, but you fill the array in the kernel, whereas I want to send an array that already has elements to the kernel. – Encipher Apr 30 '22 at 03:34