
I have a `__device__` pointer variable. A kernel allocates an array on the device, stores its address in that variable, and fills it, but I cannot get the data back to the host: cudaMemcpy() returns cudaErrorInvalidValue. How can I do it?

PS: The code is just an example. I know that in this particular case I could use cudaMalloc because the size of the array is known, but in my real code the size is computed on the device and the memory has to be allocated immediately.

PS2: I found a similar question, but I still don't know how to solve my problem: "copy data which is allocated in device from device to host"

PS3: I have updated the code, but it still doesn't work :{

PS4: I also tried running this code on a notebook with an Nvidia GT 520MX (latest game driver), and it doesn't work there either :(

thx

#include <cuda.h>
#include <stdio.h>

#define N 400
__device__ int* d_array;

__global__ void allocDeviceMemory()
{
    d_array = new int[N];
    for(int i=0; i < N; i++)
         d_array[i] = 123;
}

int main()
{
    allocDeviceMemory<<<1, 1>>>();

    cudaDeviceSynchronize();

    int* d_a = NULL;
    cudaMemcpyFromSymbol((void**)&d_a, "d_array", sizeof(d_a), 0, cudaMemcpyDeviceToHost);
    printf("gpu adress: %p\n", (void*)d_a);  // %p is the matching format for a pointer


    int* h_array = (int*)malloc(N*sizeof(int));
    cudaError_t errr = cudaMemcpy(h_array, d_a, N*sizeof(int), cudaMemcpyDeviceToHost);
    printf("h_array: %d, %d\n", h_array[0], errr);

    getchar();
    return 0;
}
SRhm
Milan

2 Answers


You need to synchronize (cudaDeviceSynchronize()) after launching the kernel that allocates the memory, before the host reads d_array.

Can you also check the return value of the sync and all other CUDA API calls?
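As an illustration of checking every return value, here is a minimal sketch. `CUDA_CHECK` and the `fill` kernel are hypothetical names introduced for this example, not part of the CUDA API or of the question's code, and the buffer here comes from cudaMalloc rather than in-kernel `new`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of CUDA): abort with a message
// if a runtime API call does not return cudaSuccess.
#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err_ = (call);                                  \
        if (err_ != cudaSuccess) {                                  \
            fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__,     \
                    __LINE__, #call, cudaGetErrorString(err_));     \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)

// Hypothetical kernel used only to have something to launch.
__global__ void fill(int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = 123;
}

int main()
{
    const int n = 400;
    int* d_buf = NULL;
    CUDA_CHECK(cudaMalloc(&d_buf, n * sizeof(int)));

    fill<<<(n + 127) / 128, 128>>>(d_buf, n);
    CUDA_CHECK(cudaGetLastError());       // reports launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    int* h_buf = (int*)malloc(n * sizeof(int));
    CUDA_CHECK(cudaMemcpy(h_buf, d_buf, n * sizeof(int), cudaMemcpyDeviceToHost));
    printf("h_buf[0]: %d\n", h_buf[0]);

    free(h_buf);
    CUDA_CHECK(cudaFree(d_buf));
    return 0;
}
```

Wrapping the calls this way shows which call failed first, which is more informative than only printing the result of the final cudaMemcpy.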

Tom

I have tested your code and I get no error here. I am running CUDA 4.0.

talonmies
brano
  • What??? I have CUDA 4.1, the developer driver, Win7 x64, VS 2008, and a GTS450. Can someone else test the code? – Milan Feb 06 '12 at 10:24
  • Do you add any extra compiler parameters? I just set the GPU architecture to "sm_21" and the target machine platform to x64, and I am still getting the same error. This is weird! – Milan Feb 06 '12 at 11:21
  • I am using Win7 x64 and VS 2010, running on a GTX580, with the GPU architecture set to sm_20 and an x64 target machine. When I run it, it prints the correct value: h_array: 123, 0. – brano Feb 06 '12 at 11:49
  • Here is a link to my exe file. Can you test it for me? I think the problem may be in my HW or drivers: http://www.2shared.com/file/1jpULcdP/CUDA.html – Milan Feb 06 '12 at 12:39
  • Sorry, but I will not be able to run it because I am still on CUDA 4.0 and don't have the correct dll files. If you upload the correct dll files as well, I could try it. One of the dll files is cudart64_41_28.dll – brano Feb 06 '12 at 12:41
  • Here is the dll (I hope that's all): http://www.2shared.com/file/zLF7Sw2Q/cudart64_41_28.html – Milan Feb 06 '12 at 12:49
  • Hi, I was able to run it this time, but the results are strange. Everything is 0 and the last two lines are "gpu_adress:0" and "h_array:6128016, 35". It could be because I am using the driver recommended for CUDA 4.0. – brano Feb 06 '12 at 12:54
  • Thanks. I don't think this is a good way to go. I will try to install the latest driver. Right now I am using the developer driver recommended for 4.1, but maybe I will have luck. – Milan Feb 06 '12 at 12:58
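For readers who hit the same cudaErrorInvalidValue: the CUDA C Programming Guide notes that memory allocated with in-kernel malloc()/new lives in the device heap and cannot be passed to runtime API calls such as cudaMemcpy. One possible workaround, sketched below under that assumption, is to copy the data inside a kernel into a buffer obtained from cudaMalloc and then copy that buffer to the host. The kernels `copyToStaging` and `freeDeviceMemory` and the `check` helper are hypothetical names for this sketch. It also assumes the size (N) is known on the host; in code where the size is computed on the device, it would first have to be read back, for example through another `__device__` variable.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define N 400
__device__ int* d_array;  // will point into the device heap

__global__ void allocDeviceMemory()
{
    d_array = new int[N];
    for (int i = 0; i < N; i++)
        d_array[i] = 123;
}

// Hypothetical helper kernel: copy the heap data into a cudaMalloc'd buffer.
__global__ void copyToStaging(int* staging, int n)
{
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        staging[i] = d_array[i];
}

// Hypothetical helper kernel: release the in-kernel allocation.
__global__ void freeDeviceMemory()
{
    delete[] d_array;
    d_array = NULL;
}

// Hypothetical helper: stop on the first failing runtime call.
static void check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

int main()
{
    allocDeviceMemory<<<1, 1>>>();
    check(cudaDeviceSynchronize(), "allocDeviceMemory");

    // Staging buffer allocated by the runtime, so cudaMemcpy accepts it.
    int* d_staging = NULL;
    check(cudaMalloc(&d_staging, N * sizeof(int)), "cudaMalloc");

    copyToStaging<<<1, 128>>>(d_staging, N);
    check(cudaDeviceSynchronize(), "copyToStaging");

    int* h_array = (int*)malloc(N * sizeof(int));
    check(cudaMemcpy(h_array, d_staging, N * sizeof(int),
                     cudaMemcpyDeviceToHost), "cudaMemcpy");
    printf("h_array[0]: %d\n", h_array[0]);

    freeDeviceMemory<<<1, 1>>>();
    check(cudaDeviceSynchronize(), "freeDeviceMemory");
    check(cudaFree(d_staging), "cudaFree");
    free(h_array);
    return 0;
}
```

In-kernel `new` requires compute capability 2.0 or higher, which matches the sm_20 and sm_21 targets mentioned in the comments.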