I am having problems using cudaMemcpyToSymbol. I have code that works fine; a cut-down version is shown below.
mykernel.h file:
__global__
void foo(float* out);
mykernel.cu file:
#include <stdint.h>    // uint32_t
#include "mykernel.h"
__global__
void foo(float* out)
{
    // one output element per thread
    uint32_t idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = 10;
}
main.cu file:
#include "mykernel.h"
int main()
{
    // initialization and declaration stuff here
    foo<<<1,1,1>>>(my_global_memory);
    // read back global memory and investigate values
}
The above code works perfectly. Now I want to replace the hard-coded value 10 with a value read from constant memory. What I did was:
- Add
  __constant__ float my_const_var;
  to the mykernel.h file.
- Replace the last line of my kernel in mykernel.cu with
  out[idx] = my_const_var;
- Add
  float value = 10.0f;
  cudaMemcpyToSymbol(my_const_var,&value);
  before the kernel invocation in main.cu.
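For comparison, the intended behaviour can be sketched in a single file, where the `__constant__` definition, the kernel, and the `cudaMemcpyToSymbol` call all sit in one translation unit. This is only a minimal sketch: the helper name `run_foo`, the explicit `sizeof(float)` size argument, and the `<<<1, 1>>>` launch configuration are assumptions for illustration and are not taken from the code above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Constant-memory variable, defined in the same translation unit as the
// kernel that reads it and the host code that writes it.
__constant__ float my_const_var;

__global__ void foo(float* out)
{
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    out[idx] = my_const_var;
}

// Hypothetical helper (not in the original post): copies `value` into
// my_const_var, runs foo on a single thread, and returns what the kernel
// wrote. Error checking is omitted here to keep the sketch short.
float run_foo(float value)
{
    float* d_out = NULL;
    float result = -1.0f;  // sentinel, overwritten by the device-to-host copy

    cudaMalloc((void**)&d_out, sizeof(float));
    cudaMemcpyToSymbol(my_const_var, &value, sizeof(float));
    foo<<<1, 1>>>(d_out);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return result;
}

int main()
{
    printf("out[0] = %f\n", run_foo(10.0f));
    return 0;
}
```

If this single-file version reports the expected value while the three-file layout reports 0, the difference probably lies in how the symbol is shared across the .cu files rather than in the copy itself.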
After making these changes, it looks like cudaMemcpyToSymbol does not copy the value, because I get a result of 0 instead of 10. I always check for CUDA errors, and there are none. I also tried running cuda-memcheck, and it reports no errors either.

Can someone tell me what I am doing wrong, and why cudaMemcpyToSymbol does not copy the value to the symbol?

I am using a GeForce 9600M (compute capability 1.1) with the latest drivers on Debian Linux and CUDA SDK 5.0.
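One way to narrow the problem down is to check the return value of every runtime call and read the symbol back from the host with `cudaMemcpyFromSymbol`. The sketch below assumes a hypothetical `CUDA_CHECK` macro and a hypothetical helper `roundtrip_constant`; neither appears in the code above. Note that this round trip only shows the symbol as seen from the file that makes the calls. It does not by itself show what a kernel compiled in a different .cu file reads.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__constant__ float my_const_var;

// Hypothetical helper macro: print the CUDA error string and abort if a
// runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,         \
                    cudaGetErrorString(err_));                         \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// Hypothetical helper: writes `value` to the constant symbol, then reads the
// symbol back into host memory and returns it.
float roundtrip_constant(float value)
{
    float readback = -1.0f;  // sentinel, overwritten on success
    CUDA_CHECK(cudaMemcpyToSymbol(my_const_var, &value, sizeof(float)));
    CUDA_CHECK(cudaMemcpyFromSymbol(&readback, my_const_var, sizeof(float)));
    return readback;
}

int main()
{
    printf("my_const_var read back as %f\n", roundtrip_constant(10.0f));
    return 0;
}
```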