cudaMemset fails on device variable

Question

I am having trouble using cudaMemset on a device variable. Is it possible to pass a reference to the device variable to cudaMemset, or is this just a matter of missing compiler flags or libraries? I am using CUDA 4.1 with the following driver:

NVRM version: NVIDIA UNIX x86_64 Kernel Module 285.05.33 Thu Jan 19 14:07:02 PST 2012

This is my sample code:

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// device variable and kernel
__device__ float d_test;

int main() {

  if (cudaMemset(&d_test,0,sizeof(float)) !=cudaSuccess)
        printf("Error!\n");
}

which outputs:

Error!

cudaGetSymbolAddress doesn't work for me. Do I need to add a compiler flag? I copied the code above, but it reports: GPUassert: invalid device symbol XXXX.cu 24 – worldterminator, Aug 29 '12 at 16:48

talonmies · Accepted Answer · 2018-06-21T09:01:06.277

Your problem is that d_test (as it appears in the host symbol table) isn't a valid device address and the runtime cannot access it directly. The solution is to use the cudaGetSymbolAddress API function to read the address of the device symbol from the context at runtime. Here is a slightly expanded version of your demonstration case which should work correctly:

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// device variable
__device__ float d_test;

inline void gpuAssert(cudaError_t code, const char * file, int line, bool Abort=true)
{
    if (code != cudaSuccess) {
        fprintf(stderr, "GPUassert: %s %s %d\n", cudaGetErrorString(code),file,line);
        if (Abort) exit(code);
    }       
}

#define gpuErrchk(ans) { gpuAssert((ans), __FILE__, __LINE__); }

int main()
{

    float * _d_test;

    gpuErrchk( cudaFree(0) );
    gpuErrchk( cudaGetSymbolAddress((void **)&_d_test, "d_test") );
    gpuErrchk( cudaMemset(_d_test,0,sizeof(float)) );

    gpuErrchk( cudaDeviceReset() );

    return 0;
}

Here, we read the address of the device symbol d_test from the context into a host pointer _d_test, which can then be passed to host-side API functions such as cudaMemset and cudaMemcpy.

Edit: note that the string-based form of cudaGetSymbolAddress shown in this answer was deprecated and has since been removed from the CUDA runtime API. In modern CUDA, pass the symbol itself rather than its name as a string:

gpuErrchk( cudaGetSymbolAddress((void **)&_d_test, d_test) );
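
For reference, the symbol-based call can be combined with cudaMemset and a read-back to check the result. This is only a sketch, assuming a toolkit that accepts the symbol directly. The kernel setTest and the helper zeroAndReadBack are illustrative names and are not part of the original question or answer.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device variable from the question.
__device__ float d_test;

// Illustrative kernel (not from the original post): gives d_test a
// non-zero value so that the later memset has a visible effect.
__global__ void setTest(float v) { d_test = v; }

// Hypothetical helper: zeroes d_test through its device address and
// copies the result back into *out. Returns the first error encountered.
cudaError_t zeroAndReadBack(float *out)
{
    float *d_ptr = nullptr;
    cudaError_t err;

    setTest<<<1, 1>>>(3.0f);
    if ((err = cudaGetLastError()) != cudaSuccess) return err;
    if ((err = cudaDeviceSynchronize()) != cudaSuccess) return err;

    // Pass the symbol itself, not its name as a string.
    err = cudaGetSymbolAddress((void **)&d_ptr, d_test);
    if (err != cudaSuccess) return err;

    // d_ptr is now an ordinary device pointer usable by host-side APIs.
    err = cudaMemset(d_ptr, 0, sizeof(float));
    if (err != cudaSuccess) return err;

    return cudaMemcpy(out, d_ptr, sizeof(float), cudaMemcpyDeviceToHost);
}
```

Calling zeroAndReadBack(&h_test) from main should return cudaSuccess and leave h_test equal to 0.0f, since setting every byte of a float to zero yields 0.0f.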

Thanks! this works.. good to know about the 'cudaGetSymbolAddress' function to extract symbol address from the context. — nganesan, Apr 17 '12 at 02:00

score 0 · Answer 2 · edited May 23 '17 at 11:51

I believe you can also use cudaMemcpyFromSymbol. A function such as the following kernel can change the value of a variable declared in global memory (outside of the main function):

__global__ void kernel1() { d_test = 1.0; }

Inside your main, you can obtain the value using cudaMemcpyFromSymbol:

cudaMemcpyFromSymbol(&h_test,"d_test",sizeof(float),0,cudaMemcpyDeviceToHost);

Of course, there is also cudaMemcpyToSymbol to change the value of the global variable.
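
In current toolkits, both functions take the symbol itself rather than its name as a string. A rough sketch of the full round trip under that assumption follows. The kernel addOne and the helper roundTrip are illustrative names that do not appear in the original answer.

```cuda
#include <cuda_runtime.h>

// Device variable from the question.
__device__ float d_test;

// Illustrative kernel: modifies the device variable in place.
__global__ void addOne() { d_test += 1.0f; }

// Hypothetical helper: writes `in` to d_test, runs the kernel, and reads
// the updated value back into *out. Returns the first error encountered.
cudaError_t roundTrip(float in, float *out)
{
    cudaError_t err;

    // Host -> device symbol. Offset and direction use their defaults
    // (0 and cudaMemcpyHostToDevice).
    err = cudaMemcpyToSymbol(d_test, &in, sizeof(float));
    if (err != cudaSuccess) return err;

    addOne<<<1, 1>>>();
    err = cudaGetLastError();
    if (err != cudaSuccess) return err;

    // Device symbol -> host. This blocking copy is issued on the default
    // stream, so it runs after the kernel has finished.
    return cudaMemcpyFromSymbol(out, d_test, sizeof(float), 0,
                                cudaMemcpyDeviceToHost);
}
```

With this helper, roundTrip(2.0f, &h_test) should leave h_test equal to 3.0f.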

The idea came from here: Having problem assigning a device variable in CUDA

answered Oct 24 '12 at 04:10 by lucky85dog

Note that the form of `cudaMemcpyFromSymbol` call shown in this answer is long deprecated and isn't supported in either CUDA 5 or CUDA 6 – talonmies May 18 '14 at 13:17
