I am struggling with a problem that seems a little obscure.
I am working on a framework where a user can provide an implementation of an abstract base class which, after a few steps of magic and code generation, will be used inside a CUDA kernel.
I know that
"It is not allowed to pass as an argument to a __global__ function an object of a class with virtual functions."
because the object's vtable pointer refers to host addresses when the object is created on the host and then copied to the GPU. However, I do not pass the object to the kernel; I construct it inside the kernel, which should not cause the vtable issue.
class VirtualBase {
public:
    __device__ virtual int getResult() const = 0;
    __device__ virtual ~VirtualBase();
};

class Implementation : public VirtualBase {
public:
    __device__ Implementation() {}
    __device__ int getResult() const { return 42; }
    __device__ ~Implementation() {}
};

__global__ void kernel() {
    Implementation impl;
    int res = impl.getResult();
}

int main(void) {
    kernel<<<1, 1>>>();
    return 0;
}
The code is compiled with Nsight's autogenerated makefile:
/Developer/NVIDIA/CUDA-7.5/bin/nvcc -G -g -O0 -std=c++11 -gencode arch=compute_30,code=sm_30 -odir "src" -M -o "src/main.d" "../src/main.cu"
/Developer/NVIDIA/CUDA-7.5/bin/nvcc -G -g -O0 -std=c++11 --compile --relocatable-device-code=false -gencode arch=compute_30,code=compute_30 -gencode arch=compute_30,code=sm_30 -x cu -o "src/main.o" "../src/main.cu"
which results in the error
ptxas fatal : Unresolved extern function '_ZN11VirtualBaseD2Ev'
make: *** [src/main.o] Error 255
I am on a Mac with CUDA 7.5 installed, but I tried the same thing on a machine with Ubuntu 14.10 and CUDA 7.0 and got the same error.
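
For reference, the symbol in the ptxas message, `_ZN11VirtualBaseD2Ev`, demangles to `VirtualBase::~VirtualBase()`, which the class declares (`__device__ virtual ~VirtualBase();`) but never defines. Below is a minimal sketch of a variant that gives the base destructor an empty body, on the assumption that the missing definition is what ptxas is complaining about. The `out` parameter and the `runKernel` helper are hypothetical additions, there only so the result can be read back on the host. `override` and `nullptr` rely on the `-std=c++11` flag already in the build line.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

class VirtualBase {
public:
    __device__ virtual int getResult() const = 0;
    // Assumed fix: an empty inline body, so device code for the
    // base destructor exists in this translation unit.
    __device__ virtual ~VirtualBase() {}
};

class Implementation : public VirtualBase {
public:
    __device__ Implementation() {}
    __device__ int getResult() const override { return 42; }
    __device__ ~Implementation() override {}
};

// The object is constructed on the device, so its vtable pointer
// refers to device code. 'out' is a hypothetical output parameter.
__global__ void kernel(int *out) {
    Implementation impl;
    const VirtualBase &base = impl;  // call through the base interface
    *out = base.getResult();
}

// Hypothetical helper: launches the kernel and returns the value it
// wrote, or -1 if any CUDA call reports an error.
int runKernel() {
    int *dOut = nullptr;
    int hOut = -1;
    if (cudaMalloc(&dOut, sizeof(int)) != cudaSuccess) return -1;
    kernel<<<1, 1>>>(dOut);
    cudaError_t err = cudaGetLastError();  // catches launch failures
    if (err == cudaSuccess) {
        // cudaMemcpy waits for the kernel to finish before copying.
        err = cudaMemcpy(&hOut, dOut, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(dOut);
    return err == cudaSuccess ? hOut : -1;
}

int main() {
    std::printf("result = %d\n", runKernel());
    return 0;
}
```

The only change relevant to the error is the `{}` on `~VirtualBase()`; everything else is scaffolding to observe the result. Because the original build uses `--relocatable-device-code=false`, every device function has to be resolved within the single translation unit, so a declared-but-undefined `__device__` destructor cannot be left for a later link step.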