I have never used CUDA or C++ before, but I am trying to get Ramses GPU from http://www.maisondelasimulation.fr/projects/RAMSES-GPU/html/download.html running.
Due to an error in the autogen.sh I used ./configure and got this one working.
So the makefile produced contains the following NVCC flags
NVCCFLAGS = -gencode=arch=compute_10,code=sm_10 -gencode=arch=compute_11,code=sm_11 -gencode=arch=compute_13,code=sm_13 -gencode=arch=compute_20,code=sm_20 -gencode=arch=compute_20,code=compute_20 -use_fast_math -O3
But when I try to compile the program using make, I get multiple ptxas Errors:
Entry function '_Z30kernel_viscosity_forces_3d_oldPfS_S_S_iiiiiffff' uses too much shared data (0x70d0 bytes + 0x10 bytes system, 0x4000 max)
Entry function '_Z26kernel_viscosity_forces_3dPfS_S_S_iiiiiffff' uses too much shared data (0x70d0 bytes + 0x10 bytes system, 0x4000 max)
Entry function '_Z32kernel_viscosity_forces_3d_zslabPfS_S_S_iiiiiffff9ZslabInfo' uses too much shared data (0x70e0 bytes + 0x10 bytes system, 0x4000 max)
I'm trying to compile this code on Linux with Kernel 2.6 and CUDA 4.2 (I try to do it in my university and they are not upgrading stuff regularly.) on two NVIDIDA C1060. I tried replacing the sm_10
, sm_11
and sm_13
by sm_20
, (I saw this fix here: Entry function uses too much shared data (0x8020 bytes + 0x10 bytes system, 0x4000 max) - CUDA error) but that didn't fix my problem.
Do you have any suggestions? I can upload the Makefile
as well as everything else, if you need it.
Thank you for your help!