cuda 6 unified memory segmentation fault

Question

To use the unified memory feature in CUDA 6, the following requirements must be met:

a GPU with SM architecture 3.0 or higher (Kepler class or newer)
a 64-bit host application and operating system, except on Android
Linux or Windows
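
The requirements above can be probed at run time. The following is a minimal sketch, not taken from the programming guide: it assumes the installed toolkit exposes the `cudaDevAttrManagedMemory` attribute, and `supports_unified_memory` is a hypothetical helper name introduced here for illustration.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): true when the compute capability
// is at least 3.0, the device reports managed-memory support, and the host
// build uses 64-bit pointers.
static int supports_unified_memory(int cc_major, int managed_attr, int pointer_bits) {
    return cc_major >= 3 && managed_attr != 0 && pointer_bits == 64;
}

int main(void) {
    int dev = 0, major = 0, minor = 0, managed = 0;

    // Query the current device; stop at the first failing call.
    cudaError_t err = cudaGetDevice(&dev);
    if (err == cudaSuccess)
        err = cudaDeviceGetAttribute(&major, cudaDevAttrComputeCapabilityMajor, dev);
    if (err == cudaSuccess)
        err = cudaDeviceGetAttribute(&minor, cudaDevAttrComputeCapabilityMinor, dev);
    // Assumption: this attribute is available in the toolkit being used.
    if (err == cudaSuccess)
        err = cudaDeviceGetAttribute(&managed, cudaDevAttrManagedMemory, dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "device query failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    int pointer_bits = (int)(sizeof(void *) * 8);
    printf("compute capability: %d.%d\n", major, minor);
    printf("managed memory attribute: %d\n", managed);
    printf("host pointer size: %d bits\n", pointer_bits);

    return supports_unified_memory(major, managed, pointer_bits) ? 0 : 1;
}
```

A zero managed-memory attribute on a Kepler card would suggest that the platform (OS or driver), rather than the GPU, is what fails the requirements.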

My setup is:

System: ubuntu 13.10 (64-bit)
GPU: GTX770
CUDA: 6.0
Driver Version: 331.49

The sample code is taken from page 210 of the programming guide.

 __device__ __managed__ int ret[1000];
 __global__ void AplusB(int a, int b) {
    ret[threadIdx.x] = a + b + threadIdx.x;
 }
 int main() {
    AplusB<<< 1, 1000 >>>(10, 100);
    cudaDeviceSynchronize();
    for (int i = 0; i < 1000; i++)
       printf("%d: A+B = %d\n", i, ret[i]);
    return 0;
 }

The nvcc command line I used is:

nvcc -m64 -Xptxas=-Werror -arch=compute_30 -code=sm_30 -o UM UnifiedMem.cu

This code compiles without errors, but at run time it produces a segmentation fault at printf(). It looks as if the unified memory feature did not take effect: the address of ret still seems to be a GPU address, while printf is called on the CPU, so the CPU tries to access data that is not allocated on the host and faults. Can anybody help me? What is wrong here?

I don't believe Ubuntu 13.10 is [listed as a supported OS for CUDA 6 RC](https://developer.nvidia.com/cuda-pre-production). Also, any time you're having trouble with a CUDA code, it's a good idea to add [proper cuda error checking](http://stackoverflow.com/questions/14038589/what-is-the-canonical-way-to-check-for-errors-using-the-cuda-runtime-api). The code is shown without error checking for clarity of communication, not as a demonstration of best practices. I had to add `#include <stdio.h>` to compile, but it ran fine on my cc3.5 device (GTX770 is cc3.5) on RHEL 6.2 and CUDA 6.0RC. — Robert Crovella, Mar 14 '14 at 04:45
Thanks for your advice. I made a small change to the program, using cudaMallocManaged() instead of the fixed-size ret[1000]. After I added CUDA error checking to each CUDA function call, the problem turned out to be in cudaMallocManaged(), which reports "operation not supported". Does this mean CUDA 6 does not support Ubuntu? BTW, GTX770 is a cc3.0 device. — user3418271, Mar 14 '14 at 13:45
Yes, my mistake, GTX 770 is cc 3.0. I already [linked the page](https://developer.nvidia.com/cuda-pre-production) (`<-click here`) that shows which OS's are supported. CUDA 6 supports Ubuntu 13.04 and 12.04. I don't know for sure that's the problem but you may have better luck with a supported OS. — Robert Crovella, Mar 14 '14 at 13:51
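
The variant discussed in the comments above (cudaMallocManaged() plus an error check on every call) might look roughly like the sketch below. It is not the poster's actual code: `cuda_ok` is a hypothetical helper introduced here, and passing the buffer to the kernel as a pointer argument is an assumption about how the change was made.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helper: reports a failed CUDA call and returns false.
static bool cuda_ok(cudaError_t err, const char *what) {
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

__global__ void AplusB(int *ret, int a, int b) {
    ret[threadIdx.x] = a + b + threadIdx.x;
}

int main(void) {
    const int n = 1000;
    int *ret = NULL;

    // Managed allocation; if this fails, ret must not be dereferenced.
    if (!cuda_ok(cudaMallocManaged(&ret, n * sizeof(int)), "cudaMallocManaged"))
        return 1;

    AplusB<<<1, n>>>(ret, 10, 100);
    // Launch errors are reported by cudaGetLastError(), execution errors
    // by the synchronizing call that follows.
    if (!cuda_ok(cudaGetLastError(), "kernel launch")) return 1;
    if (!cuda_ok(cudaDeviceSynchronize(), "cudaDeviceSynchronize")) return 1;

    // Safe to read on the host only after the synchronize succeeded.
    for (int i = 0; i < n; i++)
        printf("%d: A+B = %d\n", i, ret[i]);

    return cuda_ok(cudaFree(ret), "cudaFree") ? 0 : 1;
}
```

With checks like these, a failing allocation stops the program with a readable message instead of a segmentation fault when the host later touches the memory.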

Grzegorz Szpetkowski · Answer 1 · 2014-03-14T19:30:58.463

Though I am not certain (and I can't check it myself right now), I think the cause is that Ubuntu 13.10 ships gcc 4.8.1, which I believe is not yet supported even by the newest CUDA Toolkit 6.0. Try compiling your code with gcc 4.7.3 as the host compiler (that is, the default compiler in the officially supported Ubuntu 13.04). To do that, you can install the gcc-4.7 package and point nvcc at /usr/bin/gcc-4.7 as its host compiler. For C++ support I believe you need g++-4.7 as well.

If you need a simple step-by-step guide, you might follow http://n00bsys0p.co.uk/blog/2014/01/23/nvidia-cuda-55ubuntu-1310-saucy-salamander. It's for CUDA Toolkit 5.5, but I think it should apply to the recent version as well.
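
To confirm which host compiler and CUDA versions a binary was actually built and run with, a small diagnostic along these lines may help. It is only a sketch: `ver_major` and `ver_minor` are hypothetical helpers, and the `__GNUC__` macros are also defined by some gcc-compatible compilers, so the printed value is a hint rather than proof.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// CUDA encodes versions as 1000 * major + 10 * minor (6000 means 6.0).
// Hypothetical helpers for decoding that value.
static int ver_major(int v) { return v / 1000; }
static int ver_minor(int v) { return (v % 1000) / 10; }

int main(void) {
    int runtime_ver = 0, driver_ver = 0;

    if (cudaRuntimeGetVersion(&runtime_ver) != cudaSuccess ||
        cudaDriverGetVersion(&driver_ver) != cudaSuccess) {
        fprintf(stderr, "could not query CUDA versions\n");
        return 1;
    }

#if defined(__GNUC__)
    // Version of the host compiler that nvcc handed the host code to.
    printf("host compiler: gcc %d.%d.%d\n",
           __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#endif
    printf("CUDA runtime: %d.%d\n", ver_major(runtime_ver), ver_minor(runtime_ver));
    // Highest CUDA version supported by the installed driver.
    printf("CUDA driver supports up to: %d.%d\n",
           ver_major(driver_ver), ver_minor(driver_ver));
    return 0;
}
```

If the first line still reports 4.8 after switching compilers, nvcc is probably not picking up gcc-4.7 as intended.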
