This code doesn't work in the same way when compiled with different compute capabilities:
#include <cuda.h>
#include <stdio.h>
__managed__ int m;
int main() {
printf("hi 1\n");
m = -123;
printf("hi 2\n");
}
Device with compute capability 6.0:
$ nvcc main.cu -gencode arch=compute_60,code=sm_60 -rdc=true && ./a.out
hi 1
hi 2
Device with compute capability 7.0:
$ nvcc main.cu -gencode arch=compute_60,code=sm_60 -rdc=true && ./a.out
hi 1
Segmentation fault
Device with compute capability 7.0:
$ nvcc main.cu -gencode arch=compute_70,code=sm_70 -rdc=true && ./a.out
hi 1
hi 2
Why I have Segmentation fault when building with compute capability 6.0 and run it on GPU with compute capability 7.0?