I recently started building an application that uses CUDA 8.0 in Visual Studio 2015. Because I need Dynamic Parallelism, I changed the Code Generation setting from compute_20, sm_20 (the default) to compute_35, sm_35. Since that change, printf() calls inside a kernel no longer print anything. Is there a way to use Dynamic Parallelism and still print from inside a kernel?
It may be worth mentioning that my graphics card is a GeForce GTX 760.
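
For reference, here is a minimal sketch of the kind of program in question. The kernel names parentKernel and childKernel are hypothetical and not taken from my actual project. The host code also prints the device's compute capability and checks the errors returned by the launch and by synchronization, which may help show whether the kernel runs at all.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical child kernel, launched from the device.
__global__ void childKernel()
{
    printf("child: thread %d\n", threadIdx.x);
}

// Hypothetical parent kernel that uses Dynamic Parallelism
// to launch childKernel from device code.
__global__ void parentKernel()
{
    printf("parent: launching child\n");
    childKernel<<<1, 2>>>();
}

int main()
{
    // Report the compute capability of device 0.
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Device: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);

    parentKernel<<<1, 1>>>();

    // A failed launch is reported here rather than by the launch itself.
    err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Device-side printf output is flushed at synchronization points,
    // and errors raised while the kernel runs are reported here.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceSynchronize: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Assuming the file is saved as example.cu (a hypothetical name), a command-line build would look something like `nvcc -arch=sm_35 -rdc=true example.cu -lcudadevrt`. In Visual Studio the equivalent should be setting Generate Relocatable Device Code to Yes and adding cudadevrt.lib to the linker inputs.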