This is my scenario: I write my CUDA application on a Windows machine, then compile and run it on a remote Linux (Debian) server with no graphical output, connecting over SSH with PuTTY.
What is the best way to debug and profile my application in this setup? I have read about NVIDIA's Parallel Nsight and the Parallel Nsight Monitor. Is this the (only) way?