6

I am doing some programming with CUDA. I somehow corrupted the GPU memory, and the following is what I see on my screen, which is driving me crazy! Has anybody come across a similar problem before? Is there a way to fix it other than restarting the computer?

As I am debugging, I don't want to restart my computer every single time I launch the program. I would appreciate whatever advice you can provide.

By the way, the black and white dots are flashing like stars! And that's making me very dizzy!!

[Image: screenshot of the corrupted display]

Yuchen
  • 3
    I've never tried this, but since you're on Windows, if you haven't disabled the watchdog timeout mechanism, you should be able to write a CUDA program that spins forever (e.g. a while loop that never exits). Compile it to an executable. Make a shortcut to that executable on your desktop. Whenever your display gets corrupted like this, try running that executable. It should cause the display to freeze, which will trigger the Windows TDR mechanism, which will cause a GPU reset and driver reload. Like I said, I've never tried it. I don't know of any other way to reset the GPU in Windows. – Robert Crovella Feb 24 '14 at 23:14
  • 2
    http://stackoverflow.com/questions/10871412/resetting-gpu-and-driver-after-cuda-error – Roger Dahl Feb 25 '14 at 00:07
  • 1
    Which GPU are you using? I think this issue might only appear in GPUs of compute capability < 2.0. – Roger Dahl Feb 25 '14 at 00:13
  • Try downloading the latest GPU driver, the one meant for CUDA development (NVIDIA offers several driver types for each card). – TripleS Feb 25 '14 at 12:22
  • @RogerDahl, I am using NVIDIA Quadro FX 4800. – Yuchen Feb 25 '14 at 15:57
  • 1
    @YuchenZhong, the Quadro FX 4800 is a compute capability 1.3 device so you might want to try upgrading to >= 2.0 to resolve this. – Roger Dahl Feb 25 '14 at 16:38
  • @RogerDahl, I already did. I had a stack overflow bug in my CUDA code. I found that bug and everything is fine now. Thank you so much for your notes :) – Yuchen Feb 25 '14 at 17:21

2 Answers

4

In general, under Windows, there is no mechanism that lets an ordinary user reset or restart the GPU.

However, if you have not modified the Windows Vista/7/8 TDR (Timeout Detection and Recovery) mechanism on your machine, you may be able to take advantage of it in this case to force a GPU reset by the OS.

You should be able to write a CUDA program that spins forever (e.g. a while loop that never exits). Compile it to an executable. Make a shortcut to that executable on your desktop. Whenever your display gets corrupted like this, try running that executable. It should cause the display to freeze, which will trigger the Windows TDR mechanism, which will cause a GPU reset and driver reload.
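A minimal sketch of such a spinning program might look like the following. It is untested for this purpose and rests on some assumptions: the kernel name `spin` and the `flag` variable are made up for illustration, the GPU running the kernel is the one driving the display, and TDR is still enabled. The flag is read through a `volatile` pointer so that the compiler does not remove the otherwise empty loop.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: loops until *flag becomes nonzero. The host never
// sets the flag, so the kernel never finishes on its own. The volatile
// qualifier forces a real memory read on every iteration.
__global__ void spin(volatile int *flag)
{
    while (*flag == 0) { }
}

int main()
{
    int *flag = nullptr;

    cudaError_t err = cudaMalloc(&flag, sizeof(int));
    if (err != cudaSuccess) {
        printf("cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    err = cudaMemset(flag, 0, sizeof(int));
    if (err != cudaSuccess) {
        printf("cudaMemset failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Launch a single thread that spins indefinitely.
    spin<<<1, 1>>>(flag);

    // Blocks until the kernel ends. If the watchdog resets the GPU,
    // this call is expected to return with an error instead.
    err = cudaDeviceSynchronize();
    printf("Kernel ended: %s\n", cudaGetErrorString(err));

    cudaDeviceReset();
    return 0;
}
```

Compiling with something like `nvcc spin.cu -o spin` should produce the executable. If the watchdog is disabled, or the GPU is not attached to a display, the program will most likely just hang until it is killed.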

Robert Crovella
1

I had a similar problem under Linux today. As I couldn't find a way to do it properly without terminating my current graphical session, I just put my computer to sleep and restarted. It worked, and should probably work the same regardless of the operating system.

MayeulC