I have an iterative CUDA program that computes new values on each iteration. The code is confidential, so I can't share it, but I would like to discuss the problem.

The program runs correctly on my PC with a small data set, no matter how many times I run it. I allocate and deallocate memory properly.

With a large data set, however, it runs correctly the first time but fails on subsequent runs with the error "****.exe has stopped working.....". The same error persists until I restart the PC, every time. Restarting the PC before each run is not feasible. What might be the reason behind this?

pnuts
Roshan

1 Answer

Most likely a memory error.

You should try running your program under `cuda-memcheck`; it will make any memory errors obvious.

Another option is to add error handling to your code, which would catch problems as they arise.
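As a rough illustration of that kind of error handling, here is a minimal sketch that wraps every runtime API call and checks the kernel launch. The `CUDA_CHECK` macro and the `scale` kernel are hypothetical names made up for this example; they are not taken from the asker's code, and the buffer size is arbitrary.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (an assumption for illustration): evaluates a
// CUDA runtime call and aborts with file/line information if it failed.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error %s at %s:%d: %s\n",      \
                         cudaGetErrorName(err_), __FILE__, __LINE__,  \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel standing in for one iteration of the real program.
__global__ void scale(float *data, size_t n, float factor) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) {  // guard against out-of-bounds writes in the last block
        data[i] *= factor;
    }
}

int main() {
    const size_t n = 1 << 20;  // arbitrary example size
    float *d_data = nullptr;

    // Allocation failures (e.g. out of device memory) are reported here.
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);
    scale<<<blocks, threads>>>(d_data, n, 2.0f);

    // Kernel launches return no status, so check explicitly:
    CUDA_CHECK(cudaGetLastError());       // invalid launch configuration
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised during execution

    CUDA_CHECK(cudaFree(d_data));
    return 0;
}
```

With checks like these in place, a failing allocation or an out-of-bounds access in a kernel should surface at the call that caused it rather than as an unexplained crash later on.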

Community
inJeans
  • I am a little bit confused. You can't have `cuda-memcheck` _in_ your program. You can _run_ your program with `cuda-memcheck my_program`. Have you tried the error handling I suggested? You essentially need to put it around every CUDA runtime API call and kernel launch in your code. It will definitely alert you to the root cause of the problem. – inJeans Sep 25 '15 at 10:22
  • I am sorry about the last comment. I have properly used `cudaFreeHost` and `cudaFree` in my program, and I make proper use of CudaSafeCall. A memory error is less likely. The problem, as I said, is that it crashes on the second run (not the first run) with a large data set. For small data sets, it works fine. – Roshan Sep 29 '15 at 08:18
  • Sounds like a tricky one. I would guess it to be a pointer error. Unfortunately it is not really something you can get help for here. My only suggestion is a careful diagnosis of which size data set will cause the problem and some thorough testing of your implementation. Good luck. – inJeans Sep 29 '15 at 23:02