
Recently, I have regularly been encountering errors of the type

"An unhandled exception of type 'System.StackOverflowException' occurred in Unknown Module.".

This happens in a game (that I developed) with a quite large code base (C# / XNA). But typically the error occurs only after several minutes of gameplay (and not in every run).

Unfortunately, the Visual Studio debugger seems unable to localize the problem any further and only lets me inspect the disassembly without reference to my source lines. How can one debug such an error? I guess tools like Valgrind are not available for C#. Is there a better debugger that can show me where in the source code the problem is located?

After applying the steps suggested in the answer below, the following call stack is available:

ntdll.dll!_NtWaitForSingleObject@12()  + 0x15 bytes 
ntdll.dll!_NtWaitForSingleObject@12()  + 0x15 bytes 
KernelBase.dll!_WaitForSingleObjectEx@12()  + 0xcb bytes    
kernel32.dll!_WaitForSingleObjectExImplementation@12()  + 0x43 bytes    
clr.dll!CLREvent::CreateManualEvent()  - 0x15f3bb bytes 
clr.dll!CLREvent::CreateManualEvent()  - 0x15f37a bytes 
clr.dll!CLREvent::WaitEx()  + 0x47 bytes    
clr.dll!CLREvent::Wait()  + 0x19 bytes  
clr.dll!Thread::WaitSuspendEventsHelper()  + 0xa8 bytes 
clr.dll!Thread::WaitSuspendEvents()  + 0x17 bytes   
clr.dll!Thread::RareEnablePreemptiveGC()  + 0x181977 bytes  
clr.dll!Thread::RareDisablePreemptiveGC()  + 0x38e3 bytes   
clr.dll!Debugger::SendException()  + 0x12b bytes    
clr.dll!Debugger::LastChanceManagedException()  + 0x19f bytes   
clr.dll!NotifyDebuggerLastChance()  + 0x79 bytes    
clr.dll!WatsonLastChance()  + 0x166 bytes   
clr.dll!EEPolicy::HandleFatalStackOverflow()  + 0x189 bytes 
clr.dll!EEPolicy::HandleStackOverflow()  + 0xd8 bytes   
clr.dll!_COMPlusFrameHandler()  + 0xff302 bytes 
ntdll.dll!ExecuteHandler2@20()  + 0x26 bytes    
ntdll.dll!ExecuteHandler@20()  + 0x24 bytes 
ntdll.dll!_RtlDispatchException@8()  + 0xd3 bytes   
ntdll.dll!_KiUserExceptionDispatcher@8()  + 0xf bytes   
clr.dll!SystemNative::ArrayCopy()  + 0x19 bytes 
mscorlib.ni.dll!6ed326a2()  
Frames below may be incorrect and/or missing, no symbols loaded for mscorlib.ni.dll 
ares_games
  • Do you not even get a stack trace? – Jon Skeet Jan 07 '13 at 13:30
  • If you have source code access, attach the debugger and run it locally; it will take you to the line that threw the exception. – prthrokz Jan 07 '13 at 13:31
  • Stackoverflow **is not a forum** – Soner Gönül Jan 07 '13 at 13:31
  • @trippino "localised" and "localized" are both perfectly valid spellings. That part of your edit was bogus. It would be equally bogus for me to change it back, though. –  Jan 07 '13 at 13:43
  • I have source code access, but attaching the debugger (F5 in Visual Studio) does not take me to the line that threw the exception. It just shows the disassembly. – ares_games Jan 07 '13 at 13:44
  • Use the Debug build rather than the Release build, because the Release configuration might not export debug symbols, in which case you cannot see the code that generated the overflow. You might also not be seeing where the exception came from because it originated in code that you reference from external assemblies; you should still be able to see that in the call stack, though. – dutzu Jan 07 '13 at 14:00
  • I am already using the Debug version. Regarding external assemblies, the exception shows "Call stack location: ntdll.dll!76edf8b1()". However, I doubt that the problem really is in ntdll.dll. – ares_games Jan 07 '13 at 14:11
  • Can you try logging the state of every suspicious object and see whether, after several *successful fails*, some values are about the same? Maybe you can find the reason that way. Also try compiling your game for the Reach profile (or HiDef, whichever you are not using now). – user1306322 Jan 07 '13 at 15:09
  • Unfortunately, I cannot identify any suspicious objects without the help of a debugger. The code base is quite large (>350 *.cs files) and I have no clue what or where something could have gone wrong. Compiling for the Reach profile is complicated, as I use a lot of textures that are larger than 2048² (which is forbidden in Reach). – ares_games Jan 07 '13 at 15:27
  • Can you create a crash dump of the application when the exception is thrown and then view the call stack (.NET metadata and all) with WinDbg? I've seen SO exceptions return the wrong call stack via error reporting, and you need to switch thread contexts to see the correct stack. – Christopher Currens Jan 07 '13 at 16:54
  • Maybe using lots of huge textures makes you run out of memory and somehow shows this error message? Still, try the Reach profile; it doesn't always freak out when using big textures. – user1306322 Jan 07 '13 at 17:28
  • I am just installing WinDbg and I will try that. Regarding the use of larger textures: the application uses quite a large amount of main memory (~650MB), but if this were too much, the error should occur earlier and more reproducibly. 99% of all used memory is allocated statically in an early game initialization phase. Is there any way to obtain the "remaining available stack memory" in C#? Regarding the Reach profile in XNA, I am not sure what you mean by "freak out". The compiler very calmly ;-) just gives you an error when you use even one texture larger than 2048². – ares_games Jan 07 '13 at 18:06
  • Just to keep you updated: I luckily found a backup of my code, only 7 days old, in which the error apparently did not yet occur. Over the last few hours I reverted every larger change that I made during the last week until I was able to identify the point where something went wrong. I am happy to report that I identified the method that causes the error (commenting it out fixes the problem). However, it is a long and complex one and I am quite tired now. I am very optimistic, however, that tomorrow I will get to the bottom of this. Stay tuned. – ares_games Jan 08 '13 at 01:39
  • By the way, I was not successful in getting WinDbg to work. I was not able to properly load the symbols for multiple Windows DLLs. However, I have not invested much time in this yet. – ares_games Jan 08 '13 at 01:42
  • @JonSkeet I thought you can't get a stack trace with StackOverflowExceptions? As I understand it (and have experienced), the application is just killed and you see an entry in the Windows Event Log. – Peter Sep 11 '17 at 11:56
  • @Peter: Yup, you're right, I'm not sure what I was thinking when I wrote that comment. (It's hard to remember over 4 years ago, apart from anything else...) – Jon Skeet Sep 11 '17 at 12:11
  • @JonSkeet Okay thanks. Just wanted to make sure I wasn't missing anything. – Peter Sep 11 '17 at 12:35

2 Answers


If the crash is happening in ntdll.dll, you'd need the symbols for it, but I think the more likely possibility is that you are passing in some weird junk that is causing it to crash. Are you making any Windows API calls that might be crashing it?

Another possibility, which was mentioned by another user here, is that you could be making a recursive call somewhere that is exhausting the stack. This would be especially problematic if the calls are being made to unmanaged pieces of code:

  • Are there any logic conditions that might cause an infinite loop?
  • Are there any constructors that make unintentional recursive calls?
  • Are there any recursive methods in your code that might be stuck?

Also, a couple of things you may want to try before you go down the road for looking for an alternative way to debug:

  1. Ensure that the project is built in debug
  2. Check Visual Studio settings to be sure that it is halting on all exceptions
  3. Turn off the "just my code" setting if it is available in your project settings (does this even show up in C# projects?)
  4. Switch on mixed mode debug/unmanaged debugging
  5. Ensure that symbols are being generated and stored in the correct location (*.pdb)
  6. Failing all of that, you can poke around in the System event viewer and look for any weird errors
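
If recursion is the suspect, one way to get a managed stack trace is to fail before the stack is actually exhausted. A StackOverflowException cannot be caught and normally terminates the process, whereas `RuntimeHelpers.EnsureSufficientExecutionStack()` (available since .NET 4.0) throws a catchable `InsufficientExecutionStackException` while there is still room to unwind. The following is only a minimal sketch: `RecursionProbe` and `Descend` are hypothetical names standing in for whatever method in your code is suspected of recursing.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for a suspicious recursive method in the game code.
static class RecursionProbe
{
    public static int Descend(int depth)
    {
        // Throws InsufficientExecutionStackException if the remaining stack
        // is too small, instead of letting the process die with a
        // StackOverflowException.
        RuntimeHelpers.EnsureSufficientExecutionStack();

        // Deliberately unbounded recursion, for illustration only.
        // The "+ 1" after the call keeps this from being a tail call.
        return Descend(depth + 1) + 1;
    }

    // Runs the recursion and returns the managed stack trace of the failure.
    public static string Probe()
    {
        try
        {
            Descend(0);
            return "no overflow";
        }
        catch (InsufficientExecutionStackException ex)
        {
            // Unlike a real stack overflow, this trace names the recursing method.
            return ex.StackTrace;
        }
    }
}
```

Placing the same call at the top of the methods you suspect and logging the resulting stack trace should point at the recursing method by name, which the native call stack of a real overflow does not.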
Ray
  • Thank you for these very useful hints. I will try these soon and report back. – ares_games Jan 08 '13 at 01:40
  • After applying all the suggested settings, I still can't get specific information about what causes the problem in my code (I will get to that soon), but I do get a decent call stack (see above in my question). – ares_games Jan 08 '13 at 14:29
  • I was able to identify the problem: an unintentional recursive call causing infinite recursion in a rarely entered branch of the code. I could identify the erroneous part of the code ONLY by tedious "manual debugging" (commenting out different parts of the code and thereby narrowing down the problem). Interestingly, even after applying all the suggested steps (loading all symbols etc.), the Visual Studio debugger was not able to break at the erroneous part of my source code and step into it. – ares_games Jan 10 '13 at 00:34

StackOverflowException is usually caused by some method which invokes itself endlessly.

The fact that it occurs after some time makes me even more adamant about it: you're facing infinite recursion.

An extremely simple example of this behavior would be:

void SomeMethod()
{
    SomeMethod(); // StackOverflowException
}
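
The trivial case above overflows on every call. Recursion that sits behind a rarely true condition overflows only occasionally, which would match a crash that appears after minutes of play and not in every run. The sketch below is hypothetical (`Unit`, `Update`, `rareCondition` and `MaxDepth` are illustrative names and values, not taken from the question) and assumes the method is only ever called from a single thread. It adds a depth counter that throws an ordinary exception, with a normal stack trace, long before the stack runs out.

```csharp
using System;

// Hypothetical game object; all names here are illustrative only.
class Unit
{
    // Assumed to be far deeper than any legitimate call chain.
    const int MaxDepth = 1000;

    // Current nesting depth of Update (single-threaded use assumed).
    int depth;

    public void Update(bool rareCondition)
    {
        depth++;
        try
        {
            if (depth > MaxDepth)
            {
                // An ordinary exception: catchable, and with a usable stack trace.
                throw new InvalidOperationException(
                    "Update recursed more than " + MaxDepth + " levels");
            }

            if (rareCondition)
            {
                // The accidental self-call, reached only on the rare branch.
                Update(rareCondition);
            }
        }
        finally
        {
            // Always undo the increment so the counter is back at 0 after unwinding.
            depth--;
        }
    }
}
```

With such a guard in place, `new Unit().Update(false)` returns normally, while `Update(true)` fails with an InvalidOperationException that the debugger can break on in source code.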
Alex
  • Thank you for pointing this out. However, an infinite recursion would lead to reproducible and consistent errors. However, I can do test runs of the game that run for hours without problems. Others crash after minutes (with everyone in the game doing exactly the same). Therefore I guess the problem is more tricky. But I am on it and I will post the results, soon. – ares_games Jan 08 '13 at 16:27
  • This is not necessarily the case; you could enter an infinite recursion only when a particular logic condition is met, meaning that you only hit the condition "sometimes", which would explain the unpredictable behaviour. Having said that, it sounds like it might be something else anyway... – Ray Jan 08 '13 at 18:41