
Our C# application calls MiniDumpWriteDump when an unhandled exception occurs.

I have received some crash dumps from users that I cannot open with SOS to see the exception that caused the crash.

The dump type we are taking is MiniDumpWithPrivateReadWriteMemory.

I have _NT_SYMBOL_PATH configured to use the Microsoft public symbol server, and when I debug this crash dump in WinDbg it automatically downloads the needed DLLs (the dump was taken on a machine with a different build of .NET 2, namely the one that ends with .3053).

When I run !Threads I get this output:

Failed to request ThreadStore

I have tried every technique I could find for debugging a dump taken on a machine whose CLR version differs from mine; none of them worked.

What can I do to debug these crashes?

Are we doing something wrong (taking the wrong kind of dump from the .NET process, etc.)?

EDIT:

Here's the result of ~*:

0:000> ~*
.  0  Id: 1338.258 Suspend: 0 Teb: 7ffdf000 Unfrozen Priority: 0
   1  Id: 1338.2a0 Suspend: 0 Teb: 7ffde000 Unfrozen Priority: 0
   2  Id: 1338.1fd4 Suspend: 0 Teb: 7ffdd000 Unfrozen Priority: 0
   3  Id: 1338.17e8 Suspend: 0 Teb: 7ffda000 Unfrozen Priority: 0
   4  Id: 1338.1148 Suspend: 0 Teb: 7ffd9000 Unfrozen Priority: 0
   5  Id: 1338.b1c Suspend: 0 Teb: 7ffd7000 Unfrozen Priority: 0
   6  Id: 1338.f94 Suspend: 0 Teb: 7ffd4000 Unfrozen Priority: 0
   7  Id: 1338.11b4 Suspend: 0 Teb: 7ff4f000 Unfrozen Priority: 0
   8  Id: 1338.1814 Suspend: 0 Teb: 7ff4e000 Unfrozen Priority: 0
   9  Id: 1338.1cc4 Suspend: 0 Teb: 7ffdb000 Unfrozen Priority: 0
  10  Id: 1338.1e48 Suspend: 0 Teb: 7ffd5000 Unfrozen Priority: 0
  11  Id: 1338.1a5c Suspend: 0 Teb: 7ff4c000 Unfrozen Priority: 0
  12  Id: 1338.1874 Suspend: 0 Teb: 7ff4b000 Unfrozen Priority: 0
  13  Id: 1338.1498 Suspend: 0 Teb: 7ff4a000 Unfrozen Priority: 0

Here's the result of !analyze -v:

analyze

lysergic-acid
  • What happens if you open the crash dump file in Visual Studio? – Philipp Schmid Aug 09 '11 at 14:39
  • Our application uses .NET 3.5 and VS2008, so the dump cannot be opened this way (that is only supported starting with .NET 4 and VS2010, as far as I know). – lysergic-acid Aug 09 '11 at 14:40
  • I don't know if that is your problem, but generally creating a minidump from within the crashing app itself is [unreliable](http://msdn.microsoft.com/en-us/library/ms680360%28VS.85%29.aspx) (see the Remarks section). – Christian.K Aug 09 '11 at 17:17
  • That works fine; only these specific dumps, which were taken on another OS/.NET version, are giving me a hard time. What are the alternatives to taking the dump from the same process? – lysergic-acid Aug 10 '11 at 05:16
  • @Christian Taking a process dump from within the same process can cause a deadlock, but generally speaking, if you managed to create the dump then you can assume that it's probably OK. – Justin Aug 10 '11 at 14:30
  • @liortal The alternative is to spawn a separate "watcher" process (when the process starts, well before the exception happens) that monitors the process and creates the dump. It's quite a lot of effort, and unless there is a requirement for high availability I'd probably just accept that your process may deadlock on an unhandled exception. – Justin Aug 10 '11 at 14:33
  • @Kragen Yes, that is what the MSDN page says, and I can't say that I have much experience here. What I know from UNIX is that once (say) your heap is corrupted, all bets are off. There is not a whole lot you can reliably do anymore. But a native problem doesn't seem to be the issue here, so it may indeed be OK. – Christian.K Aug 10 '11 at 18:26
  • I am not sure that a crashing process implies a corrupted heap (unless, of course, the crash occurred because of a corrupted heap, a scenario that the CLR itself should normally prevent). The UnhandledException event exists so that some code can execute before the process crashes. I find it hard to believe that your code would be allowed to execute while the heap is corrupted, or while some other issue could prevent taking a dump file. – lysergic-acid Aug 10 '11 at 18:34

2 Answers


WinDbg is probably loading the wrong version of the CLR debugging modules for the mscorwks.dll recorded in the dump (SOS needs the mscordacwks.dll that matches it). Try using .cordll -lp to tell WinDbg explicitly where to load the CLR debug modules from. See also this blog post: Issues Debugging Managed Code in WinDbg with SOS and PSSCOR2 (e.g. "Failed to request ThreadStore")

floyd73
  • Tried it, didn't help. I'm starting to think that the minidump type may be the cause. – lysergic-acid Aug 10 '11 at 16:48
  • I think you're right: MiniDumpWithPrivateReadWriteMemory alone may be insufficient to get the information that you want. – floyd73 Aug 10 '11 at 16:51
  • The information I want is the managed exception and, if possible, the call stacks at the time of the crash. – lysergic-acid Aug 10 '11 at 17:00
  • Did you try "!analyze -v"? This should give you details about the exception. And just out of curiosity, what does "~*" give you? It should list all your (native) threads, so you can check whether the thread information is there or not. – floyd73 Aug 11 '11 at 06:57
  • OK, so I think the good news is that you do have information about the running threads, but I still think that the loaded symbols for .NET are wrong (see "Frame IP not in any known module. Following frames may be wrong" in the call stack from the analyze command output). – floyd73 Aug 11 '11 at 08:10
  • (continued, pushed enter by mistake): So it seems that the exception comes from thread ID 258, which is thread 0 and is already the active thread, so "!clrstack" should now show you the managed call stack and hopefully the code location where the exception was thrown (that is, assuming that the loaded symbols are correct). – floyd73 Aug 11 '11 at 08:11
  • I wish !clrstack would work :( If it did, I wouldn't have raised this question here. – lysergic-acid Aug 11 '11 at 08:14
  • It doesn't work either? (You mentioned only !Threads in your post.) Well, then again, I think that you still have the wrong symbols. Do you have access to the machine with the target .NET Framework version, or to another one with the exact same version installed? You could then copy mscorwks and the related files from there and load that version explicitly in WinDbg. The "Frame IP not in any known module" message in your call stack strongly suggests mismatched or missing symbols. Did you check the loaded symbols with "lm"? – floyd73 Aug 11 '11 at 08:27
  • I had the MS public symbol server defined; it automatically downloads the PDBs and DLLs that are needed. I am not sure whether I also need to configure WinDbg to load them or whether it does that automatically, but nothing has worked and I am still not able to run !clrstack, !threads, or any other SOS command. – lysergic-acid Aug 11 '11 at 08:47
  • I think that the symbol server is not enough; WinDbg also needs to find the exact same mscorwks and SOS library versions as on the target machine. How are you loading the SOS extension? Did you try ".loadby sos.dll mscorwks"? And did you try the PSSCOR2 extension? I normally use it instead of SOS, since it normally has all I need (and more), I found that it is not so pedantic about the mscorwks version, and it also includes commands like !clrstack. – floyd73 Aug 11 '11 at 09:08
  • I think PSSCOR2 uses SOS behind the scenes (I may be wrong on this). I tried loading SOS both with .loadby and by loading SOS.dll directly. I took the version that matches mscorwks and loaded it, but no good. – lysergic-acid Aug 11 '11 at 10:23

You need to change the options you pass to 'MiniDumpWriteDump'. Make sure they include the options mentioned here: What is minimum MINIDUMP_TYPE set to dump native C++ process that hosts .net component to be able to use !clrstack in windbg
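
As a rough illustration of what changing the options could look like from C#, here is a minimal sketch. The flag values are the documented MINIDUMP_TYPE constants from dbghelp.h, but the `CrashDump` class, the `Write` helper, and the `ManagedFriendly` combination are hypothetical names introduced here. The combination is an assumption, not a list copied from the linked question, so check it against that answer; `MinidumpType.WithFullMemory` is the simplest choice if dump size is not a concern.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;

// Subset of the native MINIDUMP_TYPE enumeration (values from dbghelp.h).
[Flags]
public enum MinidumpType : uint
{
    Normal                     = 0x0000,
    WithDataSegs               = 0x0001,
    WithFullMemory             = 0x0002,
    WithHandleData             = 0x0004,
    WithUnloadedModules        = 0x0020,
    WithPrivateReadWriteMemory = 0x0200,
    WithFullMemoryInfo         = 0x0800,
    WithThreadInfo             = 0x1000
}

// Hypothetical helper class, not part of the original application.
public static class CrashDump
{
    // Assumed flag set for dumps that SOS should be able to read.
    // Verify against the linked question before relying on it.
    public const MinidumpType ManagedFriendly =
        MinidumpType.WithDataSegs |
        MinidumpType.WithHandleData |
        MinidumpType.WithUnloadedModules |
        MinidumpType.WithPrivateReadWriteMemory |
        MinidumpType.WithFullMemoryInfo |
        MinidumpType.WithThreadInfo;

    [DllImport("dbghelp.dll", SetLastError = true)]
    private static extern bool MiniDumpWriteDump(
        IntPtr hProcess, uint processId, SafeHandle hFile,
        MinidumpType dumpType, IntPtr exceptionParam,
        IntPtr userStreamParam, IntPtr callbackParam);

    // Writes a dump of the current process to the given path.
    // No exception information is passed (IntPtr.Zero), so the dump
    // carries no native exception record of its own.
    public static void Write(string path, MinidumpType type)
    {
        using (FileStream fs = new FileStream(path, FileMode.Create,
                                              FileAccess.ReadWrite, FileShare.None))
        using (Process process = Process.GetCurrentProcess())
        {
            bool ok = MiniDumpWriteDump(process.Handle, (uint)process.Id,
                                        fs.SafeFileHandle, type,
                                        IntPtr.Zero, IntPtr.Zero, IntPtr.Zero);
            if (!ok)
            {
                int error = Marshal.GetLastWin32Error();
                throw new InvalidOperationException(
                    "MiniDumpWriteDump failed, error 0x" + error.ToString("X8"));
            }
        }
    }
}
```

From an UnhandledException handler this would be called as, for example, `CrashDump.Write(dumpPath, CrashDump.ManagedFriendly)` or `CrashDump.Write(dumpPath, MinidumpType.WithFullMemory)`, where `dumpPath` is whatever location the application already uses for its dump files.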

Stanislav Berkov