I want to differentiate between crashing, hanging and a normal kill of an app. For example, we have to set registry keys for WER to create a crash dump, and the OS sends some signal to the process if anything happens. How do I handle all of this and create a library that would assist in bucketing an event as a crash, a hang or a simple kill? Is there
1 Answer
I want to differentiate between crashing, hanging and normal kill of an app?
You're missing the following options:
- app works normally
- app is about to crash, but maybe doesn't
And these two make it really hard to distinguish the states. In order to understand that, you need to know two things:
- exception dispatching
- how a crash dump is generated
Exception Dispatching
A crash is caused by an exception. But not all exceptions will cause a crash, because exceptions can be handled. Handling of an exception is typically done in a catch{} block.
So, imagine an exception occurs in your application. The following process begins:
- if a debugger is attached, ask the debugger whether it wants to react to the exception. This is the first chance for the debugger to do something.
- if the debugger did not want to react, check for a catch{} block which might want to react to the exception.
- if there was no catch{} block, check for a so-called "unhandled exception handler" which might want to react to the exception.
- if still nobody wanted to handle the exception, ask the debugger again. This is now the second chance for the debugger to do something.
- if the debugger doesn't do anything, the OS needs to handle the situation. If some WER settings are enabled, it might save a crash dump now. After that, it will terminate the process and free the resources that were allocated by the app.
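The order of these steps can be sketched as a small toy model. This is only an illustration of the sequence described above, not how Windows implements it; `ProcessState` and `dispatch_exception` are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ProcessState:
    # Hypothetical flags describing who is willing to handle the exception.
    debugger_attached: bool = False
    debugger_handles_first_chance: bool = False
    has_catch_block: bool = False
    has_unhandled_exception_handler: bool = False
    debugger_handles_second_chance: bool = False


def dispatch_exception(state: ProcessState) -> str:
    """Return which party ends up dealing with the exception."""
    # 1. First chance: the debugger is asked before any handler runs.
    if state.debugger_attached and state.debugger_handles_first_chance:
        return "debugger (first chance)"
    # 2. A matching catch{} block in the application.
    if state.has_catch_block:
        return "catch block"
    # 3. The unhandled exception handler, if one is registered.
    if state.has_unhandled_exception_handler:
        return "unhandled exception handler"
    # 4. Second chance: the debugger is asked again.
    if state.debugger_attached and state.debugger_handles_second_chance:
        return "debugger (second chance)"
    # 5. Nobody handled it: the OS takes over.
    return "OS (WER may write a dump, then the process is terminated)"


print(dispatch_exception(ProcessState(has_catch_block=True)))
print(dispatch_exception(ProcessState()))
```

Only the last outcome is what you would call a crash; all the others mean that somebody dealt with the exception first.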
The terms "first chance exception" and "second chance exception" are important.
WinDbg tells you about this:
0:006> g
(2db0.2908): CLR exception - code e0434352 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=0098ebe0 ebx=00000005 ecx=00000005 edx=00000000 esi=0098eca4 edi=00000001
eip=76c44402 esp=0098ebe0 ebp=0098ec3c iopl=0 nv up ei pl nz ac po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000212
KERNELBASE!RaiseException+0x62:
76c44402 8b4c2454 mov ecx,dword ptr [esp+54h] ss:002b:0098ec34=5d02fd68
0:000>
As you can see, this exception is a first chance exception. WinDbg says:
First chance exceptions are reported before any exception handling.
This means: the debugger has reacted before any catch{} block was run. And:
This exception may be expected and handled.
This means: the code may have a catch{} block, which does something useful so that the application might not crash.
A second chance exception looks like this:
0:000> g
(3e34.36c0): C++ EH exception - code e06d7363 (first chance)
(3e34.36c0): C++ EH exception - code e06d7363 (!!! second chance !!!)
eax=00daf940 ebx=00000000 ecx=00000003 edx=00000000 esi=00000001 edi=00000000
eip=76c44402 esp=00daf940 ebp=00daf998 iopl=0 nv up ei pl nz ac po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000212
KERNELBASE!RaiseException+0x62:
76c44402 8b4c2454 mov ecx,dword ptr [esp+54h] ss:002b:00daf994=0754642c
As you can see, there was a first chance exception before, but I instructed the debugger not to do anything at this point. The application had neither a catch{} block nor an unhandled exception handler. Without a debugger, this application would crash and terminate.
How crash dumps are created
Tools create crash dumps in much the same way as a debugger would:
- Attach a debugger to the process
- Start a new thread
- In that thread, force a known exception
- When the debugger is informed about the first chance exception, create the crash dump file
The exception that is forced is typically an INT 3 instruction, which is a debugging breakpoint with exception code 0x80000003.
Identifying a crash
You have a crash when there is an exception and the exception cannot be continued.
In WinDbg you can use .exr -1 to get information about the last exception:
0:000> .exr -1
ExceptionAddress: 76c44402 (KERNELBASE!RaiseException+0x00000062)
ExceptionCode: e06d7363 (C++ EH exception)
ExceptionFlags: 00000001
With ExceptionFlags being 1, the exception is non-continuable.
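If you want to automate this check, a minimal sketch could parse the text printed by .exr -1 and test the lowest bit of ExceptionFlags. It assumes you have captured the WinDbg output as a string; `parse_exr` and `is_noncontinuable` are hypothetical helper names, and the sample text is the output shown above.

```python
import re

# Value of EXCEPTION_NONCONTINUABLE in the Windows SDK headers.
EXCEPTION_NONCONTINUABLE = 0x1


def parse_exr(text):
    """Extract ExceptionCode and ExceptionFlags (hex) from `.exr -1` output."""
    fields = {}
    for name in ("ExceptionCode", "ExceptionFlags"):
        match = re.search(name + r":\s*([0-9a-fA-F]+)", text)
        if match:
            fields[name] = int(match.group(1), 16)
    return fields


def is_noncontinuable(text):
    """True if the non-continuable bit is set in ExceptionFlags."""
    flags = parse_exr(text).get("ExceptionFlags", 0)
    return bool(flags & EXCEPTION_NONCONTINUABLE)


sample = """\
ExceptionAddress: 76c44402 (KERNELBASE!RaiseException+0x00000062)
ExceptionCode: e06d7363 (C++ EH exception)
ExceptionFlags: 00000001
"""
print(is_noncontinuable(sample))  # True
```

Testing the bit rather than comparing against 1 is a design choice of this sketch: ExceptionFlags is a bit field, so other bits may be set as well.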
Identifying a potential crash (but maybe it doesn't)
As before, but ExceptionFlags is 0.
Identifying a kill
This is not easily possible. The OS will terminate the process. There's no exception. You'll typically not have a crash dump of this situation.
However, there are tools that can stop when a process terminates. But there's not much to analyze then. You would identify such a situation by having a look at the call stack:
0:000> k L1
# Child-SP RetAddr Call Site
00 0000003a`d2d3f968 00007fff`3b16a938 ntdll!NtTerminateProcess+0x14
Typically, there is just one thread left:
0:000> ~
. 0 Id: 2078.34ec Suspend: 0 Teb: 0000003a`d2e03000 Unfrozen
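As a rough sketch, the two observations above (NtTerminateProcess on top of the stack, at most one thread left) can be turned into a heuristic. It assumes the output of the k and ~ commands has been captured as text; the function names are hypothetical, and since both observations are only "typical", expect false negatives.

```python
import re


def top_frame_symbol(k_output):
    """Return module!function of the first frame in `k` output, without offset."""
    for line in k_output.splitlines():
        # Frame lines start with a two-digit hex frame number; the header starts with '#'.
        match = re.match(r"\s*[0-9a-fA-F]{2}\s.*?(\S+!\S+)", line)
        if match:
            return re.sub(r"\+0x[0-9a-fA-F]+$", "", match.group(1))
    return None


def count_threads(tilde_output):
    """Count the thread lines in `~` output (each one contains ' Id: ')."""
    return sum(1 for line in tilde_output.splitlines() if " Id: " in line)


def looks_like_termination(k_output, tilde_output):
    """Heuristic: process is in NtTerminateProcess and at most one thread is left."""
    return (top_frame_symbol(k_output) == "ntdll!NtTerminateProcess"
            and count_threads(tilde_output) <= 1)


k_sample = """\
 # Child-SP          RetAddr           Call Site
00 0000003a`d2d3f968 00007fff`3b16a938 ntdll!NtTerminateProcess+0x14
"""
tilde_sample = ".  0  Id: 2078.34ec Suspend: 0 Teb: 0000003a`d2e03000 Unfrozen\n"
print(looks_like_termination(k_sample, tilde_sample))  # True
```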
App works normally
In this case, the exception code will be 0x80000003, because a breakpoint was injected in order to generate the crash dump.
0:004> .exr -1
ExceptionAddress: 77964120 (ntdll!DbgBreakPoint)
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
NumberParameters: 1
Parameter[0]: 00000000
From the call stack, you can typically see that it was injected by a debugger:
0:004> k L2
# ChildEBP RetAddr
00 0666fd34 7799ace9 ntdll!DbgBreakPoint
01 0666fd64 754c6359 ntdll!DbgUiRemoteBreakin+0x39
The main thread is typically doing nothing, i.e. it's waiting for user input:
0:004> ~0k L1
# ChildEBP RetAddr
00 008fef50 6437a188 win32u!NtUserWaitMessage+0xc
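A sketch of how a classifier could recognize such a "dump on demand": the exception code is the breakpoint code and the stack of the current thread contains DbgUiRemoteBreakin. It assumes the .exr -1 and k output are available as text; `is_injected_breakpoint` is a hypothetical name.

```python
import re

# Exception code of a breakpoint, as shown by `.exr -1`.
STATUS_BREAKPOINT = 0x80000003


def is_injected_breakpoint(exr_output, k_output):
    """True if the last exception is a breakpoint raised via DbgUiRemoteBreakin."""
    match = re.search(r"ExceptionCode:\s*([0-9a-fA-F]+)", exr_output)
    if not match or int(match.group(1), 16) != STATUS_BREAKPOINT:
        return False
    return "ntdll!DbgUiRemoteBreakin" in k_output


exr_sample = """\
ExceptionAddress: 77964120 (ntdll!DbgBreakPoint)
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
"""
k_sample = """\
00 0666fd34 7799ace9 ntdll!DbgBreakPoint
01 0666fd64 754c6359 ntdll!DbgUiRemoteBreakin+0x39
"""
print(is_injected_breakpoint(exr_sample, k_sample))  # True
```

A positive result only tells you that the dump was taken on request. It does not tell you whether the app was healthy at that moment.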
App is hanging
A hang looks very much like a normal running app, because the process of generating the crash dump does the same:
0:004> .exr -1
ExceptionAddress: 77964120 (ntdll!DbgBreakPoint)
ExceptionCode: 80000003 (Break instruction exception)
ExceptionFlags: 00000000
NumberParameters: 1
Parameter[0]: 00000000
0:004> k L2
# ChildEBP RetAddr
00 0666fd34 7799ace9 ntdll!DbgBreakPoint
01 0666fd64 754c6359 ntdll!DbgUiRemoteBreakin+0x39
There are two types of hang: a high CPU hang (the app is busy, maybe in an endless loop) or a low CPU hang (the app has deadlocked).
A high CPU hang can be identified by its call stack: it will typically not have a WaitForSingleObject() or WaitForMultipleObjects() call on top of the stack.
A low CPU hang may look exactly like a working app, because it is waiting as well. The only difference is: the working app is waiting for user input (which may occur soon) and the hanging app is waiting for something else (which it may never get, hence the deadlock).
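A very rough sketch of the call stack check: look at the top frame of a thread and see whether it is one of the wait functions mentioned in this answer. The list of name fragments is an assumption of this sketch and certainly incomplete, and `describe_top_frame` is a hypothetical name.

```python
# Name fragments taken from the functions mentioned in this answer.
# Assumption: a substring match is good enough, so that Nt* variants match too.
WAIT_HINTS = ("WaitForSingleObject", "WaitForMultipleObjects", "WaitMessage")


def describe_top_frame(symbol):
    """Classify a top-of-stack symbol such as 'win32u!NtUserWaitMessage'."""
    if any(hint in symbol for hint in WAIT_HINTS):
        # Idle app or low CPU hang: not distinguishable from this frame alone.
        return "waiting"
    # Not waiting: the thread is doing work, possibly a high CPU hang.
    return "busy"


print(describe_top_frame("win32u!NtUserWaitMessage"))  # waiting
```

Note that "waiting" covers both the working app and the deadlock. To tell them apart you need to find out what the thread is waiting for, which this sketch does not attempt.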
The reality
The reality can be much more complex, depending on whether .NET is involved, whether you have multiple UI threads, etc. But IMHO, in a straightforward app, this approach should work in ~70% of the cases.

-
Thank you so much, it was a great help. But when an app runs normally or hangs, how do I get a dump, and how do I analyze it? And the analysis output you added, how did you get it? – Prachi Jun 11 '20 at 19:41
-
And regarding the bucketing I asked about: I actually have to create a standalone library that would assist in correctly bucketing an existing event as a crash, a hang or a normal kill. I think the OS gives some signals, and from those we could also tell whether it is a kill or a hang. – Prachi Jun 11 '20 at 19:44
-
@Prachi: The term "crash dump" is a bit misleading. It's just the saved state of a program, whether it crashed or not. A lot of tools can generate a crash dump just when you tell them to do it. For example, WinDbg will write a crash dump with the `.dump` command, even if there is no crash. Task Manager can do it, ProcDump, Visual Studio, ... See also https://stackoverflow.com/questions/24874027/how-do-i-take-a-good-crash-dump-for-net – Thomas Weller Jun 11 '20 at 19:44
-
The only signals I can think of are `WM_CLOSE` (a window message, which can be handled in a message loop) and the exception (which can be handled by a catch-block). If the app is killed, there's no signal. – Thomas Weller Jun 11 '20 at 19:48
-
@Prachi: if you want to implement bucketing yourself, there are a lot of options. There is [PyKD](https://githomelab.ru/pykd) (Python), [ClrMD](https://github.com/microsoft/clrmd) (.NET). You can also implement [in C++](https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/data-model-cpp-overview). I can't tell what the best is for you. I recommend starting with some clean crashes, i.e. little example programs that do nothing else but crash. Then write some software to classify them as crashes. – Thomas Weller Jun 11 '20 at 20:09
-
Then, write a little software that causes a deadlock. Adapt your classifier to distinguish between crashes and deadlocks. Next, take a few crash dumps when nothing happened and the UI is just waiting for input. Make your classifier distinguish all three. Next, create some crash dumps at first chance exceptions (e.g. using procdump). – Thomas Weller Jun 11 '20 at 20:11
-
Also get some crash dumps of other applications. Get crash dumps of blue screens (kernel mode). And adapt your classifier to bucket as "not our problem". Until here, this should take ~1 week for someone with a bit of expertise. – Thomas Weller Jun 11 '20 at 20:13
-
Next, run the classifier on the first 10 or 20 crash dumps you have. Compare the results to your analysis with WinDbg. If the classifier is wrong, write more code to handle these cases. I would consider 1 day per crash dump, so this takes another 2 weeks. Finally, run the classifier over all crash dumps you have. Take a sample of ~20 crash dumps that you check for correctness. – Thomas Weller Jun 11 '20 at 20:16
-
Can you please share the official doc from which you deduced this differentiation? – Prachi Jun 12 '20 at 12:03
-
@Prachi: sorry, I don't know any. It's the knowledge I got in the last 10 years of working with WinDbg. – Thomas Weller Jun 12 '20 at 14:15
-
Sir, is there any particular term, like an error code, that is the same for all kinds of exceptions, so that by looking at it I can say whether this is a crash or a hang? Here you explained the difference between first and second chance exceptions and the flag, but for a hang the flag is also zero. If I have a collection of hang and crash dumps, how do I identify which ones are hangs and which are crashes? – Prachi Jun 12 '20 at 17:53