I know, how to get stacktraces of all existing threads.
Just gonna give a bit of background here.
In Windows, threads are an OS concept. They're the unit of scheduling. So there's a definite list of threads somewhere, since that's what the OS scheduler uses.
Furthermore, each thread has a call stack. This dates back to the early days of computer programming. However, the purpose of the call stack is often misunderstood. The call stack is used as a sequence of return locations. When a method returns, it pops its call stack arguments off the stack and also the return location, and then jumps to the return location.
This is important to remember because the call stack does not represent how code got into a situation; it represents where code is going it returns from the current method. The call stack is where the code is going to, not where it came from. That is the reason the call stack exists: to direct the future code, not to assist diagnostics. Now, it does turn out that the call stack does have useful information on it for diagnostics since it gives an indication of where the code came from as well as where it's going, so that's why call stacks are on exceptions and are commonly used for diagnostics. But that's not the actual reason why the call stack exists; it's just a happy circumstance.
Now, enter asynchronous code.
In asynchronous code, the call stack still represents where the code is returning to (just like all call stacks). But in asynchronous code, the call stack no longer represents where the code came from. In the synchronous world, these two things were the same, and the call stack (which is necessary) can also be used to answer the question of "how did this code get here?". In the asynchronous world, the call stack is still necessary but only answers the question "where is this code going?" and cannot answer the question "how did this code get here?". To answer the "how did this code get here?" question you need a causality chain.
Furthermore, call stacks are necessary for correct operation (in both the synchronous and asynchronous worlds), and so the compiler/runtime ensures they exist. Causality chains are not necessary, and they are not provided out of the box. In the synchronous world, the call stack just happens to be a causality chain, which is nice, but that happy circumstance doesn't carry over to the asynchronous world.
When a thread is released by await, the stacktrace and all objects along the call stack are stored somewhere.
No; this is not the case. This would be true if async
used fibers, but it doesn't. There is no call stack saved anywhere.
Because otherwise the continuation thread would lose context.
When an await
resumes, it only needs sufficient context to continue executing its own method, and potentially completing the method. So, there is an async
state machine structure that is boxed and placed on the heap; this structure contains references to local variables (including this
and method arguments). But that is all that is necessary for program correctness; a call stack is not necessary and so it is not stored.
You can easily see this yourself by setting a breakpoint after an await
and observing the call stack. You'll see that the call stack is gone after the first await
yields. Or - more properly - the call stack represents the code that is continuing the async
method, not the code that originally started the async
method.
At the implementation level, async
/await
is more like callbacks than anything else. When a method hits an await
, it sticks its state machine structure on the heap (if it hasn't already) and wires up a callback. That callback is triggered (invoked directly) when the task completes, and that continues executing the async
method. When that async
method completes, it completes its tasks, and anything await
ing those tasks are then invoked to continue executing. So, if a whole sequence of tasks complete, you actually end up with a call stack that is an inversion of the causality stack.
I would like to dump all stacktraces, even those not having its thread currently, so that I can find the deadlock.
So, there's a couple of problems here. First, there is no global list of all Task
objects (or more generally, tasklike objects). And that would be a difficult thing to get.
Second, for each asynchronous method/task, there's no causality chain anyway. The compiler doesn't generate one because it's not necessary for correct operation.
That's not to say either of these problems are insurmountable - just difficult. I've done some work on the causality chain problem with my AsyncDiagnostics library. It's rather old at this point but should upgrade pretty easily to .NET Core. It uses PostSharp to modify the compiler-generated code for each method and manually track causality chains.
However, the goal of AsyncDiagnotics is to get causality chains onto exceptions. Getting a list of all tasklikes and associating causality chains with each one is another problem, likely requiring the use of an attached profiler. I'm aware of other companies who have wanted this solution, but none of them have dedicated the time necessary to create one; all of them have found it more efficient to implement code reviews, auditing, and developer training.