Unless you have extensive experience with SynchronizationContext, TaskScheduler, TaskFactory, async/await, and multiprocessing, there is no need to read further.
I've spent the last couple of years migrating a large, mostly-monolithic C# project to .NET 5/6. The project started on .NET 2.0 fifteen years ago and has moved up through .NET 4.7.2, which is now effectively deprecated; the migration also positions us to eventually host on Linux, all while keeping the system running in production 24/7 with over a million users.
In the process, I've come to appreciate the elegance of the concept of async/await, but have also been very frustrated by the places where it does not behave sensibly.
I've come to terms with most of the quirks, but through this process I've concluded that .NET Core (.NET 6 at this point) appears to be a framework primarily for user-interactive software and not so much for other purposes such as queue processing.
It seems that async/await in .NET Core was designed primarily to serve applications built around the old Windows message-pump UI and ASP.NET's one-stream-of-execution-per-request model, to the exclusion of other architectures.
Let me elaborate.
Before these updates, much of our code did backend processing using a custom thread pool.
Given the nature of the project (half front-end web API services and half headless backend queue processing) and the need to optimize costs, we strive to hover near 97% CPU utilization on the backend processing servers, and we're leveraging parallel processing for some front-end operations as well.
We've found that for any system, if you get much higher than 97% CPU utilization, you lose the ability to monitor what's going on on the server, as well as to detect infinite loops when they inevitably happen.
We had algorithms that reliably achieved approximately this level of utilization.
There are several reasons we made a custom thread pool:
- When many operations get queued (which is generally needed to keep the CPU in the optimal range), the default ThreadPool, because it is not FIFO, inevitably starves some long-queued operations in favor of more recently queued ones, causing timeouts and other difficult problems. Compensating for this is wasteful and tedious when a FIFO processing engine avoids the timeouts simply by processing work in the far more sensible FIFO order.
- It's extremely difficult (if not impossible) to determine when and how much work to queue to the default thread pool.
The overall system CPU utilization is low whether it is underloaded or overloaded, and there doesn't seem to be any combination of properties to indicate which of those states it is in (a combination of PendingWorkItemCount and ThreadCount almost does it, but in practice, I haven't been able to come up with an algorithm that's reliable with arbitrary workloads).
As a result, achieving sustained 95%+ CPU utilization using the default thread pool with anything but toy code has eluded multiple prolonged efforts over many years. With a custom thread pool, we can easily see whether we have more capacity or not, combine that with the current CPU utilization, and use that to decide whether or not to start more work.
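For concreteness, the counters in question can be sampled like this on .NET Core 3.0+ (a minimal sketch; the interpretation in the comments reflects my experience, not documented behavior):

```csharp
using System;
using System.Threading;

class PoolMonitor
{
    static void Main()
    {
        // On .NET Core 3.0+ these two counters are the closest thing to a
        // load signal the default pool exposes.
        long pending = ThreadPool.PendingWorkItemCount; // items queued but not yet started
        int threads = ThreadPool.ThreadCount;           // live pool threads
        ThreadPool.GetAvailableThreads(out int workers, out int io);

        Console.WriteLine($"pending={pending} threads={threads} availableWorkers={workers}");

        // The ambiguity: pending == 0 can mean "idle" or "every item is
        // in-flight and the pool is about to fall behind"; a large pending
        // count with low CPU can mean blocked threads rather than overload.
    }
}
```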
Enter async.
The concept is wonderful. The promise of keeping the logical code flow mostly as it was while no longer wasting threads waiting on IO, thereby greatly reducing memory usage and gaining some performance as well, has driven everyone in this direction, so much so that many of the libraries we consume no longer have non-async versions to consume.
I have found no expert advice anywhere that recommends calling async code from non-async code under any circumstances.
So the process of asyncification began, with the async zombie virus propagating through the code from the bottom up, with temporary wrappers to fool the asyncified code into running synchronously until we got all the way to the top.
Inevitably we reached the code that used the custom thread pool, and so we attempted to convert it to async/await as well.
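For reference, the temporary wrappers were variations on the usual sync-over-async shim (a sketch; RunSync is my name for it, and it deliberately hops to the pool so the blocking wait can't deadlock on a captured context):

```csharp
using System;
using System.Threading.Tasks;

static class AsyncBridge
{
    // Temporary shim: call async code from sync code until the caller
    // itself has been asyncified. Task.Run escapes any ambient
    // SynchronizationContext, so blocking on the result cannot deadlock
    // against a context that the async code would try to post back to.
    public static T RunSync<T>(Func<Task<T>> asyncFunc) =>
        Task.Run(asyncFunc).GetAwaiter().GetResult();

    public static void RunSync(Func<Task> asyncFunc) =>
        Task.Run(asyncFunc).GetAwaiter().GetResult();
}

class Demo
{
    static async Task<int> ComputeAsync()
    {
        await Task.Delay(10);
        return 42;
    }

    static void Main()
    {
        int result = AsyncBridge.RunSync(ComputeAsync);
        Console.WriteLine(result); // → 42
    }
}
```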
However, using our own custom thread pool does not play well with async/await (there are numerous SO and blog posts about this).
As a result, I decided to write my own SynchronizationContext/TaskScheduler that used our custom thread pool.
There is very little documentation on how to do this properly (and none at all for many things), and I've spent months researching and implementing,
reading blogs and SO posts by Stephen Toub and Stephen Cleary, as well as combing through the reference source.
The implementation I've ended up with is much too large to include here.
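The skeleton, though, looks roughly like this (MyThreadPool.Enqueue stands in for our pool's queueing call and runs work inline here just to keep the sketch self-contained; Send, error handling, and a real inlining policy are omitted):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for our custom FIFO pool; a real pool would hand the work
// to its own worker threads instead of running it inline.
static class MyThreadPool
{
    public static void Enqueue(Action work) => work();
}

sealed class MyTaskScheduler : TaskScheduler
{
    protected override void QueueTask(Task task) =>
        MyThreadPool.Enqueue(() => TryExecuteTask(task));

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) =>
        TryExecuteTask(task); // simplistic: always allow inlining

    protected override IEnumerable<Task> GetScheduledTasks() =>
        Array.Empty<Task>(); // debugger support only
}

sealed class MySynchronizationContext : SynchronizationContext
{
    private readonly MyTaskScheduler _scheduler;
    public MySynchronizationContext(MyTaskScheduler scheduler) => _scheduler = scheduler;

    // Continuations posted to this context become tasks on our scheduler.
    public override void Post(SendOrPostCallback d, object state) =>
        Task.Factory.StartNew(() => d(state), CancellationToken.None,
            TaskCreationOptions.None, _scheduler);
}

class Demo
{
    static void Main()
    {
        var scheduler = new MyTaskScheduler();
        Task<int> t = Task.Factory.StartNew(() => 7, CancellationToken.None,
            TaskCreationOptions.None, scheduler);
        Console.WriteLine(t.Result); // → 7
    }
}
```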
It mostly works, but processing still seemed to get posted to the default thread pool, causing all of the overloading and underload/overload ambiguity described above.
I combed through the project looking for patterns like Task t = MyAsyncCode(); and new Task(() => func()) that run code on the default TaskScheduler with no way to override that behavior. After eliminating all of those, problems with code running on the default TaskScheduler remained.
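For reference, these are the shapes I hunted for, and the explicit-scheduler replacement (a sketch; ConcurrentExclusiveSchedulerPair's exclusive scheduler stands in here for our custom one):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Demo
{
    static void Work() => Console.WriteLine("ran");

    static void Main()
    {
        // Stand-in for a custom scheduler.
        TaskScheduler customScheduler =
            new ConcurrentExclusiveSchedulerPair().ExclusiveScheduler;

        // Problematic patterns -- both implicitly bind to TaskScheduler.Default:
        //   Task t = MyAsyncCode();          // hot async invocation
        //   var t2 = new Task(() => Work()); // cold task; t2.Start() -> Default

        // Explicit alternative: pin the delegate to the chosen scheduler.
        Task pinned = Task.Factory.StartNew(Work, CancellationToken.None,
            TaskCreationOptions.DenyChildAttach, customScheduler);
        pinned.Wait();
    }
}
```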
I eventually tracked these issues down to the fact that using "await" on any significant code always switches processing back to the default task scheduler.
This is unacceptable due to the issues mentioned above.
Eventually, I came across this SO post: Understanding the behavior of TaskScheduler.Current.
After reading that and looking at the reference source for Task/TaskScheduler, etc., I see why the code behaves the way it does, but I can't understand how this behavior makes sense in any context outside the twisted world of ASP.NET and UI code. Does it really make sense for async work to completely ignore the 'current synchronization context' to which work could be queued and instead queue the work somewhere else?
This brings me to my question: Is there any way to create a custom TaskScheduler that actually gets used for all the processing of an async function? If not, it seems like a gaping hole in the system.
Here is an example of what I'd like to get to run:
public async Task SOSample()
{
    using MyTaskScheduler scheduler = MyTaskScheduler.Start();
    MySynchronizationContext context = new(scheduler);
    SynchronizationContext? oldContext = SynchronizationContext.Current;
    try
    {
        SynchronizationContext.SetSynchronizationContext(context);
        // StartNew with an async delegate returns Task<Task>; the outer
        // await yields the inner task, which we then await separately.
        Task unwrapped = await Task.Factory.StartNew(
            () => VerifyTaskSchedulerRemainsCustom(), CancellationToken.None,
            TaskCreationOptions.None, scheduler);
        await unwrapped;
    }
    finally
    {
        SynchronizationContext.SetSynchronizationContext(oldContext);
    }
}
private async Task VerifyTaskSchedulerRemainsCustom()
{
    Assert.IsFalse(ReferenceEquals(TaskScheduler.Current, TaskScheduler.Default));
    await Task.Yield();
    Assert.IsFalse(ReferenceEquals(TaskScheduler.Current, TaskScheduler.Default));
    await Task.Delay(100).ConfigureAwait(true);
    Assert.IsFalse(ReferenceEquals(TaskScheduler.Current, TaskScheduler.Default));
    // ... more arbitrary async processing
    Assert.IsFalse(ReferenceEquals(TaskScheduler.Current, TaskScheduler.Default));
}
Alternatively, is there a way to determine, without a debugger, whether the default thread pool is underloaded or overloaded, and what overloaded it (crucial for debugging problems in production)? Switching back would be a ton of work, but is preferable in some ways.