I'm programming with Tasks and await/async. I assumed that the multithreading works like it does in NodeJS or Python, that is, it doesn't, everything just runs on the same thread. But I've been trying to learn how Tasks actually get executed, and my understanding is that they're executed by TaskScheduler.Default, whose implementation is hidden but can be expected to use a ThreadPool.

Should I be programming as if all my Tasks can run in any thread?

The extent of my asynchronous programming is fairly lightweight CPU work consisting of several infinite loops that do work and then await Task.Delay for several seconds. Right now the only shared resource is an int that increments every time I write a network message, but in the future I expect my tasks will be sharing Dictionaries and Lists.

I also have a network Task that connects to a TCP server and reads messages using a Task I implemented on top of BeginRead+EndRead. The Read function is called by an infinite loop that reads a message, processes it, then reads a new message.

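        // APM callback: completes the TaskCompletionSource with the number of bytes read.
        void OnRead(IAsyncResult result)
        {
            var pair = (Tuple<TaskCompletionSource<int>, NetworkStream>)result.AsyncState;
            try
            {
                int count = pair.Item2.EndRead(result);
                pair.Item1.SetResult(count);
            }
            catch (Exception ex)
            {
                // Surface read failures to the awaiting caller instead of losing them.
                pair.Item1.SetException(ex);
            }
        }

        // Reads exactly `size` bytes from the stream.
        async Task<byte[]> Read(NetworkStream stream, uint size)
        {
            var result = new byte[size];
            var count = 0;
            while (count < size)
            {
                var tcs = new TaskCompletionSource<int>();
                stream.BeginRead(result, count, result.Length - count, OnRead, Tuple.Create(tcs, stream));
                var read = await tcs.Task;
                if (read == 0)
                {
                    // EndRead returns 0 once the remote side has closed the connection.
                    throw new System.IO.EndOfStreamException();
                }
                count += read;
            }
            return result;
        }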

I write to the NetworkStream using synchronous writes.

user2690730
  • Can you be more specific about what you want to do? There are ways to work around things, but it depends heavily on your application. And the short answer to your question is: yes. – Tolga Evcimen Dec 26 '19 at 18:07
  • There are two types of tasks. Tasks that offload CPU-bound work in background threads, and promise-style tasks for facilitating asynchronicity. The `TaskScheduler` is relevant only for the first type. What kind of tasks are you intending to implement yourself? – Theodor Zoulias Dec 26 '19 at 18:15
  • @TolgaEvcimen, I've added more detail. – user2690730 Dec 26 '19 at 18:26
  • @TheodorZoulias, The latter. One task is network bound and all the others use Task.Delay in loops to do work periodically. – user2690730 Dec 26 '19 at 18:27
  • Looks like you should be more concerned with thread safety than with thread-dispatching behavior. For your use case, if I understood correctly, I'd suggest using timers instead of infinite loops, thread-safe data structures (e.g. https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent?view=netframework-4.8), and locks where necessary. – Tolga Evcimen Dec 26 '19 at 18:33
  • Why do you use BeginRead/EndRead and TaskCompletionSource instead of directly using ReadAsync? This would greatly simplify your code, as you would no longer need the OnRead callback or have to declare and access the TaskCompletionSource. – ckuri Dec 26 '19 at 18:59
  • I had the same concerns a while ago. You may take a look at this question: [Asynchronous code, shared variables, thread-pool threads and thread safety](https://stackoverflow.com/questions/58264302/asynchronous-code-shared-variables-thread-pool-threads-and-thread-safety). TL;DR it seems that it is safe, and synchronization is redundant. – Theodor Zoulias Dec 26 '19 at 19:57
  • It seems that you've mixed the old `APM` model (`IAsyncResult`) with the newer TPL – Pavel Anikhouski Dec 27 '19 at 08:47
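
For reference, the ReadAsync approach suggested in the comments could look roughly like the sketch below. The helper name `ReadBytesAsync` is made up for illustration, and a `MemoryStream` stands in for the `NetworkStream` so the example runs without a server; `Stream.ReadAsync(byte[], int, int)` is the standard overload.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

class ReadAsyncDemo
{
    // Hypothetical helper: reads exactly `size` bytes, or throws if the stream ends first.
    static async Task<byte[]> ReadBytesAsync(Stream stream, int size)
    {
        var buffer = new byte[size];
        int count = 0;
        while (count < size)
        {
            // ReadAsync may return fewer bytes than requested, so keep looping.
            int read = await stream.ReadAsync(buffer, count, size - count);
            if (read == 0)
                throw new EndOfStreamException(); // stream closed before `size` bytes arrived
            count += read;
        }
        return buffer;
    }

    static async Task Main()
    {
        // MemoryStream is only a stand-in for the NetworkStream in the question.
        using (var stream = new MemoryStream(new byte[] { 1, 2, 3, 4 }))
        {
            byte[] data = await ReadBytesAsync(stream, 3);
            Console.WriteLine(string.Join(",", data)); // 1,2,3
        }
    }
}
```

No callback and no TaskCompletionSource are needed in this version, because ReadAsync already returns a promise-style task.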

1 Answer

> I assumed that the multithreading works like it does in NodeJS or Python, that is, it doesn't, everything just runs on the same thread. But I've been trying to learn how Tasks actually get executed, and my understanding is that they're executed by TaskScheduler.Default, whose implementation is hidden but can be expected to use a ThreadPool.

Not exactly.

First, a Task in .NET can be one of two completely different things. Delegate Tasks represent code that can run on some thread, using a TaskScheduler to determine where and how they run. Delegate Tasks were introduced with the original Task Parallel Library and are almost never used with asynchronous code. The other kind of Task is the Promise Task. These are much more similar to a Promise in JavaScript: they can represent anything. A Promise Task is just an object that is either "not finished yet", "finished with a result", or "finished with an error". Here's a contrast of the different state diagrams for the different kinds of tasks.
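
A minimal sketch of the two kinds, assuming a plain console app: `Task.Run` creates a Delegate Task, while `TaskCompletionSource<int>` hands out a Promise Task that has no code attached. The class name `TaskKindsDemo` is only for illustration.

```csharp
using System;
using System.Threading.Tasks;

class TaskKindsDemo
{
    static async Task Main()
    {
        // Delegate Task: wraps code, which Task.Run queues to the thread pool.
        Task<int> delegateTask = Task.Run(() => 6 * 7);

        // Promise Task: no code attached; it completes only when something signals it.
        var tcs = new TaskCompletionSource<int>();
        Task<int> promiseTask = tcs.Task;
        Console.WriteLine(promiseTask.IsCompleted); // False: nothing is "running" it

        tcs.SetResult(42); // complete the promise from the outside

        Console.WriteLine(await delegateTask); // 42
        Console.WriteLine(await promiseTask);  // 42
    }
}
```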

So, the first thing to recognize is that just like you don't "execute a Promise" in JavaScript, you don't "execute a (Promise) Task" in .NET. Asking what thread a Promise Task runs on doesn't make sense, since it doesn't run anywhere.

However, both JS and C# have an async/await language construct that allows you to write more natural code to control promises. When the async method completes, the promise is completed; if the async method throws, the promise is faulted.

So the question then becomes: where does the code run that controls this promise?

In the JavaScript world, the answer is obvious: there is only one thread, so that is where the code runs. In the .NET world, the answer is a bit more complex. My async intro gives the core concepts: every async method begins executing synchronously, on the calling thread, just like any other method. When it yields due to an await, it will capture its "context". Then, when that async method is ready to resume after the await, it resumes within that "context".

The "context" is SynchronizationContext.Current, unless it is null, in which case the context is TaskScheduler.Current. In modern code, the "context" is usually either a GUI thread context (which always resumes on the GUI thread), or the thread pool context (which resumes on any available thread pool thread).
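
One way to observe this, as a rough sketch for a console app (which normally has no SynchronizationContext): print the managed thread ID before and after an await. The exact IDs vary from run to run, so no particular output is implied.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class ContextDemo
{
    static async Task Main()
    {
        // In a console app this is expected to be null, so the thread pool context is used.
        Console.WriteLine($"has SynchronizationContext: {SynchronizationContext.Current != null}");

        Console.WriteLine($"before await: thread {Environment.CurrentManagedThreadId}");

        await Task.Delay(100);

        // With no context captured, the continuation runs on some thread pool thread,
        // which may or may not be the thread that started the method.
        Console.WriteLine($"after await: thread {Environment.CurrentManagedThreadId}, " +
                          $"thread pool thread: {Thread.CurrentThread.IsThreadPoolThread}");
    }
}
```

Running the same code from a GUI event handler would instead resume on the GUI thread, because a GUI SynchronizationContext is captured at the await.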

> Should I be programming as if all my Tasks can run in any thread?

The code in your async methods can resume on a thread pool thread if it's called without a context.

> Do I need to synchronize resource access between Tasks?

Probably not. The async and await keywords are designed to allow easy writing of serial code. So there's no need to synchronize code before an await with code after an await; the code after the await will always run after the code before the await, even if it runs on a different thread. Also, await injects all necessary thread barriers, so there are no issues around out-of-order reads or anything like that.
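
As an illustration of the serial case, here is a small sketch in which a single async method owns a `List<int>` and touches it on both sides of an await without any locking. The names `SerialDemo` and `CollectAsync` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class SerialDemo
{
    static async Task<List<int>> CollectAsync()
    {
        var items = new List<int>(); // only ever touched by this one method
        for (int i = 0; i < 3; i++)
        {
            items.Add(i);         // runs before the await
            await Task.Delay(10); // the method may resume on a different thread
        }
        return items; // returning a result rather than mutating shared state
    }

    static async Task Main()
    {
        List<int> items = await CollectAsync();
        Console.WriteLine(string.Join(",", items)); // 0,1,2
    }
}
```

The iterations never overlap, so the list is never accessed from two threads at once even if each resume lands on a different thread.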

However, if your code runs multiple async methods at the same time, and those methods share data, then that would need to be synchronized. I have a blog post that covers this kind of accidental implicit parallelism (at the end of the post). Generally speaking, asynchronous code encourages returning results rather than applying side effects, and as long as you do that, implicit parallelism is less of a problem.
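
A minimal sketch of the concurrent case, assuming a console app with no SynchronizationContext: three async methods run at the same time and share a counter and a dictionary. `Interlocked.Increment` and `ConcurrentDictionary` are one possible way to keep the shared state consistent; the names `messagesWritten`, `perWorker`, and `WorkerAsync` are made up for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class SharedStateDemo
{
    // Hypothetical shared state, similar to the message counter in the question.
    static int messagesWritten;
    static readonly ConcurrentDictionary<string, int> perWorker =
        new ConcurrentDictionary<string, int>();

    static async Task WorkerAsync(string name, int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            await Task.Yield(); // without a context, resumes on a thread pool thread

            // Several workers may be here at once, so plain ++ would not be safe.
            Interlocked.Increment(ref messagesWritten);
            perWorker.AddOrUpdate(name, 1, (_, n) => n + 1);
        }
    }

    static async Task Main()
    {
        // Starting all three before awaiting is what makes them run concurrently.
        await Task.WhenAll(
            WorkerAsync("a", 100),
            WorkerAsync("b", 100),
            WorkerAsync("c", 100));

        Console.WriteLine(messagesWritten); // 300
        Console.WriteLine(perWorker["a"]);  // 100
    }
}
```

If each worker instead returned its own count and the caller summed the results after `Task.WhenAll`, no synchronization would be needed at all.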

Stephen Cleary