Recursively call async method vs long running task when receiving data from TcpClient

Question

I'm currently rewriting my TCP server from using StreamSocketListener to TcpListener because I need to be able to use SSL. Since it was some time ago that I wrote the code I'm also trying to make it more cleaner and easier to read and hopefully increase the performance with higher number of clients but I'm currently stuck.

I'm calling a receive method recursively until the client disconnects but I'm starting to wonder if it wouldn't be a better to use a single long running task for it. But I hesitate to use it since it will then create a new long running task for every connected client. That's why I'm turning to the Stack Overflow community for some guidance on how to proceed.

Note: The connection is supposed to be open 24/7 or as much as possible for most of the connected clients.

Any comments are appreciated.

The current code looks something like this:

private async Task ReceiveData(SocketStream socket) {
    await Task.Yield();
    try {
        using (var reader = new DataReader(socket.InputStream)) {
            uint received;
            do {
                received = await reader.LoadAsync(4);
                if (received == 0) return;
            } while (reader.UnconsumedBufferLength < 4);

            if (received == 0) return;

            var length = reader.ReadUInt32();
            do {
                received = await reader.LoadAsync(length);
                if (received == 0) return;
            } while (reader.UnconsumedBufferLength < length);

            if (received == 0) return;

            // Publish the data asynchronously using an event aggregator
            Console.WriteLine(reader.ReadString(length));
        }
        ReceiveData(socket);
    }
    catch (IOException ex) {
        // Client probably disconnected. Can check hresult to be sure.
    }
    catch (Exception ex) {
        Console.WriteLine(ex);
    }
}

But I'm wondering if I should use something like the following code instead and start it as a long running task:

// Not sure about this part, never used Factory.StartNew before.
Task.Factory.StartNew(async delegate { await ReceiveData(_socket); }, TaskCreationOptions.LongRunning);

private async Task ReceiveData(SocketStream socket) {
    try {
        using (var reader = new DataReader(socket.InputStream)) {
            while (true) {
                uint received;
                do {
                    received = await reader.LoadAsync(4);
                    if (received == 0) break;
                } while (reader.UnconsumedBufferLength < 4);

                if (received == 0) break;

                var length = reader.ReadUInt32();
                do {
                    received = await reader.LoadAsync(length);
                    if (received == 0) break;
                } while (reader.UnconsumedBufferLength < length);

                if (received == 0) break;

                // Publish the data asynchronously using an event aggregator
                Console.WriteLine(reader.ReadString(length));
            }
        }
        // Client disconnected.
    }
    catch (IOException ex) {
        // Client probably disconnected. Can check hresult to be sure.
    }
    catch (Exception ex) {
        Console.WriteLine(ex);
    }
}

Neither of those implementations is correct. You should not rely on the `Connected` property, but instead should check for a 0 byte receive operation to complete, and handle exceptions (i.e. if the socket is closed forcefully). — Peter Duniho, Nov 02 '15 at 02:50
Without [a good, _minimal_, _complete_ code example](http://stackoverflow.com/help/mcve) clearly illustrating your question, it's not possible to offer specific advice about the implementation, but the latter seems more likely to be correct and useful, assuming you eventually await the completion of the `ReceiveData()` method somewhere. The "recursive" version will build a chain of tasks, consuming memory, while the other one can be represented as a single awaitable task — Peter Duniho, Nov 02 '15 at 02:50
@PeterDuniho I've updated the samples to better match the real versions. I didn't think I had to go into the examples that much since the question is more about long running tasks vs recursive methods. Also I wrote Client.Connected instead of TcpClient.Connected because client is a custom class which includes a Connected bool value that changes depending on the conneciton just to show that we only called the method if we still where connected or keept looping while connected but I probably should've explained it better from the beginning. — Twyzz, Nov 02 '15 at 19:49
The code doesn't change much about my previous comments. The "recursive" approach still has the "chain of tasks" issue (assuming some code somewhere is awaiting the initial `ReceiveData()` call), and the "long-running" version is not actually long-running, so you're not actually committing a dedicated thread to each client. IMHO, the latter is preferable. — Peter Duniho, Nov 02 '15 at 20:50
@PeterDuniho None of the ReceiveData methods are awaited for during the "recursive" approach so I don't see the issue, it will just get called until it gets to the yield state and then it will return and close the previous thread and how come the other alternative is not a long running task? is will loop forever until the client disconnects, that happens very rarly which in my opinion sounds like a long running task. It's very hard for me to see the issues that you see :( — Twyzz, Nov 02 '15 at 21:49
If you are not awaiting awaitable calls, that's a problem. Your choice, but well-written code will have a way to catch exceptions and deal with them. You are right though...if you are willing to ignore errors, you can get away without the memory overhead. As far as your second alternative goes, "loop forever" does not mean the task itself runs a long time. When you await the asynchronous operation, the method returns and the thread continues on with whatever it was doing before. There is no thread dedicated to waiting for the operation to complete. — Peter Duniho, Nov 02 '15 at 22:14
@PeterDuniho Thanks for all the help, I think I've decided to stick with the "recursive" method since I already catch the exceptions inside the method itself so I don't really need to catch them once more outside. You can always create an answer if you want me to accept it. — Twyzz, Nov 02 '15 at 22:39

score 0 · Accepted Answer · edited May 23 '17 at 12:23

In the first, over-simplified version of the code that was posted, the "recursive" approach had no exception handling. That in and of itself would be enough to disqualify it. However, in your updated code example it's clear that you are catching exceptions in the async method itself; thus the method is not expected to throw any exceptions, and so failing to await the method call is much less of a concern.

So, what else can we use to compare and contrast the two options?

You wrote:

I'm also trying to make it more cleaner and easier to read

While the first version is not really recursive, in the sense that each call to itself would increase the depth of the stack, it does share some of the readability and maintainability issues with true recursive methods. For experienced programmers, comprehending such a method may not be hard, but it will at the very least slow down the inexperienced, if not make them scratch their heads for awhile.

So there's that. It seems like a significant disadvantage, given the stated goals.

So what about the second option, about which you wrote:

…it will then create a new long running task for every connected client

This is an incorrect understanding of how that would work.

Without delving too deeply into how async methods work, the basic behavior is that an async method will in fact return at each use of await (ignoring for a moment the possibility of operations that complete synchronously…the assumption is that the typical case is asynchronous completions).

This means that the task you initiate with this line of code:

Task.Factory.StartNew(
    async delegate { await ReceiveData(_socket); },
    TaskCreationOptions.LongRunning);

…lives only long enough to reach the first await in the ReceiveData() method. At that point, the method returns and the task which was started terminates (either allowing the thread to terminate completely, or to be returned to the thread pool, depending on how the task scheduler decided to run the task).

There is no "long running task" for every connected client, at least not in the sense of there being a thread being used up. (In some sense, there is since of course there's a Task object involved. But that's just as true for the "recursive" approach as it is for the looping approach.)

So, that's the technical comparison. Of course, it's up to you to decide what the implications are for your own code. But I'll offer my own opinion anyway…

For me, the second approach is significantly more readable. And it is specifically because of the way async and await were designed and why they were designed. That is, this feature in C# is specifically there to allow asynchronous code to be implemented in a way that reads almost exactly like regular synchronous code. And in fact, this is borne out by the false impression that there is a "long running task" dedicated to each connection.

Prior to the async/await feature, the correct way to write a scalable networking implementation would have been to use one of the APIs from the "Asynchronous Programming Model". In this model, the IOCP thread pool is used to service I/O completions, such that a small number of threads can monitor and respond to a very large number of connections.

The underlying implementation details actually do not change when switching over to the new async/await syntax. The process still uses a small number of IOCP thread pool threads to handle the I/O completions as they occur.

The difference is that the when using async/await, the code looks like the kind of code that one would write if using a single thread for each connection (hence the misunderstanding of how this code actually works). It's just a big loop, with all the necessary handling in one place, and without the need for different code to initiate an I/O operation and to complete one (i.e. the call to Begin...() and later to End...()).

And to me, that is the beauty of async/await, and why the first version of your code is inferior to the second. The first fails to take advantage of what makes async/await actually useful and beneficial to code. The second takes full advantage of that.

Beauty is, of course, in the eye of the beholder. And to at least a limited extent, the same thing can be said of readability and maintainability. But given your stated goal of making the code "cleaner and easier to read", it seems to me that the second version of your code serves that purpose best.

Assuming the code is correct now, it will be easier for you to read (and remember, it's not just you today that needs to read it…you want it readable for the "you" a year from now, after you haven't seen the code for awhile). And if it turns out the code is not correct now, the simpler second version will be easier to read, debug, and fix.

The two versions are in fact almost identical. In that respect, it almost doesn't matter. But less code is always better. The first version has two extra method calls and dangling awaitable operations, while the second replaces those with a simple while loop. I know which one I find more readable. :)

Related questions/useful additional reading:
Long Running Blocking Methods. Difference between Blocking, Sleeping, Begin/End and Async
Is async recursion safe in C# (async ctp/.net 4.5)?

Recursively call async method vs long running task when receiving data from TcpClient

1 Answers1