I have a simple TCP server that is able to listen for and accept multiple connections on a port. It then continuously waits for data to read from its connections. It uses a wrapper class for a TcpClient called ConnectedClient for convenience, and a list (dictionary) of ConnectedClients to keep track of all the connections. It basically goes like this:

    /* This method waits to accept connections indefinitely until it receives
       the signal from the GUI thread to stop. When a connection is accepted, it
       adds the connection to the list and calls a method called ProcessClient,
       which returns almost immediately. */
    public void waitForConnections() {
        // this method has access to a TcpListener called listener that was started elsewhere
        try {
            while (!_abort) {
                TcpClient socketClient = listener.AcceptTcpClient();

                // ConnectedClient constructor takes the TcpClient as well as a callback that it uses to print status messages to the GUI if
                ConnectedClient client = new ConnectedClient(socketClient, onClientUpdate);
                clients.Add(client.id, client);
                ProcessClient(client);
            }
        }
        catch (Exception e) {
            onStatusUpdate("Exception Occurred: " + e.Message);
        }
    }

    /* This method doesn't do much other than call BeginRead on the connection */
    private void ProcessClient(ConnectedClient client) {
        try {
            // wrapper class contains an internal buffer for extracting data as well as a TcpClient
            NetworkStream stream = client.tcpClient.GetStream();
            stream.BeginRead(client.buffer, 0, client.tcpClient.ReceiveBufferSize, new AsyncCallback(StreamReadCompleteCallback), client);
        }
        catch (Exception ex) {
            onStatusUpdate(ex.Message);
        }
    }

In my callback function, StreamReadCompleteCallback, I call EndRead, checking the return value of EndRead to detect whether the connection has been closed. If the return value is greater than zero, I extract/process the read data and call BeginRead again on the same client. If the return value is zero, the connection has been closed and I remove the connection (delete from list, close the TcpClient, etc).

    private void StreamReadCompleteCallback(IAsyncResult ar) {
        ConnectedClient client = (ConnectedClient)ar.AsyncState;

        try {
            NetworkStream stream = client.tcpClient.GetStream();

            int read = stream.EndRead(ar);
            if (read != 0) {
                // data extraction/light processing of received data
                client.Append(read);
                stream.BeginRead(client.buffer, 0, client.tcpClient.ReceiveBufferSize, new AsyncCallback(StreamReadCompleteCallback), client);
            }
            else {
                DisconnectClient(client);
            }
        }
        catch (Exception ex) {
            onStatusUpdate(ex.Message);
        }
    }

All of this works fine: I can accept connections, read from multiple client devices, and so on.

My question is: This method of continuously reading from connected clients causes each connection to have a worker thread that is waiting for BeginRead to return.

So if I have 10 connections, I have 10 BeginReads going.

It seems wasteful to have so many worker threads sitting around waiting to read. Is there some other, better way to accomplish this? I eventually run out of memory to add connections if I have a high number of active connections.

Would having a thread that polls the DataAvailable property of each connection until something shows up, and then makes a thread to read/process be a solution?

Or is creating all these worker threads not as big of a deal as I think?

Cobalt

1 Answer

> This method of continuously reading from connected clients causes each connection to have a worker thread that is waiting for BeginRead to return

No, it doesn't. In fact, using BeginRead() or one of the other asynchronous APIs to process I/O on a Socket object is the most scalable approach available.

> Would having a thread that polls the DataAvailable property of each connection until something shows up, and then makes a thread to read/process be a solution?

No. This would be horrible. Polling a socket, via DataAvailable or Select(), is terribly inefficient, forcing a huge amount of CPU time to be invested just checking on the socket state. The OS provides good asynchronous mechanisms for handling this; a polling implementation ignores that and does all the work itself.

> Or is creating all these worker threads not as big of a deal as I think?

You aren't creating the threads you think you are. When you use the asynchronous APIs, they make use of a Windows feature called I/O Completion Ports (IOCP). A socket handle is associated with an I/O Completion Port, completed operations on that handle are queued to the port, and a thread can wait on the port. One thread can therefore wait on a large number of outstanding operations, so having ten outstanding read operations does not in fact cause ten different threads to be created.

.NET maintains a pool of threads to handle these operations, exposed through the ThreadPool class. You can query that class (for example, with ThreadPool.GetAvailableThreads()) to see the behavior of the IOCP pool, which is separate from the worker thread pool used for QueueUserWorkItem().
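
As a rough way to observe this on the desktop .NET Framework, the counters can be read as in the sketch below. `ThreadPoolProbe` and `Describe` are hypothetical names introduced only for illustration; the `ThreadPool` calls are the standard ones, but whether they are all available on the Compact Framework is not confirmed here.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not from the question): reports how many pool threads
// are currently in use, split into worker threads and I/O completion threads.
static class ThreadPoolProbe
{
    public static string Describe()
    {
        int maxWorker, maxIo, freeWorker, freeIo;

        // Upper limits of the two pools.
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);

        // How many more threads of each kind could still be started.
        ThreadPool.GetAvailableThreads(out freeWorker, out freeIo);

        // "In use" is simply the difference between the two.
        return string.Format(
            "worker threads in use: {0}/{1}, IOCP threads in use: {2}/{3}",
            maxWorker - freeWorker, maxWorker,
            maxIo - freeIo, maxIo);
    }
}
```

Calling `ThreadPoolProbe.Describe()` before and after issuing a batch of `BeginRead()` calls gives a quick check: if every pending read held a thread, the in-use counts would climb with the number of connections, whereas with IOCP they would be expected to stay low while the reads are idle.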

.NET will assign new IOCP objects and threads as needed to service your network I/O operations. You can rest assured that it will do so in a reasonable, efficient manner.

At very large scales, the overhead of the garbage collection of the objects associated with read operations may come into play. In this case, you can use the Socket.ReceiveAsync() method, which takes a SocketAsyncEventArgs object and so allows you to reuse your own pool of state objects for the operations, so that you aren't constantly creating and discarding objects.
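
A minimal sketch of that reuse pattern follows, assuming desktop .NET 3.5 or later. `PooledReceiver` and its members are hypothetical names, not part of the question's code, and error handling is reduced to closing the socket.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Sockets;

// Hypothetical helper: keeps a fixed set of SocketAsyncEventArgs objects,
// each with its own pre-allocated buffer, and reuses them across reads.
sealed class PooledReceiver
{
    private readonly Stack<SocketAsyncEventArgs> _free = new Stack<SocketAsyncEventArgs>();

    public PooledReceiver(int count, int bufferSize)
    {
        for (int i = 0; i < count; i++)
        {
            SocketAsyncEventArgs args = new SocketAsyncEventArgs();
            args.SetBuffer(new byte[bufferSize], 0, bufferSize);
            args.Completed += OnCompleted;
            _free.Push(args);
        }
    }

    // Number of state objects not currently attached to a connection.
    public int Available
    {
        get { lock (_free) { return _free.Count; } }
    }

    // Starts reading from the socket; returns false if the pool is exhausted.
    public bool Start(Socket socket)
    {
        SocketAsyncEventArgs args;
        lock (_free)
        {
            if (_free.Count == 0) return false;
            args = _free.Pop();
        }
        args.UserToken = socket;
        Pump(socket, args);
        return true;
    }

    // Raised on a pool thread when a pending receive completes asynchronously.
    private void OnCompleted(object sender, SocketAsyncEventArgs args)
    {
        Socket socket = (Socket)args.UserToken;
        if (Handle(socket, args)) Pump(socket, args);
    }

    private void Pump(Socket socket, SocketAsyncEventArgs args)
    {
        // ReceiveAsync returns false when the receive completed synchronously;
        // in that case Completed is not raised, so the result is handled here.
        while (!socket.ReceiveAsync(args))
        {
            if (!Handle(socket, args)) return;
        }
    }

    // Returns true to keep reading, false once the connection is finished.
    private bool Handle(Socket socket, SocketAsyncEventArgs args)
    {
        if (args.SocketError == SocketError.Success && args.BytesTransferred > 0)
        {
            // The received bytes are in args.Buffer, starting at args.Offset,
            // with length args.BytesTransferred. Process them here.
            return true;
        }

        // Zero bytes or an error: close the socket and recycle the state object.
        socket.Close();
        args.UserToken = null;
        lock (_free) { _free.Push(args); }
        return false;
    }
}
```

With this shape, the accept loop would call something like `receiver.Start(listener.AcceptSocket())`, and a `false` return doubles as a natural connection cap.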

Another issue that may come up is memory fragmentation, especially in the large-object heap (depending on the size of the buffers you use). When you start a read operation on a socket, the buffer has to be pinned, preventing .NET from compacting the heap in which it resides.

But these issues aren't reasons to avoid using the asynchronous APIs (and in fact, the second issue happens regardless). They are just things to be aware of. Using the asynchronous API is in fact the best way to go.

That said, BeginReceive() is "old school". It works, but you can wrap a BeginReceive() operation in a Task (see Task.FromAsync() and the article "TPL and Traditional .NET Framework Asynchronous Programming"), or you can wrap the entire Socket in a NetworkStream object (which has ReadAsync() and similar methods). Either option lets you write your asynchronous code in a more readable way that doesn't require explicit callback methods. For scenarios where the network I/O always culminates in some interaction with the UI, it also lets you use async/await to do so, again in a more readable, easier-to-write way.
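
For illustration, the two options could look roughly like this on .NET 4.5 or later (so not on the Compact Framework). `AsyncReader`, `ReadOnceAsync`, `ReadLoopAsync`, the `onData` callback, and the 8192-byte buffer size are all assumptions made for this sketch.

```csharp
using System;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper showing both styles mentioned above.
static class AsyncReader
{
    // Option 1: wrap the existing Begin/End pair in a Task.
    public static Task<int> ReadOnceAsync(NetworkStream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead, stream.EndRead, buffer, 0, buffer.Length, null);
    }

    // Option 2: use ReadAsync with async/await. The loop replaces the
    // callback that re-issues BeginRead; it ends when the peer closes
    // the connection (a read of zero bytes). Returns the total bytes read.
    public static async Task<long> ReadLoopAsync(TcpClient tcpClient, Action<byte[], int> onData)
    {
        byte[] buffer = new byte[8192];
        long total = 0;

        using (tcpClient)
        {
            NetworkStream stream = tcpClient.GetStream();
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                onData(buffer, read); // only the first 'read' bytes are valid
                total += read;
            }
        }
        return total;
    }
}
```

When `ReadLoopAsync` is awaited from a UI event handler, the code after each `await` resumes on the UI thread by default, which is what makes the status-update callbacks simpler to write.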

Peter Duniho
  • I am developing on a compact framework, and can't use any async/await/tasks/anything fun. So as far as I was aware, BeginRead, EndRead, and threads are my only options as far as asynchronous operations go. Is BeginReceive similar to BeginRead but on a socket? – Cobalt Jun 08 '17 at 18:03
  • Your question doesn't provide any indication at all that you're using CF. That said, yes...`BeginReceive()` and `BeginRead()` are equivalent operations. I don't know whether CF has the full set of IOCP features as the desktop OS, but I'm confident the async API will still be preferable to the alternatives. – Peter Duniho Jun 08 '17 at 18:05
  • Good point. Edited to reflect that. And I am attempting to verify the IOCP, but all I can see currently in the debugger is that when I call BeginRead, a new worker thread shows up in the pane. Any suggestions on how to approach this? – Cobalt Jun 08 '17 at 18:14
  • The importance of IOCP is a concern primarily when dealing with large-scale servers. Dozens of connections, at least, if not a lot more. CF is clearly not an appropriate platform for _any_ type of large-scale server implementation, so that should not be your primary concern. You may well see a new thread the first time you call `BeginRead()`, but that would be normal, as the IOCP thread pool does need to be populated. I doubt you'd see ten threads for ten reads, but even if you did, I would trust that CF knows what it's doing. Worry if and when you _see_ a genuine problem that needs fixing. – Peter Duniho Jun 08 '17 at 18:21
  • This was my initial thought, but it seems each `BeginRead` adds a new thread, regardless of the number of threads/reads/connections. Perhaps IOCP just does not perform as expected on CF as the platform is intrinsically infeasible for large scale servers? In which case I suppose the best option is to set a hard cap on the number of connections. – Cobalt Jun 08 '17 at 18:26
  • I can't tell you for sure whether IOCP is supported on CF. [This guy](https://stackoverflow.com/a/21498701) seems very sure it's _not_ supported. Still, I would stick with the async API until it's a known problem. For all I know, CF also makes threads lighter-weight than on the desktop OS and dedicating one per read operation isn't a problem. [This post](https://stackoverflow.com/a/20091186) echoes my sentiments. It's not about CF per se, but it's the same basic issue. – Peter Duniho Jun 08 '17 at 18:30