
When creating a multi-client server in C#, I can think of several ways to divide the work between the threads.

For this question, assume that the server accepts incoming TCP connections from the clients, and each client sends a file to the server, which stores it on the hard drive.

Work Division 1: Thread per Client: The server instantiates a new Thread for each new client that connects, and that thread is responsible for that client. (These threads are in addition to one "server thread".)
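To make Division 1 concrete, here is a minimal sketch of this file-receiving server. The class name, port, and "uploads" folder are illustrative (they are not from the question), and error handling is omitted.

```csharp
// Work Division 1 sketch: one dedicated thread per connected client.
// Class name, port, and "uploads" folder are illustrative; error handling omitted.
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class ThreadPerClientServer
{
    static void Main()
    {
        Directory.CreateDirectory("uploads");
        var listener = new TcpListener(IPAddress.Any, 9000);
        listener.Start();

        // The single "server thread" only accepts connections...
        while (true)
        {
            TcpClient client = listener.AcceptTcpClient();

            // ...and hands each new client to its own dedicated thread.
            new Thread(() => HandleClient(client)) { IsBackground = true }.Start();
        }
    }

    static void HandleClient(TcpClient client)
    {
        using (client)
        using (NetworkStream network = client.GetStream())
        using (FileStream file = File.Create(Path.Combine("uploads", Guid.NewGuid() + ".dat")))
        {
            byte[] buffer = new byte[8192];
            int read;
            // This thread blocks on both the network read and the disk write.
            while ((read = network.Read(buffer, 0, buffer.Length)) > 0)
                file.Write(buffer, 0, read);
        }
    }
}
```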

Work Division 2: Thread per Resource: There will be one thread for handling the communication and one thread for writing to the hard drive. A client object will be passed between these resource-responsible threads, and each resource-responsible thread will have its own queue so it knows what it should do. (Again, these threads are in addition to one "server thread".)

Work Division 3: Thread per Sub-Task of the Main Task: Let's call the work we need to do for each connecting client "the main task". We'll break this main task into several sub-tasks and create a thread for each sub-task, and again each thread will have a queue holding the client items it should process. (This sounds similar to Division 2, but in another project, with different work than a file-receiving server, this type of division might be quite different from Division 2.)
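As a rough sketch of Divisions 2 and 3 for this particular project (where the sub-tasks happen to line up with the resources), each resource-responsible thread owns a queue and client work items flow through them. BlockingCollection assumes .NET 4+; on .NET 2.0 the same idea could be built with Queue&lt;T&gt; plus Monitor.Wait/Pulse. All names, the port, and the buffer-the-whole-file-in-memory step are illustrative simplifications.

```csharp
// Work Divisions 2/3 sketch: resource-responsible threads, each with its own queue.
// Names, port, and the in-memory buffering are illustrative; error handling omitted.
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class ClientWorkItem
{
    public byte[] Data;        // bytes received from the client
    public string TargetPath;  // where the disk thread should store them
}

class ThreadPerResourceServer
{
    // One queue per resource-responsible thread.
    static readonly BlockingCollection<TcpClient> ReceiveQueue = new BlockingCollection<TcpClient>();
    static readonly BlockingCollection<ClientWorkItem> WriteQueue = new BlockingCollection<ClientWorkItem>();

    static void Main()
    {
        new Thread(CommunicationThread) { IsBackground = true }.Start();
        new Thread(DiskThread) { IsBackground = true }.Start();

        var listener = new TcpListener(IPAddress.Any, 9000);
        listener.Start();

        // The "server thread" only accepts clients and enqueues them.
        while (true)
            ReceiveQueue.Add(listener.AcceptTcpClient());
    }

    // Thread responsible for the network resource.
    static void CommunicationThread()
    {
        foreach (TcpClient client in ReceiveQueue.GetConsumingEnumerable())
        {
            using (client)
            using (NetworkStream network = client.GetStream())
            using (var received = new MemoryStream())
            {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = network.Read(buffer, 0, buffer.Length)) > 0)
                    received.Write(buffer, 0, read);

                // Hand the completed upload over to the disk thread.
                WriteQueue.Add(new ClientWorkItem
                {
                    Data = received.ToArray(),
                    TargetPath = Path.Combine("uploads", Guid.NewGuid() + ".dat")
                });
            }
        }
    }

    // Thread responsible for the hard-drive resource.
    static void DiskThread()
    {
        Directory.CreateDirectory("uploads");
        foreach (ClientWorkItem item in WriteQueue.GetConsumingEnumerable())
            File.WriteAllBytes(item.TargetPath, item.Data);
    }
}
```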

My question is:

Are there other ways that are recommended for dividing the work between the threads?

spaceman
  • Why do you so want to dedicate threads to specific things? Why not just use an "any thread can do any job" model where whatever thread happens to be running does whatever is the most important thing that needs to be done? – David Schwartz Dec 07 '15 at 11:46
  • Hi David. This is also a possible option, of course. My goal in this question is to learn different ways of splitting the work, so the server can handle more than one client, and later on so the server can handle many clients. If I understand it correctly, for Divisions 2 and 3, your remark is something that comes after deciding how to divide the work (like if someone chooses to use the Thread class or the ThreadPool). – spaceman Dec 07 '15 at 12:36
  • All three of your options are a "thread per thing in the problem" approach. I'm suggesting no link whatsoever between aspects of the problem and threads. – David Schwartz Dec 07 '15 at 12:37
  • I understand. But assume this: Let's say you wrote all the code to handle a client in one long method. Now you can take this method and give it to the ThreadPool so it will run it, but you can also divide this method into several parts and then give them to the ThreadPool to perform. (In case there is a logical order for these parts, like in my case, then of course we need queues and such.) What I am saying is that your remark is about a later implementation stage, while my question is about a prior, higher, design stage. – spaceman Dec 07 '15 at 12:51
  • It's not a higher stage, it's a lower stage. You must put function boundaries anywhere the thread doing the work can change. And you may put function boundaries in other places regardless of how you are using threads. So how you are mapping jobs to threads determines where your method boundaries must be, not the other way around. With "any thread can do any job", method boundaries are required any place you might need to block, because that's the only time you must change threads. – David Schwartz Dec 07 '15 at 12:57
  • I think you have not completely understood my point in the previous comment, and only took the "dividing a long function into several functions" part. Let me give an example of what I mean by the prior (higher) design stage vs. a later implementation stage like using a ThreadPool etc.: Let's say you create one thread for handling the data communication, and one thread for disk writing, and so on according to the resources you use. You need each such thread to have its own queue, and you need each client to have a state-machine object indicating what stage it is in. So you see what I mean? – spaceman Dec 07 '15 at 13:04
  • Yes. So your first decision is what will be bound to a thread and that dictates what queues and such you need. I'm saying to make that decision by not binding anything to a thread and allowing a thread to continue doing work until it would need to block. And that first decision will determine all those later decisions. – David Schwartz Dec 07 '15 at 13:06
  • For example, with thread per client, you change threads when you change clients, and so you need queuing for operations that affect more than one client. With thread per resource, you change threads when you change resources, so you need queuing (or dispatch) when an operation changes resources. With "any thread can do any job", you need queuing or dispatch only when you might block. (And a huge benefit of this design is that you can switch clients or switch resources or switch sub-tasks without having to dispatch.) – David Schwartz Dec 07 '15 at 13:08
  • I see what you mean. Thank you David, I will try it. – spaceman Dec 07 '15 at 14:29
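For what it's worth, a rough illustration of the "any thread can do any job" model discussed in the comments above might look like the following: one shared job queue, a few identical workers, and nothing bound to a client, a resource, or a sub-task. All names here are illustrative, not code from the discussion.

```csharp
// Sketch of "any thread can do any job": one shared job queue, identical workers.
// Work is enqueued only at points where it would otherwise have to block.
using System;
using System.Collections.Concurrent;
using System.Threading;

class AnyThreadAnyJob
{
    static readonly BlockingCollection<Action> Jobs = new BlockingCollection<Action>();

    static void Main()
    {
        // A small pool of identical workers; none of them "owns" anything.
        for (int i = 0; i < Environment.ProcessorCount; i++)
            new Thread(Worker) { IsBackground = true }.Start();

        // Anything can enqueue work, e.g. the accept loop or an IO completion callback.
        Jobs.Add(() => Console.WriteLine("receive the next chunk from client A"));
        Jobs.Add(() => Console.WriteLine("write a chunk of client B's file to disk"));

        Thread.Sleep(1000); // keep the demo process alive long enough to run the jobs
    }

    static void Worker()
    {
        // Keep doing whatever job is next; block only when the queue is empty.
        foreach (Action job in Jobs.GetConsumingEnumerable())
            job();
    }
}
```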

1 Answer


The answer is that you should not work with threads at all unless you have only a few connecting clients. The reason is that threading comes with overhead, and the threads will be idle a large part of the time since you are working with slow resources (IO).

Instead, you should look at asynchronous programming. In .NET you have three models:

  • Asynchronous Programming Model (APM)
  • Event-based Asynchronous Pattern (EAP)
  • Task-based Asynchronous Pattern (TAP)

https://msdn.microsoft.com/en-us/library/jj152938(v=vs.110).aspx
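For illustration only (not part of the original answer), a minimal TAP version of such a file-receiving server might look like this. It assumes .NET 4.5+ for async/await; the class name, port, and "uploads" folder are made up, and error handling is omitted.

```csharp
// Minimal TAP sketch of a file-receiving server (assumes .NET 4.5+ / async-await).
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

class TapFileServer
{
    static void Main()
    {
        RunAsync().GetAwaiter().GetResult();
    }

    static async Task RunAsync()
    {
        Directory.CreateDirectory("uploads");
        var listener = new TcpListener(IPAddress.Any, 9000);
        listener.Start();

        while (true)
        {
            TcpClient client = await listener.AcceptTcpClientAsync();
            Task ignored = HandleClientAsync(client); // no dedicated thread per client
        }
    }

    static async Task HandleClientAsync(TcpClient client)
    {
        using (client)
        using (NetworkStream network = client.GetStream())
        using (FileStream file = File.Create(Path.Combine("uploads", Guid.NewGuid() + ".dat")))
        {
            // No thread is blocked while the network or the disk is busy.
            await network.CopyToAsync(file);
        }
    }
}
```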

APM is the oldest one. I only recommend it if your .NET version doesn't support EAP or TAP. But in your case (.NET 2.0), you need to use APM; you can read more about it here: https://msdn.microsoft.com/en-us/library/ms228963(v=vs.110).aspx
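A minimal APM sketch using the Begin/End pattern available since .NET 2.0 might look like the following. The names, port, and "uploads" folder are illustrative, and error handling is omitted.

```csharp
// Minimal APM sketch: asynchronous accept and read via Begin/End callbacks.
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;

class ApmFileServer
{
    static TcpListener listener;

    static void Main()
    {
        Directory.CreateDirectory("uploads");
        listener = new TcpListener(IPAddress.Any, 9000);
        listener.Start();
        listener.BeginAcceptTcpClient(OnAccept, null);
        Console.ReadLine(); // keep the process alive; completions run on pool threads
    }

    static void OnAccept(IAsyncResult ar)
    {
        TcpClient client = listener.EndAcceptTcpClient(ar);
        listener.BeginAcceptTcpClient(OnAccept, null); // immediately accept the next client

        var state = new ReceiveState
        {
            Client = client,
            Network = client.GetStream(),
            File = File.Create(Path.Combine("uploads", Guid.NewGuid() + ".dat")),
            Buffer = new byte[8192]
        };
        state.Network.BeginRead(state.Buffer, 0, state.Buffer.Length, OnRead, state);
    }

    static void OnRead(IAsyncResult ar)
    {
        var state = (ReceiveState)ar.AsyncState;
        int read = state.Network.EndRead(ar);
        if (read > 0)
        {
            state.File.Write(state.Buffer, 0, read);
            state.Network.BeginRead(state.Buffer, 0, state.Buffer.Length, OnRead, state);
        }
        else
        {
            state.File.Close();
            state.Client.Close();
        }
    }

    class ReceiveState
    {
        public TcpClient Client;
        public NetworkStream Network;
        public FileStream File;
        public byte[] Buffer;
    }
}
```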

When you're using async programming you do not have to worry about threads anymore. .NET and the OS will manage the threads. Your application will "wake up" when one of the IO operations you've started completes (like sending something over a socket or reading from a database).

"I would like to implement, whichever solution chosen, using the Thread class, and not using other classes/tools from the .NET framework."

Only #1 is viable. Writing to disk will be faster than receiving the file over the network. So there is really no reason to let the resources own the threads.

Using #1 will also reduce the complexity and make the code easier to read. You will still need a service to make sure that two clients don't work with the same file.
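One possible shape for that guard (illustrative only, not from the answer): a small process-wide registry of file paths currently being written, which each client-handling thread checks before it starts writing.

```csharp
// Illustrative guard so two clients don't write the same file at the same time.
// The class and member names are made up for this sketch.
using System.Collections.Generic;

static class FileWriteGuard
{
    static readonly HashSet<string> InUse = new HashSet<string>();
    static readonly object Gate = new object();

    // Returns false if another client is already writing to this path.
    public static bool TryAcquire(string path)
    {
        lock (Gate) { return InUse.Add(path); }
    }

    public static void Release(string path)
    {
        lock (Gate) { InUse.Remove(path); }
    }
}
```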

jgauffin
  • Thank you jgauffin. I understood your answer, but the first sentence was a bit surprising: "unless you have few connecting clients" - but I do have several connecting clients... so what did you mean? – spaceman Dec 07 '15 at 12:38
  • You didn't state the number of clients anywhere. Using threads when handling many clients is a waste of resources. – jgauffin Dec 07 '15 at 13:27
  • Regarding "You didn't state the number of clients anywhere" - right. OK, two scenarios: the first, 10 clients concurrently; the second, 1000 clients concurrently. – spaceman Dec 07 '15 at 13:52
  • I wouldn't use threads for 1000 clients due to the threading overhead. Read here: http://stackoverflow.com/questions/28656872/why-is-stack-size-in-c-sharp-exactly-1-mb and https://en.wikipedia.org/wiki/Context_switch – jgauffin Dec 07 '15 at 15:05