I want to compare two theoretical scenarios. I have simplified the cases for the purposes of the question, but it's basically your typical producer/consumer scenario (I'm focusing on the consumer).
I have a large Queue<string> dataQueue whose contents I have to transmit to multiple clients.
So let's start with the simpler case:
class SequentialBlockingCase
{
    public static Queue<string> DataQueue = new Queue<string>();

    private static List<string> _destinations = new List<string>();

    /// <summary>
    /// Is the main function that is run in its own thread
    /// </summary>
    private static void Run()
    {
        while (true)
        {
            if (DataQueue.Count > 0)
            {
                string data = DataQueue.Dequeue();
                foreach (var destination in _destinations)
                {
                    SendDataToDestination(destination, data);
                }
            }
            else
            {
                Thread.Sleep(1);
            }
        }
    }

    private static void SendDataToDestination(string destination, string data)
    {
        //TODO: Send data using http post, instead simulate the send
        Thread.Sleep(200);
    }
}
Now this setup works just fine. It sits there and polls the queue, and when there is data to send it sends it to all the destinations.

Issues:

- If one of the destinations is unavailable or slow, it affects all of the other destinations.
- It does not make use of multithreading where parallel execution is possible.
- It blocks for every transmission to each destination.
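To put rough numbers on the blocking cost: with the simulated 200 ms send, the sequential loop pays that latency once per destination, so each dequeued item takes destinations × 200 ms before the next item can go out. A minimal sketch (hypothetical destination names, Stopwatch timing):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class SequentialCost
{
    static void Main()
    {
        // Hypothetical destinations; each simulated send blocks for 200 ms.
        var destinations = new List<string> { "a", "b", "c" };
        var sw = Stopwatch.StartNew();
        foreach (var destination in destinations)
        {
            Thread.Sleep(200); // stand-in for a blocking HTTP POST
        }
        sw.Stop();
        // Roughly 3 x 200 ms = 600 ms for a single queue item.
        Console.WriteLine($"Sequential send took ~{sw.ElapsedMilliseconds} ms");
    }
}
```

One slow destination stretches that total even further, which is exactly the first issue above.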
So here is my second attempt:
class ParallelBlockingCase
{
    public static Queue<string> DataQueue = new Queue<string>();

    private static List<string> _destinations = new List<string>();

    /// <summary>
    /// Is the main function that is run in its own thread
    /// </summary>
    private static void Run()
    {
        while (true)
        {
            if (DataQueue.Count > 0)
            {
                string data = DataQueue.Dequeue();
                Parallel.ForEach(_destinations, destination =>
                {
                    SendDataToDestination(destination, data);
                });
            }
            else
            {
                Thread.Sleep(1);
            }
        }
    }

    private static void SendDataToDestination(string destination, string data)
    {
        //TODO: Send data using http post
        Thread.Sleep(200);
    }
}
This revision at least means one slow or unavailable destination does not affect the others. However, this method is still blocking, and I am not sure whether Parallel.ForEach makes use of the thread pool. My understanding is that it will create X number of threads / tasks and execute 4 at a time (on a 4-core CPU), so a 5th task cannot start until one of the first 4 has completely finished.
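For what it's worth, Parallel.ForEach does schedule its work via the default TaskScheduler, which runs on ThreadPool threads, and the degree of parallelism can be pinned explicitly rather than left to the core count. A quick sketch to observe this (the destination list is hypothetical; the calling thread may also participate in the loop, so one line can report pool=False):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class ParallelForEachProbe
{
    static void Main()
    {
        var destinations = new List<string> { "a", "b", "c", "d" };
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
        Parallel.ForEach(destinations, options, destination =>
        {
            // Report which thread runs each body and whether it is a pool thread.
            Console.WriteLine(
                $"{destination}: thread {Thread.CurrentThread.ManagedThreadId}, " +
                $"pool={Thread.CurrentThread.IsThreadPoolThread}");
            Thread.Sleep(200); // simulated blocking send
        });
        Console.WriteLine("All sends finished");
    }
}
```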
Hence my 3rd option:
class ParallelAsyncCase
{
    public static Queue<string> DataQueue = new Queue<string>();

    private static List<string> _destinations = new List<string>();

    /// <summary>
    /// Is the main function that is run in its own thread
    /// </summary>
    private static void Run()
    {
        while (true)
        {
            if (DataQueue.Count > 0)
            {
                string data = DataQueue.Dequeue();
                List<Task> tasks = new List<Task>();
                foreach (var destination in _destinations)
                {
                    // Async methods return already-running ("hot") tasks,
                    // so no Start() call is needed (calling it would throw).
                    var task = SendDataToDestination(destination, data);
                    tasks.Add(task);
                }
                //Wait for all tasks to complete
                Task.WaitAll(tasks.ToArray());
            }
            else
            {
                Thread.Sleep(1);
            }
        }
    }

    private static async Task SendDataToDestination(string destination, string data)
    {
        //TODO: Send data using http post
        await Task.Delay(200);
    }
}
Now, from my understanding, this option will still block on the main thread at Task.WaitAll(tasks.ToArray()), which is fine because I don't want it to run away creating tasks faster than they can be executed. But the tasks that execute in parallel should make use of the ThreadPool, and all X tasks should start executing at once rather than blocking or running in sequential order (the thread pool will swap between them as they become active or are awaiting).
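That overlap is observable: because Task.Delay yields rather than holding a thread, all the sends run concurrently and Task.WaitAll returns after roughly the longest single send, not the sum. A sketch under the same simulated 200 ms send (hypothetical destinations again):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

class AsyncOverlapProbe
{
    static async Task SendDataToDestination(string destination, string data)
    {
        await Task.Delay(200); // stand-in for an awaited HTTP POST
    }

    static void Main()
    {
        var destinations = new List<string> { "a", "b", "c", "d", "e" };
        var sw = Stopwatch.StartNew();
        var tasks = new List<Task>();
        foreach (var destination in destinations)
        {
            // Hot task: it is already running as soon as the method returns.
            tasks.Add(SendDataToDestination(destination, "payload"));
        }
        Task.WaitAll(tasks.ToArray());
        sw.Stop();
        // ~200 ms total for 5 overlapping sends, not 5 x 200 ms.
        Console.WriteLine($"All sends took ~{sw.ElapsedMilliseconds} ms");
    }
}
```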
Now my question: does option 3 have any performance benefit over option 2, specifically in a higher-performance server-side scenario? In the software I am working on now there would be multiple instances of my simple use case above, i.e. several consumers.

I'm interested in the theoretical differences and pros vs. cons of the two solutions, and maybe even a better 4th option if there is one.