I am experimenting with async and parallel programming to learn how they work. I have a list of IP addresses and want to resolve them via DNS. I wrote this function for that:

private static Task<string> ResolveAsync(string ipAddress)
{
    return Task.Run(() => Dns.GetHostEntry(ipAddress).HostName);
}

Then, in the program, I resolve the addresses like this; the idea is to use parallel programming:

//getting orderedClientIps before

var taskArray = new List<Task>();
foreach (var orderedClientIp in orderedClientIps)
{
    var task = new Task(async () =>
    {
        orderedClientIp.Address = await ResolveAsync(orderedClientIp.Ip);
    });
    taskArray.Add(task);
    task.Start();
}

Task.WaitAll(taskArray.ToArray());

foreach (var orderedClientIp in orderedClientIps)
{
    Console.WriteLine($"{(orderedClientIp.Ip)} ({orderedClientIp.Ip}) - {orderedClientIp.Count}");
}

So here we wait for all the addresses to resolve, and then print them in a separate loop.

What I would like to know is what the difference would be if, instead of printing in a separate loop, I did something like this:

foreach (var orderedClientIp in orderedClientIps)
{
    var task = new Task(async () =>
    {
        orderedClientIp.Address = await ResolveAsync(orderedClientIp.Ip);
        Console.WriteLine($"{(orderedClientIp.Ip)} ({orderedClientIp.Ip}) - {orderedClientIp.Count}");
    });
    taskArray.Add(task);
    task.Start();
}
Task.WaitAll(taskArray.ToArray());

I have tried running it, and it writes to the console one by one, whereas the first version writes everything out after waiting for all the tasks. I think the first approach is parallel and better, but I am not quite sure of the differences. What is different about the second approach in the context of async and parallel programming? And does the second approach somehow violate the Task.WaitAll() line?

Florian
skyhunter96
  • In my opinion some good discussion here: https://stackoverflow.com/questions/3670057/does-console-writeline-block – Mikael Suokas Dec 22 '22 at 15:59
  • Side notes: 1) `GetHostEntryAsync` exists. 2) Don't wrap synchronous code in `Task.Run` in an attempt to "make it asynchronous". 3) Don't ever use the `Task` constructor. 4) Don't confuse asynchronous and parallel programming. – Stephen Cleary Dec 22 '22 at 16:05
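
The first three side notes can be combined into a minimal sketch like the one below. It assumes a plain array of IP strings (`ips`, made-up sample data) rather than the question's `orderedClientIps` objects, and it falls back to the IP itself when the lookup fails; that fallback is an assumption for illustration, not something from the question.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical sample data; the real list would come from the log files.
var ips = new[] { "127.0.0.1", "::1" };

// Calling the async method starts each lookup right away; no Task constructor
// and no Task.Run wrapper are involved.
Task<string>[] lookups = ips.Select(ResolveAsync).ToArray();

// WhenAll completes once every lookup has finished and returns the results
// in the same order as the input.
string[] hostNames = await Task.WhenAll(lookups);

for (var i = 0; i < ips.Length; i++)
{
    Console.WriteLine($"{hostNames[i]} ({ips[i]})");
}

static async Task<string> ResolveAsync(string ipAddress)
{
    try
    {
        IPHostEntry entry = await Dns.GetHostEntryAsync(ipAddress);
        return entry.HostName;
    }
    catch (SocketException)
    {
        // Assumption: addresses that cannot be resolved are shown as the bare IP.
        return ipAddress;
    }
}
```

Because the lookups are I/O-bound, this is asynchronous concurrency rather than parallel computation, which is the distinction the fourth side note draws.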

2 Answers

The difference in output behaviour you see is simply down to the point in time at which you write the output.

Second approach: "and it writes to console one by one"

That's because the code that writes the output is called as soon as any task is "done". That happens at different points in time, so you see the results being output "one by one".

First approach: "in the first instance writes them all out after waiting them."

That's because you do just that in your code: wait until everything is done and then output sequentially what you have found.

You cannot judge from the output behaviour of your example which version is better at running things in parallel.

In fact, for all practical purposes they are identical. The overhead of Console.WriteLine inside the task (compared to the actual DNS lookup) should be negligible.

It would be different for compute-intensive work, but then you should probably be using Parallel.ForEach anyway.
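
For the compute-intensive case, a rough sketch of Parallel.ForEach might look like this. The prime-counting work and the names `limits`, `primeCounts` and `CountPrimesBelow` are made up for illustration; they stand in for whatever CPU-bound work you actually have.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical CPU-bound work: count the primes below each limit.
var limits = new[] { 10_000, 20_000, 30_000 };

// A thread-safe dictionary, because the loop body runs on several threads.
var primeCounts = new ConcurrentDictionary<int, int>();

// Parallel.ForEach blocks until every item has been processed.
Parallel.ForEach(limits, limit =>
{
    primeCounts[limit] = CountPrimesBelow(limit);
});

foreach (var limit in limits)
{
    Console.WriteLine($"{limit}: {primeCounts[limit]} primes");
}

static int CountPrimesBelow(int limit)
{
    var count = 0;
    for (var n = 2; n < limit; n++)
    {
        var isPrime = true;
        // Trial division up to the square root of n.
        for (var d = 2; d * d <= n; d++)
        {
            if (n % d == 0) { isPrime = false; break; }
        }
        if (isPrime) count++;
    }
    return count;
}
```

Here the threads are busy computing the whole time, which is where spreading the work over several cores pays off. A DNS lookup mostly waits on the network instead.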

So where should you output then? It depends. If you need to show the information (here the DNS-lookup result) as soon as possible, then do it from inside the Task. If it can wait until all is done (which might take some time), then do it at the end.
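
To make the two options concrete, here is a small sketch with simulated lookups. Task.Delay stands in for the DNS call, and the names and delays are invented so that the tasks finish in a different order than they were started.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Simulated lookups as (name, delay in ms); the values are made up.
var lookups = new[] { ("first", 500), ("second", 100), ("third", 300) };

// Records the order in which the tasks actually finish.
var completionOrder = new ConcurrentQueue<string>();

var tasks = lookups.Select(async lookup =>
{
    var (name, delayMs) = lookup;
    await Task.Delay(delayMs); // stands in for the DNS lookup
    completionOrder.Enqueue(name);
    Console.WriteLine($"inside task: {name}"); // option 1: output as soon as possible
    return name;
}).ToArray();

// WhenAll returns the results in the order the tasks were supplied,
// regardless of the order in which they completed.
string[] results = await Task.WhenAll(tasks);

foreach (var name in results)
{
    Console.WriteLine($"after WhenAll: {name}"); // option 2: output at the end
}
```

With these delays the "inside task" lines should appear in completion order (second, third, first), while the "after WhenAll" lines follow the input order. In both cases the simulated lookups overlap in the same way; only the moment of printing differs.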

Christian.K
  • Initially, I am getting these IPs from IIS log files, which can contain thousands of addresses. Does it make a difference then, first vs. second approach? It doesn't matter how and when the information is shown; what matters to me is understanding how WriteLine inside the first foreach affects things generally (not necessarily performance-wise, but in terms of parallel programming and the case with a large number of addresses). Does the WriteLine inside make the tasks not run in parallel? I am just starting with this type of programming, but this case seemed interesting and puzzles me. – skyhunter96 Dec 22 '22 at 16:31
  • Actually, I gave it some thought and also did some learning/research, and this answer sums it up perfectly. – skyhunter96 Dec 26 '22 at 03:12

Writing to the console is not asynchronous, because the console is not async by default. With the Console call you "synchronize" your tasks. Maybe:

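// Task.Run unwraps the async lambda into a Task<string>, so the continuation
// runs only after the lookup has actually finished and can read its Result.
var task = Task.Run(async () =>
    {
        orderedClientIp.Address = await ResolveAsync(orderedClientIp.Ip);
        return $"{(orderedClientIp.Ip)} ({orderedClientIp.Ip}) - {orderedClientIp.Count}";
    }).ContinueWith(previousTask => Console.WriteLine(previousTask.Result));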
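
A side note on the `new Task(async () => ...)` pattern from the question: the Task constructor takes an Action, so the async lambda becomes async void and the task counts as finished at its first await. The sketch below illustrates this with Task.Delay standing in for the lookup; the variable names are made up.

```csharp
using System;
using System.Threading.Tasks;

var finishedWithCtor = false;
var finishedWithRun = false;

// The constructor only accepts an Action, so this lambda is async void:
// the task is marked complete when the lambda reaches its first await.
var ctorTask = new Task(async () =>
{
    await Task.Delay(500); // stands in for the DNS lookup
    finishedWithCtor = true;
});
ctorTask.Start();
ctorTask.Wait();
var seenAfterCtorWait = finishedWithCtor;

// Task.Run has an overload for Func<Task>, so waiting on it also waits
// for the awaited work inside the lambda.
var runTask = Task.Run(async () =>
{
    await Task.Delay(500);
    finishedWithRun = true;
});
runTask.Wait();
var seenAfterRunWait = finishedWithRun;

Console.WriteLine($"after waiting on new Task: {seenAfterCtorWait}");
Console.WriteLine($"after waiting on Task.Run: {seenAfterRunWait}");
```

On a typical run this should print False for the constructor version and True for the Task.Run version. Applied to the question, Task.WaitAll over tasks built with the constructor can return before the lookups are done.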
tire0011
  • Care to elaborate? Does the WriteLine make the tasks not run in parallel? – skyhunter96 Dec 22 '22 at 16:33
  • I would not explicitly say it "synchronizes" the tasks; it is solidly explained in the other answer. All it's doing is writing when the task is finished. Granted, this might take time, and of course it will wait for the console to write, but it's so fast that it can be considered negligible. – skyhunter96 Dec 26 '22 at 03:11