
Using the following code:

static void Main(string[] args)
{
    ServicePointManager.DefaultConnectionLimit = 1000;

    var client = new HttpClient();
    var uris = File.ReadAllLines(@"C:\urls.txt").Select(x => new Uri(x));

    foreach(var uri in uris)
    {
        var url = uri.ToString();

        var task = client.GetStringAsync(uri);
        task.ContinueWith(t => Console.WriteLine("Done {0}", url), TaskContinuationOptions.OnlyOnRanToCompletion);
        task.ContinueWith(t => Console.WriteLine("Failed {0}", url), TaskContinuationOptions.OnlyOnFaulted);
        task.ContinueWith(t => Console.WriteLine("Cancelled {0}", url), TaskContinuationOptions.OnlyOnCanceled);
    }

    Console.ReadKey();
}

According to Fiddler, I can request at best 15-20 URLs concurrently. All of these URLs are unique and do not point to the same host.

What is going on?
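For reference, here is a minimal sketch of how the in-flight count could be measured directly rather than relying on Fiddler; the `inFlight` counter is hypothetical and not part of the original code:

static void Main(string[] args)
{
    ServicePointManager.DefaultConnectionLimit = 1000;

    var client = new HttpClient();
    var uris = File.ReadAllLines(@"C:\urls.txt").Select(x => new Uri(x));
    var inFlight = 0; // hypothetical counter of requests started but not yet completed

    foreach (var uri in uris)
    {
        Console.WriteLine("Starting, in flight: {0}", Interlocked.Increment(ref inFlight));

        var task = client.GetStringAsync(uri);
        // decrement on completion, regardless of outcome (requires System.Threading)
        task.ContinueWith(t => Interlocked.Decrement(ref inFlight));
    }

    Console.ReadKey();
}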

ebb
  • I believe I remember reading somewhere that the Task Library under IIS uses threads from the AppPool (which could potentially lead to thread starvation and 500s), whereas `new Thread()` does not. I'll look for this info. – Erik Philips Feb 21 '14 at 19:39
  • [Do asynchronous operations in ASP.NET MVC use a thread from ThreadPool on .NET 4](http://stackoverflow.com/questions/8743067/do-asynchronous-operations-in-asp-net-mvc-use-a-thread-from-threadpool-on-net-4) is probably the best read on why you may be having problems. – Erik Philips Feb 21 '14 at 19:46
  • @ErikPhilips, Thanks for the link - but I cannot see how that has anything to do with my problem. – ebb Feb 21 '14 at 19:56
  • What happens if you remove all your `.ContinueWith()` blocking operations? – Erik Philips Feb 21 '14 at 20:06
  • @ErikPhilips, Still only 15-20 concurrent requests - also they're not blocking. They're invoked whenever the IOCP from the async I/O operation signals completion. – ebb Feb 21 '14 at 20:12
  • @ebb, I suggest you read the section of the [IOCP doc on threads and concurrency](http://msdn.microsoft.com/en-us/library/windows/desktop/aa365198(v=vs.85).aspx). It very clearly states exactly what I've explained to you in my answer below. – pwnyexpress Feb 21 '14 at 20:30
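Since thread pool sizing keeps coming up in these comments, one way to rule it out (a sketch, not something suggested here) is to check and raise the worker and I/O completion-port thread minimums before issuing any requests:

int workerThreads, completionPortThreads;
ThreadPool.GetMinThreads(out workerThreads, out completionPortThreads);
Console.WriteLine("Min worker threads: {0}, min IOCP threads: {1}", workerThreads, completionPortThreads);

// Raise the minimums so the pool injects threads without its usual ramp-up delay.
// The value 100 is arbitrary, purely for illustration.
ThreadPool.SetMinThreads(100, 100);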

1 Answer


How many cores does the CPU on the machine you're running this on have? There are limits to how many concurrent operations your machine can handle. Also, TPL automatically decides the right amount of parallelism to use for the task at hand. It is not always more efficient to spin up 1000 threads to accomplish a task; there is substantial overhead in managing the messages passed between threads.

This may not bring any performance improvement, but it should be more idiomatic for parallelism:

static void Main(string[] args)
{
    ServicePointManager.DefaultConnectionLimit = 1000;

    var uris = File.ReadAllLines(@"C:\urls.txt").Select(x => new Uri(x));

    foreach(var uri in uris)
    {
        var client = new HttpClient();
        var url = uri.ToString();

        var task = client.GetStringAsync(uri);
        task.ContinueWith(t => Console.WriteLine("Done {0}", url), TaskContinuationOptions.OnlyOnRanToCompletion);
        task.ContinueWith(t => Console.WriteLine("Failed {0}", url), TaskContinuationOptions.OnlyOnFaulted);
        task.ContinueWith(t => Console.WriteLine("Cancelled {0}", url), TaskContinuationOptions.OnlyOnCanceled);
    }

    Console.ReadKey();
}

or even:

static void Main(string[] args)
{
    ServicePointManager.DefaultConnectionLimit = 1000;

    var uris = File.ReadAllLines(@"C:\urls.txt").Select(x => new Uri(x));

    Parallel.ForEach(uris, uri => {
        // WebRequest.Create accepts the Uri directly
        WebRequest myRequest = WebRequest.Create(uri);

        // handle the response synchronously
        using (var response = myRequest.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            reader.ReadToEnd();
        }
    });

}
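If you want to see how the degree of parallelism affects throughput, the limit can also be made explicit rather than letting TPL decide. This is only a sketch with an arbitrary cap of 50, not part of the answer above:

static void Main(string[] args)
{
    ServicePointManager.DefaultConnectionLimit = 1000;

    var uris = File.ReadAllLines(@"C:\urls.txt").Select(x => new Uri(x));

    // Cap the number of simultaneous iterations explicitly; 50 is an arbitrary value.
    var options = new ParallelOptions { MaxDegreeOfParallelism = 50 };

    Parallel.ForEach(uris, options, uri =>
    {
        using (var client = new WebClient())
        {
            client.DownloadString(uri); // synchronous download per iteration
            Console.WriteLine("Done {0}", uri);
        }
    });
}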
pwnyexpress
  • My CPU has 2 cores. - But making async web requests should not take up any CPU time - only when calling the callback (which is the `ContinueWith`). – ebb Feb 21 '14 at 19:35
  • CPU time doesn't matter. TPL is managing a buffered thread pool for you and keeps track of which threads are active and which have completed and can be returned to the pool. The size of those pools is a function of the number of CPUs the machine has available, among other variables that are hidden from you by TPL. If you seek more granular control over how threads are managed, then you need to create your own thread pool and manage it. – pwnyexpress Feb 21 '14 at 19:42
  • Calling the async get will register an IO Completion Port and continue the loop. Whenever the web request has been processed, the IOCP is signaled - and now a thread from the threadpool is used to execute the callback. - I really doubt that the callback (a simple `Console.WriteLine` in my case) would take up more than a few threads. – ebb Feb 21 '14 at 19:54
  • You consume a thread for every web request. There would be no other way for the request to get executed unless it consumed some CPU resource. Also, what thread is then passing the message back to the calling thread that the execution has finished and the callback can be fired? – pwnyexpress Feb 21 '14 at 20:04
  • @ebb: There are other things going on, as well. DNS resolution, for example, can take significant time. Sometimes a few milliseconds, sometimes a full second or more. – Jim Mischel Feb 21 '14 at 21:09
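To separate DNS time from request time, the hosts can be resolved up front before any requests are made. A rough sketch, assuming the same `uris` sequence from the question and a `Stopwatch` (System.Diagnostics) for timing:

var hosts = uris.Select(u => u.Host).Distinct();

foreach (var host in hosts)
{
    var sw = Stopwatch.StartNew();
    var addresses = Dns.GetHostAddresses(host);   // synchronous lookup, just to measure it
    Console.WriteLine("{0}: {1} address(es) in {2} ms", host, addresses.Length, sw.ElapsedMilliseconds);
}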