
We have an application (a meta-search engine) that must make 50 - 250 outbound HTTP connections in response to a user action, frequently.

The way we do this is by creating a bunch of HttpWebRequests and running them asynchronously using Action.BeginInvoke. This obviously uses the ThreadPool to launch the web requests, which run synchronously on their own thread. Note that it is currently this way as this was originally a .NET 2.0 app and there was no TPL to speak of.

What we see using ETW (our event sources combined with the .NET Framework and kernel ones) and NetMon is that while the thread pool can start 200 threads running our code in about 300ms (so, no thread pool exhaustion issues here), it takes a variable amount of time, sometimes up to 10 - 15 seconds, for the Windows kernel to make all the TCP connections that have been queued up.

This is very obvious in NetMon - you see around 60 - 100 TCP connections open (SYN) immediately (the number varies, but it's never more than around 120), then the rest trickle in over a period of time. It's as if the connections are being queued somewhere, but I don't know where and I don't know how to tune this so we can perform more concurrent outgoing connections. The Perfmon Outbound Connection Queue counter stays at 0, but in the Connections Established counter you can see an initial spike of connections, then a gradual increase as the rest filter through.

It does appear that latency to the endpoints we are connecting to plays a part, as running the code close to those endpoints doesn't show the problem as significantly.

I've taken comprehensive ETW traces but there is no decent documentation on many of the Microsoft providers, which would be a help I'm sure.

Any advice on working around this, or on tuning Windows for a large number of outgoing connections, would be great. The platform is Win7 (dev) and Win2k8R2 (prod).

danielgo
  • Could it be anything related to this? http://redmad.com/windows/iis-6-increase-outbound-tcp-connections/ Or maybe: http://stackoverflow.com/questions/1536550/maximum-number-of-concurrent-tcp-ip-connections-win-xp-sp3 – Steve May 21 '13 at 01:22
  • No, we're not running out of ephemeral ports - we're not really creating that many connections, but they are slow to get going. And NetMon and ETW are both showing that the later connections aren't re-using ports but are still getting their own, new ports. That said, I am investigating the TCB settings as there is some stuff around that in the ETW traces. – danielgo May 21 '13 at 01:39

1 Answer


It looks like slow DNS queries are the culprit here. Looking at the ETW provider "Microsoft-Windows-Networking-Correlation", I can trace the network call from inception to connection and note that many connections are taking > 1 second at the DNS resolver (Microsoft-Windows-RPC).
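A quick way to cross-check this outside of ETW is to time the name lookups on their own, separately from the TCP connect. The following is only a rough sketch in Python rather than the .NET code from the question; `time_lookup`, `slow_lookups`, the 1-second threshold and the worker count are all made-up names and values for illustration, and the `resolve` parameter exists only so a different resolver can be swapped in.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor


def time_lookup(host, resolve=socket.getaddrinfo):
    """Time a single name lookup; returns (host, seconds, error_or_None)."""
    start = time.perf_counter()
    error = None
    try:
        # Port 80 is an arbitrary choice; only the name resolution matters here.
        resolve(host, 80)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = exc
    return host, time.perf_counter() - start, error


def slow_lookups(hosts, threshold=1.0, workers=50, resolve=socket.getaddrinfo):
    """Resolve all hosts concurrently and return those slower than threshold."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda h: time_lookup(h, resolve), hosts))
    return [(host, secs) for host, secs, _ in results if secs > threshold]
```

Calling `slow_lookups` with the same list of hostnames the application fans out to should show whether the resolver alone accounts for the multi-second delays, assuming the machine uses the same DNS server as the application.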

It appears our local DNS server is slow or can't handle the load we are throwing at it, and it isn't caching aggressively. Production didn't show symptoms as severe because the prod DNS servers do everything right.
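If the DNS server itself can't be fixed, one possible workaround is to cache resolved addresses in the application so that repeated fan-outs don't hit the resolver for every request. This is a minimal sketch in Python, not something taken from our code: `DnsCache`, its fixed 60-second TTL and the injectable `resolve`/`clock` parameters are assumptions for illustration, and a fixed TTL ignores the TTL the DNS record actually carries.

```python
import socket
import threading
import time


class DnsCache:
    """Tiny in-process hostname cache with a fixed TTL (an assumed policy)."""

    def __init__(self, ttl=60.0, resolve=socket.gethostbyname, clock=time.monotonic):
        self._ttl = ttl
        self._resolve = resolve
        self._clock = clock
        self._entries = {}  # host -> (address, expiry time)
        self._lock = threading.Lock()

    def lookup(self, host):
        now = self._clock()
        with self._lock:
            entry = self._entries.get(host)
            if entry is not None and entry[1] > now:
                return entry[0]
        # Resolve outside the lock so one slow lookup doesn't block the others.
        # Concurrent misses for the same host may each resolve once.
        address = self._resolve(host)
        with self._lock:
            self._entries[host] = (address, now + self._ttl)
        return address
```

Warming the cache with the known endpoint hostnames before the first user action would keep the lookups off the request path, at the cost of possibly serving a stale address until the entry expires.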

danielgo