
We have an asp.net webapi application that needs to issue a lot of calls to other web applications (it's basically a reverse proxy). To do this we use the async methods of the HttpClient.
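
In essence, each forwarded request boils down to something like the sketch below (simplified; the backend URI, class and method names are placeholders, not our actual code):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class ProxyClient
{
    // One shared HttpClient instance for the whole application (see next paragraph).
    private static readonly HttpClient Client = new HttpClient();

    public static async Task<HttpResponseMessage> ForwardAsync(HttpRequestMessage incoming)
    {
        // Rebuild the request against the backend and send it fully asynchronously.
        var outgoing = new HttpRequestMessage(incoming.Method,
            new Uri("https://backend.example.com" + incoming.RequestUri.PathAndQuery))
        {
            Content = incoming.Content
        };
        return await Client.SendAsync(outgoing);
    }
}
```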

Yes, we have seen the hints about using only one HttpClient instance and not to dispose of it.

Yes, we have seen the hints about setting configuration values, especially the problem with the lease timeout. Currently we set ConnectionLimit = CPU*12, ConnectionLeaseTimeout = 5min and MaxIdleTime = 30s.
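
Roughly, applying those values looks like this (a simplified sketch, not our exact code; the backend URI is a placeholder, and ConnectionLeaseTimeout / MaxIdleTime are milliseconds on the ServicePoint):

```csharp
using System;
using System.Net;

public static class BackendConnectionConfig
{
    public static void Apply()
    {
        var sp = ServicePointManager.FindServicePoint(new Uri("https://backend.example.com"));
        sp.ConnectionLimit = Environment.ProcessorCount * 12;                       // CPU * 12
        sp.ConnectionLeaseTimeout = (int)TimeSpan.FromMinutes(5).TotalMilliseconds; // 5 min
        sp.MaxIdleTime = (int)TimeSpan.FromSeconds(30).TotalMilliseconds;           // 30 s
    }
}
```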

We can see that the connections behave as desired. The throughput in a load test was also very good. However, we are facing issues where occasionally the connections stop working. It seems to happen when a lot of requests come in (and, this being a reverse proxy, cause new requests to be issued), and it happens mostly (but not only) with the slowest of all backend applications. The behaviour is then that requests to this endpoint take forever to finish or simply end in a timeout.

An IISReset of the server hosting our reverse proxy application makes the problems go away (for a while).

We have already investigated several areas:

  • Performance issues of the remote web application: Although the symptoms look exactly like that, performance is good when the same requests are issued locally on the remote server. The values for CPU, network etc. are also low.
  • Network issues (bandwidth, router, firewall, load balancers): Possible but rather unlikely, since everything else runs stably and our hoster is involved in the analysis too.
  • Threadpool starvation: Not impossible, but rather theoretical - sure, we issue a lot of calls, but shouldn't making them async help with exactly this issue?
  • HttpCompletionOption.ResponseHeadersRead: Not a problem by itself but maybe one piece of the puzzle? (See the sketch after this list.)
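
One way ResponseHeadersRead could contribute (an assumption on our side, not something we have verified): the connection stays reserved until the response body is consumed or the response is disposed, so a response that is never fully read keeps its connection busy, and with a small ConnectionLimit the pool could run dry. A minimal sketch of how we understand it is supposed to be used (client, request and output stream are placeholders):

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class StreamingForward
{
    public static async Task CopyBodyAsync(HttpClient client, HttpRequestMessage request, Stream outputStream)
    {
        // With ResponseHeadersRead the call returns as soon as the headers arrive;
        // the connection is only released once the body is consumed and the
        // response is disposed.
        using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
        {
            await response.Content.CopyToAsync(outputStream);
        }
    }
}
```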

The best explanation so far focuses on the ConnectionLimit: We started setting the values mentioned above only recently, and this seems to have triggered the problems. But why would it? Shouldn't it be an improvement to reuse connections instead of opening a new one for every request? And the values we set seem rather conservative.

We have started to experiment with these values lately to see their impact in production. Yet it is still unclear to us whether this is the only cause, and we would appreciate a more straightforward approach for analysis. Unfortunately, a memory dump and netstat printouts did not help any further.

Some suggestions about how to analyze or hints about possible causes would be highly appreciated.

***** EDIT *****

Setting the connection limit to 1000 solves the issue! So the question remains: why is that the case? From what we know, the default connection limit is 2 in a non-web application and 1000 in a web application. MS suggests a default value of CPU*12 (but they didn't implement it like that?!), so our change was basically to go from 1000 down to 48. Still, we can see that only a handful of connections are open. Can anyone shed some light on this? What is the exact behaviour regarding opening new connections, reusing existing ones, pipelining etc.? Is there any source of information for this?
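
For reference, the effective limit and the number of open connections can be inspected on the ServicePoint along these lines (a diagnostic sketch; the URI is a placeholder):

```csharp
using System;
using System.Net;

public static class ServicePointDiagnostics
{
    // Compare the effective limit with the number of connections actually open.
    public static void Dump()
    {
        var sp = ServicePointManager.FindServicePoint(new Uri("https://backend.example.com"));
        Console.WriteLine($"ConnectionLimit={sp.ConnectionLimit}, CurrentConnections={sp.CurrentConnections}, MaxIdleTime={sp.MaxIdleTime} ms");
    }
}
```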

srudin
  • You have "seen the hints of using only one HttpClient". Does this mean you are now using only one HttpClient, or did you read about it and decide you didn't need to do it? Threadpool starvation is not only theoretical when you have many requests. Are your API methods also async, or do you use .Result a lot? – TheHvidsten Oct 10 '18 at 12:07
  • We follow the recommendations and therefore have only 1 static HttpClient instance. – srudin Oct 11 '18 at 03:41
  • All of our calls are "pure" async, meaning that we use async/await only and never .Result, .Wait or the like. We therefore expect that a call to HttpClient.SendAsync releases the running thread back to the threadpool, so starvation is unlikely to occur. In fact, we have seen starvation happening when both the reverse proxy and the remote application were hosted under the same AppPool (dev machine). Hosting them in 2 AppPools or even on 2 servers (as is the case for the situation described here) has not shown the same behaviour so far and, according to our understanding, is not expected to. – srudin Oct 11 '18 at 03:47
  • Just a thought: maybe you could have one client for dealing with calls to the slowest system and use it in a sequential manner? It might be less likely to tie up resources for the calls to other "better behaving" systems? – mortb Oct 11 '18 at 10:19

2 Answers


By ConnectionLimit, do you mean ServicePointManager.DefaultConnectionLimit? Yes, it matters. When the value is X and there are already X requests waiting for a response, a new request will not be sent until one of the previous requests finishes.
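
A quick console experiment along these lines should show the effect (a sketch; the slow endpoint URL is a placeholder for a backend that delays its response by a few seconds):

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class ConnectionLimitRepro
{
    static async Task Main()
    {
        // With a limit of 2, the completion timestamps should arrive in batches
        // of two, roughly one server delay apart, because the remaining requests
        // queue until a connection becomes free.
        ServicePointManager.DefaultConnectionLimit = 2;
        var client = new HttpClient();

        var tasks = Enumerable.Range(0, 6).Select(async i =>
        {
            var response = await client.GetAsync("https://slow-backend.example.com/delay/3");
            Console.WriteLine($"{DateTime.Now:HH:mm:ss.fff} finished request {i} ({response.StatusCode})");
        });

        await Task.WhenAll(tasks);
    }
}
```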

skyoxZ
  • Well, there is SPM.DefaultConnectionLimit and SP.ConnectionLimit, the latter overriding the first. We set the latter only. – srudin Oct 11 '18 at 03:49
  • "new request will not be sent until any previous request is finished" We're not too sure about that - We think it depends on whether pipelining is on or off, right? We are not sending KeepAlive with our requests so according to the spec the default should be true. In fact we can see in netstat that when setting the connection limit to 1 there is only one port being opened and all the traffic goes over this one. Obviously the requests need to be sent sequentially but we assume they are sent before the response of the previous request is received. Can a deadlock / congestion happen here? – srudin Oct 11 '18 at 04:00
  • @srudin Build a web api: 1. Record the timestamp when the request is received. 2. Sleep 3 seconds. 3. Log the timestamps of receiving the request and sending the response. 4. Send the response. Then test a .NET client visiting that web api with different ConnectionLimits. – skyoxZ Oct 11 '18 at 05:49
  • From our understanding this will show us whether pipelining works or not. This may be interesting but how does it help us solve our problem? In the end we don't care too much about pipelining but we care about the deadlock / congestion / starvation. – srudin Oct 11 '18 at 09:53
  • @srudin I don't think pipelining or connection reuse is related to the ConnectionLimit. ConnectionLimit is a limit imposed on the program (by the OS, maybe). My test should prove that `when ConnectionLimit is X, if there are already X requests waiting for a response, a new request will not be sent until any previous request is finished`. – skyoxZ Oct 11 '18 at 11:03

I posted a follow-up question here: How to disable pipelining for the .NET HttpClient

Unfortunately, there were no real answers to any of my questions. We ended up leaving the ConnectionLimit at 1000 (which is only a workaround, but the only solution we were able to find).
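
For completeness, the workaround essentially amounts to this (the backend URI is a placeholder; we set the limit on the ServicePoint, as mentioned in the comments above):

```csharp
using System;
using System.Net;

public static class ConnectionLimitWorkaround
{
    // Raise the per-ServicePoint limit back to 1000 so the connection pool
    // is no longer the bottleneck.
    public static void Apply()
    {
        ServicePointManager
            .FindServicePoint(new Uri("https://backend.example.com"))
            .ConnectionLimit = 1000;
    }
}
```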

srudin