Given the following settings:
ServicePointManager.DefaultConnectionLimit = 24;
And the following code:
public static async Task<HttpWebResponse> GetResponseAsync(this Uri uri, bool autoRedirect)
{
    var request = (HttpWebRequest)WebRequest.Create(uri);
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.107 Safari/537.36";
    request.AllowAutoRedirect = autoRedirect;
    request.Timeout = -1;
    request.ReadWriteTimeout = -1;
    var response = await request.GetResponseAsync();
    return (HttpWebResponse)response;
}

public static async Task<PageInfo> GetPageAsync(Uri uri)
{
    using (var response = await uri.GetResponseAsync(false))
    {
        using (var responseStream = response.GetResponseStream())
        {
            var pageInfo = new PageInfo();
            using (var reader = new StreamReader(responseStream))
            {
                try
                {
                    pageInfo.HTML = await reader.ReadToEndAsync();
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.ToString());
                }
                return pageInfo;
            }
        }
    }
}
With 15-20 concurrent web requests, this setup throws the following exception after 1,000 requests:
Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host
The exception is thrown at the line pageInfo.HTML = await reader.ReadToEndAsync().
I've tried firing up Fiddler and inspecting the status code and headers for the URL on which the stream read throws. As expected, it's a different URL each time, and all of them return either 301 or 200. So I believe I can rule out a failure on the host's side.
Setting ServicePointManager.DefaultConnectionLimit to a lower value helps, for some reason. So does changing the line await reader.ReadToEndAsync() to reader.ReadToEnd().
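For reference, here is a minimal sketch of the synchronous-read variant that avoids the exception for me. The class name PageFetcher, the method name GetPageSyncReadAsync and the PageInfo stand-in are only there to make the snippet self-contained; they are not the names in my real code, and the UserAgent and timeout settings are left out for brevity.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;

// Hypothetical stand-in for the PageInfo type used in the question.
public class PageInfo
{
    public string HTML { get; set; }
}

public static class PageFetcher
{
    // Same flow as GetPageAsync, but the body is read with a blocking call.
    public static async Task<PageInfo> GetPageSyncReadAsync(Uri uri)
    {
        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.AllowAutoRedirect = false;

        // The request itself is still awaited asynchronously.
        using (var response = (HttpWebResponse)await request.GetResponseAsync())
        using (var responseStream = response.GetResponseStream())
        using (var reader = new StreamReader(responseStream))
        {
            // The only change: ReadToEnd() instead of await ReadToEndAsync().
            return new PageInfo { HTML = reader.ReadToEnd() };
        }
    }
}
```

The obvious downside is that the read now blocks a thread while the body is downloaded, so I would prefer to understand why the asynchronous read fails rather than keep this as the fix.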
It seems that some kind of timeout kicks in and closes the stream before the data is read. This would also explain why lowering DefaultConnectionLimit has an impact. This is at best a wild guess, and even if it's true I do not see how to change that timeout. I've set both Timeout and ReadWriteTimeout for the WebRequest (see the GetResponseAsync extension method above).
Any suggestions/hints are greatly appreciated.