1

How do you increase the timeout value for HtmlAgilityPack? I'm getting this timeout error a lot and want to increase the timeout limit. Alternatively, how do you kill the request and try again?

resultingHTML = null;
try
{
    string htmlstring = string.Empty;
    HttpWebRequest newwebRequest = (HttpWebRequest)WebRequest.Create(htmlURL);
    HttpWebResponse mywebResponce = (HttpWebResponse)newwebRequest.GetResponse();
    if (mywebResponce.StatusCode == HttpStatusCode.OK)
    {
        Stream ReceiveStream = mywebResponce.GetResponseStream();
        using (StreamReader reader = new StreamReader(ReceiveStream))
        {
            htmlstring = reader.ReadToEnd();
        }
        HtmlDocument doc = new HtmlDocument();
        doc.Load(htmlstring);
        HtmlWeb hwObject = new HtmlWeb();
        HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");
        resultingHTML = body.InnerHtml.ToString();
    }
}

2 Answers

3

I assume you're using HtmlAgilityPack to read HTML via a web request here?

I would advise using the framework's WebRequest object instead:

http://msdn.microsoft.com/en-us/library/system.net.webrequest.getresponse.aspx#Y700

There you can specify a timeout. You can catch timeouts (and other connection errors) by wrapping the call in a try/catch block.

Then parse the resulting HTML from the WebResponse object with HtmlAgilityPack directly.

Here is an example of how to get the HTML from the WebResponse:

http://msdn.microsoft.com/en-us/library/system.net.webresponse.getresponsestream.aspx

Once you have the html as a string from the WebResponse you would:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
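
To cover the "kill the request and try again" part of the question, here is a minimal sketch of setting a timeout and retrying when it expires. The names HtmlFetcher, DownloadHtml, and WithRetry are hypothetical helpers made up for illustration, not part of HtmlAgilityPack or the framework. It assumes a timeout in GetResponse() surfaces as a WebException whose Status is WebExceptionStatus.Timeout; any other failure is rethrown immediately.

```csharp
using System;
using System.IO;
using System.Net;

static class HtmlFetcher
{
    // Hypothetical helper: downloads the response body as a string.
    // GetResponse() throws a WebException if the timeout expires.
    public static string DownloadHtml(string url, int timeoutMs)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = timeoutMs; // limit for GetResponse(), in milliseconds

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical helper: calls fetch at least once and at most maxAttempts
    // times, retrying only when the failure is a timeout.
    public static string WithRetry(Func<string> fetch, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return fetch();
            }
            catch (WebException ex)
            {
                bool isTimeout = ex.Status == WebExceptionStatus.Timeout;
                if (!isTimeout || attempt >= maxAttempts)
                {
                    throw; // not a timeout, or out of attempts
                }
                // Timed out with attempts left: loop and issue a fresh request.
            }
        }
    }
}
```

A call such as HtmlFetcher.WithRetry(() => HtmlFetcher.DownloadHtml(htmlURL, 10000), 3) would then return the HTML string to pass to doc.LoadHtml. Each attempt creates a new HttpWebRequest, because a request object cannot be reused once GetResponse() has been called on it.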

James Gaunt
  • Correct. Do you have an example of how to load it into HtmlAgilityPack? –  Aug 02 '11 at 17:16
  • Isn't that loading the web page twice? Or am I not looking at it the right way? Once you get the response, you have to load it again. –  Aug 02 '11 at 17:23
  • Yeah, do you have a full example? See my edit; I think it's wrong. –  Aug 02 '11 at 17:39
  • The .LoadHtml takes a string with the actual html, not a link to the page. – James Gaunt Aug 02 '11 at 17:58
0
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create("http://www.someurl.com");
httpWebRequest.Timeout = 10000; // 10 second timeout
using (HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse())
{
    if (httpWebResponse.StatusCode == HttpStatusCode.OK)
    {
        using (Stream responseStream = httpWebResponse.GetResponseStream())
        using (StreamReader reader = new StreamReader(responseStream))
        {
            var htmlstring = reader.ReadToEnd();
            HtmlDocument doc = new HtmlDocument();
            // LoadHtml parses an HTML string; Load expects a file path
            doc.LoadHtml(htmlstring);
        }
    }
}

I would also look at "Adjusting HttpWebRequest Connection Timeout in C#" to understand the difference between Timeout and ReadWriteTimeout on the HttpWebRequest class.
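
As a rough illustration of that difference: Timeout limits how long GetResponse() (and GetRequestStream()) may take, while ReadWriteTimeout limits each read or write on the stream you get back. Both are in milliseconds. The sketch below sets them separately; TimeoutConfig and CreateRequest are hypothetical names used only for this example, and the values in the usage note are arbitrary.

```csharp
using System;
using System.Net;

static class TimeoutConfig
{
    // Hypothetical helper: builds a request with both timeouts set explicitly.
    // Creating the request does not open a connection yet.
    public static HttpWebRequest CreateRequest(string url, int responseTimeoutMs, int readWriteTimeoutMs)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);

        // How long GetResponse() may wait before throwing a WebException.
        request.Timeout = responseTimeoutMs;

        // How long a single read from (or write to) the stream may block.
        request.ReadWriteTimeout = readWriteTimeoutMs;

        return request;
    }
}
```

For example, TimeoutConfig.CreateRequest("http://www.someurl.com", 10000, 30000) gives the server 10 seconds to start responding and allows up to 30 seconds for each read of the body, which can help with pages that respond quickly but stream slowly.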

ElvisLives
  • I tried your answer and the program throws the exception 'Illegal characters in path'. I used the URL http://www.betstudy.com/predictions/germany/bundesliga/ – Erik Hakobyan Sep 22 '17 at 16:50