I have the following WebBrowser code, which has worked fine for loading 1000+ pages from another domain to extract data.

objWebBrowser = new WebBrowser();

// Disable any warning/error prompts
objWebBrowser.ScriptErrorsSuppressed = true;

objWebBrowser.Navigate(new Uri(strFullSourceParamUrl));
while (objWebBrowser.ReadyState != WebBrowserReadyState.Complete)
{
    Application.DoEvents();

    Thread.Sleep(50);
}
objWebBrowser.DocumentCompleted += Wb_DocumentCompleted;

var webScrapperResult = objWebBrowser.Document.GetElementsByTagName("HTML")[0].OuterHtml;

Now I'm trying to load a page from another domain: https://nutritiondata.self.com/facts/vegetables-and-vegetable-products/1/0. Last week it loaded fine, but for the past few days the ReadyState has always been "Interactive", no matter how long the loop runs. I tried setting ScriptErrorsSuppressed to false, and a few dialog boxes appeared indicating that some JS files failed to load. I clicked Yes on all of them, and then the program just kept loading without ever hitting the debugger.

Any advice on how to resolve this type of issue?

Koo SengSeng
  • You need to change a lot of things. First, you have to enable the Emulation and GpuRendering features: [How can I get the WebBrowser control to show modern contents?](https://stackoverflow.com/a/38514446/7444103). Then, you have to read and follow the pattern described here: [How to get an HtmlElement value inside Frames/IFrames?](https://stackoverflow.com/a/53218064/7444103). Then, use the WebBrowser.Navigate() overload that lets you set headers, where you specify a different type of browser, e.g., `User-Agent: Mozilla/5.0 (Windows NT 10; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0`. – Jimi Jan 24 '21 at 05:26
  • Now the web page will open and you can navigate to it. BUT, since you don't have cookies that register the privacy agreement, a dialog is shown, waiting for user intervention. You can use standard means to accept it, in code. This happens with other web sites, so build a procedure that can interact with that very useful *This Site uses Cookies. Will You Accept The Cookies?* dialog (you can parse the Html anyway). -- You HAVE to get rid of that procedure based on `Application.DoEvents();`, that's very bad. -- There's a reason for setting a not so recent browser version as the `User-Agent`. – Jimi Jan 24 '21 at 05:34
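
The suggestions in the comments above (browser emulation, a custom `User-Agent` header, and an event-based wait instead of `Application.DoEvents()`) could be sketched roughly as follows. This is only a minimal sketch under assumptions, not the code from the linked answers: the class name `WebBrowserSketch`, the method names `EnableIe11Emulation` and `LoadHtmlAsync`, the emulation value `11001`, and the timeout parameter are all made up for illustration. The GpuRendering feature and the cookie-consent dialog are not handled here.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using System.Windows.Forms;
using Microsoft.Win32;

// Hypothetical helper class; the names are illustrative, not from the question.
static class WebBrowserSketch
{
    // Opts the current executable into IE11 rendering for the WebBrowser control.
    // Assumption: call this once at startup, before any WebBrowser is created.
    public static void EnableIe11Emulation()
    {
        const string keyPath =
            @"Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION";
        string exeName = Path.GetFileName(Application.ExecutablePath);
        using (RegistryKey key = Registry.CurrentUser.CreateSubKey(keyPath))
        {
            // 11001 = IE11 edge mode (assumed to be what the page needs)
            key.SetValue(exeName, 11001, RegistryValueKind.DWord);
        }
    }

    // Navigates with a custom User-Agent and waits for DocumentCompleted
    // instead of polling ReadyState in a DoEvents loop.
    // Assumption: called on the UI thread of a running WinForms message loop.
    public static async Task<string> LoadHtmlAsync(WebBrowser browser, Uri url, TimeSpan timeout)
    {
        var completed = new TaskCompletionSource<bool>();

        WebBrowserDocumentCompletedEventHandler handler = (sender, e) =>
        {
            // DocumentCompleted can fire once per frame; only signal when
            // the control as a whole reports Complete.
            if (browser.ReadyState == WebBrowserReadyState.Complete)
            {
                completed.TrySetResult(true);
            }
        };

        // Subscribe BEFORE navigating so the event cannot be missed.
        browser.DocumentCompleted += handler;
        try
        {
            // User-Agent string taken from the comment above.
            const string headers =
                "User-Agent: Mozilla/5.0 (Windows NT 10; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0\r\n";
            browser.Navigate(url, string.Empty, null, headers);

            // Give up waiting after the timeout so a page stuck in
            // "Interactive" does not block forever.
            await Task.WhenAny(completed.Task, Task.Delay(timeout));

            // On timeout this returns whatever has loaded so far (or null).
            return browser.Document?.GetElementsByTagName("HTML")[0].OuterHtml;
        }
        finally
        {
            browser.DocumentCompleted -= handler;
        }
    }
}
```

From an `async` event handler on the form, this might be used as `var html = await WebBrowserSketch.LoadHtmlAsync(objWebBrowser, new Uri(strFullSourceParamUrl), TimeSpan.FromSeconds(30));`, where the 30-second timeout is an arbitrary choice.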
