13

I'm building a program that navigates to several websites and does something with each page.

After navigating to about five URLs successfully, the program hangs at the Application.Run() line. It never enters the Handler function; it just gets stuck, and CPU usage is 0 at that point.

I tried closing the threads in every way I could think of. What am I doing wrong?

This is how I'm doing it:

[STAThread]
private static void Main(string[] args) 
{
    for (int i = 0; i < urls.Count; i++) 
    {
        var th = new Thread(() =>
        {
            var weBrowser = new WebBrowser();
            weBrowser.AllowNavigation = true;
            weBrowser.DocumentCompleted += Handler;
            weBrowser.Navigate(urls[i]);
            Application.Run();
        });
        th.SetApartmentState(ApartmentState.STA);
        th.Start();
        th.Join();
    }
}

And my Handler function is:

private static void Handler(object sender, WebBrowserDocumentCompletedEventArgs e) 
{
    WebBrowser weBrowser = sender as WebBrowser;
    var htmlDocument = weBrowser.Document;

    /*do something*/

    Application.Exit();
    Application.ExitThread();

    weBrowser.Dispose();
    weBrowser.Stop();

    Thread.CurrentThread.Abort();
}

My problem is very similar to this one: Application.Run() leads to application hanging

That question has no answer either.

Thanks!

Community
BestR
  • My approach to this, using `Task` API and `async`/`await`: http://stackoverflow.com/a/22262976/1768303 – noseratio May 19 '15 at 23:19
  • What are you actually trying to DO? I haven't used WebBrowser before, but a quick glance over the docs suggests that it's basically a full-featured rendering and control engine, and if you don't *need* that much power, you shouldn't be using it. If all you're trying to do is grab source text from a list of pages, the answer can be as simple as throwing `WebRequest` into a `Parallel.For`/`Parallel.ForEach` loop... – Diosjenin May 20 '15 at 16:38

4 Answers

9

I think you are making several mistakes:

  • you are joining inside the for loop
  • you are calling Application.Exit() in each handler call

You should move the Join calls outside the for loop and not call Application.Exit().

The following sample seems to work well:

static class Program
{
  [STAThread]
  static void Main()
  {
     var urls = new List<string>() { 
        "http://stackoverflow.com",
        "http://stackoverflow.com",
        "http://stackoverflow.com",
        "http://stackoverflow.com",
        "http://stackoverflow.com",
        "http://stackoverflow.com",
        "http://stackoverflow.com",
        "http://stackoverflow.com"};

     var threads = new Thread[urls.Count];

     for (int i = 0; i < urls.Count; i++)
     {
        threads[i] = new Thread((url) =>
        {
           var weBrowser = new WebBrowser();
           weBrowser.AllowNavigation = true;
           weBrowser.DocumentCompleted += Handler;
           weBrowser.Navigate(url as string);
           Application.Run();
        });
        threads[i].SetApartmentState(ApartmentState.STA);
        threads[i].Start(urls[i]);
     }

     foreach (var t in threads)
        t.Join();

     Application.EnableVisualStyles();
     Application.SetCompatibleTextRenderingDefault(false);
     Application.Run(new Form1());
  }

  private static void Handler(object sender, WebBrowserDocumentCompletedEventArgs e)
  {
     WebBrowser weBrowser = sender as WebBrowser;

     var htmlDocument = weBrowser.Document;

     /*do something*/

     Application.ExitThread();

     // Stop before Dispose: a disposed browser can no longer be stopped.
     weBrowser.Stop();
     weBrowser.Dispose();
  }
}
JJS
Marius Bancila
  • Thanks for your answer. The program now runs on more sites than before, but after 25-30 sites it stops like before. Do you have a solution for that? – BestR May 08 '15 at 12:02
  • Can you explain what you are actually trying to do? – Marius Bancila May 08 '15 at 12:39
  • I'm pulling data: foreach (HtmlElement el in htmlDocument.GetElementsByTagName("div")) {....} But it doesn't really matter. Maybe there is a max thread limit? – BestR May 08 '15 at 12:52
  • No, I mean, why are you doing this in `Main`? What are you trying to accomplish? – Marius Bancila May 08 '15 at 13:15
  • I'm writing a console application that pulls data from some URLs, so it starts in the Main function. – BestR May 08 '15 at 13:39
  • I don't have .NET 4.5, but I think it can work. Still, I'm very interested in why the program just gets STUCK at Application.Run(): the CPU usage goes to 0 and there is no progress. It's very weird. What could be the reason? – BestR May 09 '15 at 06:05
0

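Using urls[i] inside the lambda in your original snippet is wrong: the lambda captures the variable i itself, not the value it had when the lambda was created. Search the C# documentation for closures. You need to copy the value into a local variable before using it.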

Furthermore, you should swap weBrowser.Dispose() and weBrowser.Stop(). You can't stop the disposed browser anymore (if Stop is necessary at all).

Finally, don't abort the thread; it will finish by itself.
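
The closure point above can be illustrated with a minimal sketch. The class and method names (`ClosureDemo`, `CaptureLoopVariable`, `CaptureLocalCopy`) are made up for this example and do not come from the question. It only shows how a `for` loop variable behaves when captured by a lambda; in the original snippet the effect is masked because `th.Join()` runs inside the loop, but it becomes a real bug as soon as the threads run concurrently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class, not part of the original question.
public static class ClosureDemo
{
    // Every lambda captures the same variable i, so all of them
    // observe the value i has when they are finally invoked.
    public static List<int> CaptureLoopVariable(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
            actions.Add(() => i);

        var results = new List<int>();
        foreach (var action in actions)
            results.Add(action());
        return results;
    }

    // Each iteration declares a fresh local, so each lambda
    // captures its own copy of the value.
    public static List<int> CaptureLocalCopy(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            actions.Add(() => copy);
        }

        var results = new List<int>();
        foreach (var action in actions)
            results.Add(action());
        return results;
    }
}
```

With `count` set to 3, `CaptureLoopVariable` yields 3, 3, 3 while `CaptureLocalCopy` yields 0, 1, 2. Passing the URL as the thread's start argument, as in `Start(urls[i])`, avoids the capture altogether.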

Rahul Gupta
JeffRSon
  • I tried everything to close the thread, with or without weBrowser.*, and it didn't work. If I don't abort the thread it doesn't close itself (the program waits at Join). Do you know other ways to close threads? – BestR May 17 '15 at 18:34
  • Set a breakpoint after Application.Run() - you will see that it is triggered after Application.ExitThread is executed. If you abort the thread, weBrowser.Dispose is not called. – JeffRSon May 18 '15 at 07:50
  • The problem is that the program hangs at Application.Run() after a few threads have been created. The threads are closing, but at some point new ones aren't created. – BestR May 18 '15 at 16:36
  • Please test the breakpoint with only one thread. You should see how the thread is finished after that. Also, confirm that weBrowser.Dispose is executed. Finally, post a minimal, complete example that demonstrates the problem. BTW, what do you mean by "the program hangs at Application.Run" - Application.Run is always blocking... – JeffRSon May 18 '15 at 19:30
0

You may be running into the maximum number of concurrent connections for the WebBrowser. By explicitly setting this to a higher number, you can have additional streams reading through the browser at once.

// Example Usage:
ServicePointManager.DefaultConnectionLimit = 10;

Keep in mind that there is a performance hit from increasing this number above the default (2 for non-ASP.NET applications on .NET Framework), as you will have much more network traffic that needs to be processed.

See the MSDN article for ConnectionLimit for more information.

Martin Noreke
0

I don't understand what you would like to achieve with Application.Run() inside a for loop.

Why are you using the WebBrowser component? If you are just parsing a web page, it is better to use

// Requires: using System.IO; using System.Net; using System.Text;
string urlAddress = "http://stackoverflow.com";
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlAddress);

// The using blocks close the response and reader even if reading throws.
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
{
    if (response.StatusCode == HttpStatusCode.OK)
    {
        // Fall back to UTF-8 (StreamReader's default) when no charset is reported.
        Encoding encoding = string.IsNullOrEmpty(response.CharacterSet)
            ? Encoding.UTF8
            : Encoding.GetEncoding(response.CharacterSet);

        using (StreamReader reader = new StreamReader(response.GetResponseStream(), encoding))
        {
            string data = reader.ReadToEnd();
        }
    }
}

or

using (WebClient client = new WebClient())
{
    string html = client.DownloadString("http://stackoverflow.com");
}

For parsing HTML, look at Html Agility Pack or something similar.

If this is a console application, you don't need to call Application.Run() at all; otherwise, you should consider showing a splash screen with progress to give the user some feedback.
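
For a console application that only needs the page source, the WebClient suggestion can be combined with the `Parallel.ForEach` idea from the comments. The following is only a sketch under assumptions: `PageFetcher`, `FetchAll`, `DownloadWithWebClient` and the `maxParallel` parameter are hypothetical names introduced here, and the download step is passed in as a delegate so it can be swapped out. It relies on the Task Parallel Library, which is available from .NET 4.0.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original answer.
public static class PageFetcher
{
    // Downloads every URL with at most maxParallel concurrent calls and
    // returns a url -> content map. Duplicate URLs overwrite each other.
    public static IDictionary<string, string> FetchAll(
        IEnumerable<string> urls,
        Func<string, string> download,
        int maxParallel)
    {
        var pages = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        // Blocks until all downloads finish; failures surface as AggregateException.
        Parallel.ForEach(urls, options, url =>
        {
            pages[url] = download(url);
        });

        return pages;
    }

    // One possible download delegate, using the WebClient call shown above.
    public static string DownloadWithWebClient(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }
}
```

A call such as `PageFetcher.FetchAll(urls, PageFetcher.DownloadWithWebClient, 4)` would fetch the list without any message loop, so there is no Application.Run() and no STA thread to manage. This does not execute page scripts, so it only fits if the data is present in the raw HTML.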

Kodre