I am having a problem with threading when using an external class. I am quite new to threading and many things about it are still a mystery to me, so please keep that in mind.

I did my research and found many topics about it. They seem pretty clear, but they still don't help me. Here is my code:

    public DownloadContent()
    {
        adres = @"...";

        wb = new WebBrowser();
        wb.Navigating += (object sender, WebBrowserNavigatingEventArgs e) => objWait.WaitOne();
        wb.DocumentCompleted += (object sender, WebBrowserDocumentCompletedEventArgs e) => objWait.Set(); //Here is the problem
        wb.DocumentCompleted += OnDocumentCompleted; 
        wb.Navigate(adres);

        MessageBox.Show("after"); //should print after OnDocumentCompleted
    }


    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        //some logic
    }

The problem is that the WebBrowser class uses a separate thread to navigate and complete the document. There is nothing wrong with that, but I don't know how my main thread is supposed to communicate with it. I tried to make the original thread wait, but the function that is supposed to wake it up again, objWait.Set(), is called by the main thread, which is currently frozen. I assume that's the real problem. I have tried many strange ways to make it work:

  • making another thread for wb.Navigate(...); It didn't work because it cannot work on single thread;
  • making separate thread for just objWait.Set(); Didn't work either, not sure why;
  • and some even weirder things.

I know it may be trivial for some, but I have been stuck on it for hours now and I really don't know what to do. I will be grateful for any help.

*******************************************EDIT*******************************************

Thank you, everyone, for your answers. I see that many people noticed what my original issue was and gave me advice that made my work easier, for which I am grateful. That being said, the point of this question was to find out whether there is a good way of dealing with the threading problem itself. Anyway, thank you for all your advice and directions. I will look into them more closely once I am finished with this little project (one thing at a time).

I guess I could just do something like this:

    public DownloadContent()
    {
        ...
        bool flag = true;
        wb.DocumentCompleted += OnDocumentCompleted; 
        wb.DocumentCompleted += (object sender, WebBrowserDocumentCompletedEventArgs e) => flag = false;

        wb.Navigate(adres);

        while(flag);
        MessageBox.Show("after"); //should print after OnDocumentCompleted
    }

But I don't know if this is considered a valid or elegant solution. I would be grateful for any thoughts on that. Thank you in advance.

Bielik
  • Are you trying to just download the HTML for a web page, or are you trying to render the page for printing or something like that? – Enigmativity Dec 15 '15 at 00:48
  • I want to get some information from the website. I was following the first approach from this guide: [link](http://www.codeproject.com/Tips/858775/Csharp-Website-HTML-Content-Parsing-or-How-To-Get). I was using wb because I need access to <p> elements. – Bielik Dec 15 '15 at 08:54
  • The article also shows the use of `WebClient` and "HtmlAgilityPack". That's the way to go if you need to access the HTML source code and extract data. The code is single-threaded and very fast if you go that way. – Enigmativity Dec 15 '15 at 09:01
  • WebBrowser cannot work on any thread, you have to create a special one. A thread that is STA and implements the STA contract. Which requires pumping a message loop (Application.Run) and must never block. Violating the STA contract causes deadlock and prevents WebBrowser from raising any events, the kind of trouble you've discovered. Once you have an STA thread then getting it to do stuff from your main thread becomes very easy. Sample code [is here](http://stackoverflow.com/a/21684059/17034). – Hans Passant Dec 15 '15 at 10:39
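
To illustrate the STA requirement described in the last comment, here is a minimal sketch, not the linked sample. It assumes a Windows Forms reference is available; the names `StaBrowserSketch` and `LoadDocumentTitle` are hypothetical and only serve the illustration. The browser is created on a dedicated STA thread that runs its own message loop, and the loop is ended once the first `DocumentCompleted` event arrives (pages with frames can raise the event more than once, which this sketch ignores).

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

static class StaBrowserSketch
{
    // Hypothetical helper: loads a page on a dedicated STA thread
    // and returns the document title once loading has completed.
    public static string LoadDocumentTitle(string url)
    {
        string title = null;

        var thread = new Thread(() =>
        {
            using (var wb = new WebBrowser())
            {
                wb.ScriptErrorsSuppressed = true;
                wb.DocumentCompleted += (sender, e) =>
                {
                    title = wb.DocumentTitle;
                    // Ends the message loop started by Application.Run() below.
                    Application.ExitThread();
                };
                wb.Navigate(url);
                // Pumps messages on this thread so WebBrowser can raise its events.
                Application.Run();
            }
        });

        // The apartment state has to be set before the thread is started.
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();

        // Blocks the caller until the browser thread has finished.
        thread.Join();
        return title;
    }
}
```

The important difference from the code in the question is that the thread owning the `WebBrowser` never waits on anything; only the calling thread blocks, in `Join()`. Calling this from a UI thread would still freeze that UI for the duration of the load.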

2 Answers

How about something like this:

    public DownloadContent()
    {
        AutoResetEvent ase = new AutoResetEvent(false);

        System.Threading.Tasks.Task.Factory.StartNew(() =>
        {
            adres = @"...";

            wb = new WebBrowser();

            wb.DocumentCompleted += (object sender, WebBrowserDocumentCompletedEventArgs e) => ase.Set(); //Here is the problem

            wb.Navigate(adres);
        });

        ase.WaitOne();
        MessageBox.Show("after"); //should print after OnDocumentCompleted
    }

We start the download in a different thread, and block the current thread until the document is complete.

Daniel James Bryars
  • It seems to have the same problem I had with my own approach. I don't know why I got the debug message in Polish, but I will try to translate it: ThreadStateException... "Cannot create appearance(?) of ActiveX format '8856f961-340a-11d0-a96b-00c04fd705a2' because current thread doesn't work in single thread apartment(?)". I placed (?) next to words I thought I may have translated wrongly. – Bielik Dec 15 '15 at 09:05
  • That error is saying that the "WebBrowser" needs to be created on a certain type of "thread". Threads come in different flavours (e.g. STA, MTA). As per the comments above, I think you are making this more complicated than you need to. Try asking a different question which describes what you want to achieve. – Daniel James Bryars Dec 15 '15 at 22:38

This isn't a direct answer to your question, but as per your comment saying that all you're after is extracting the text between a pair of <p> tags, you're making your job harder than it needs to be.

The far faster, easier, and single-threaded way to do what you want is to use WebClient with "HtmlAgilityPack". Here's how:

    using (var wc = new System.Net.WebClient())
    {
        var html = wc.DownloadString(@"http://www.microsoft.com");
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);
        var node = doc.DocumentNode.SelectSingleNode("/html/body/p");
        Console.WriteLine(node.InnerText);
    }

That currently produces this result:

Your current User-Agent string appears to be from an automated process, if this is incorrect, please click this link:United States English Microsoft Homepage
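
If several nodes are needed rather than one, a sketch along the same lines could use `SelectNodes`. It assumes the HtmlAgilityPack package is referenced; the class name `ParagraphDump` and the XPath `//p` (every `<p>` element in the document) are only illustrative choices, and the output depends entirely on the page that is downloaded.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

static class ParagraphDump
{
    static void Main()
    {
        using (var wc = new WebClient())
        {
            string html = wc.DownloadString(@"http://www.microsoft.com");

            var doc = new HtmlDocument();
            doc.LoadHtml(html);

            // SelectNodes returns null, not an empty collection, when nothing matches.
            HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//p");
            if (nodes == null)
            {
                Console.WriteLine("No <p> elements found.");
                return;
            }

            // Print the text of every matched paragraph.
            foreach (HtmlNode node in nodes)
            {
                Console.WriteLine(node.InnerText.Trim());
            }
        }
    }
}
```

The null check matters because a page without any match would otherwise throw a `NullReferenceException` in the loop.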

Enigmativity
  • Thank you for the reply. True, it doesn't answer the question, but it gives me some nice tips. Actually I want multiple nodes, but I guess there is a method for that too. My original approach was to play with that, but then I came across this threading issue and started digging into it. This is a purely academic program for my personal learning purposes, so I am still curious how I can manage this external thread. Yes, in this example I can use what you just showed me (for which I am grateful), but some day I may have to deal with such an issue again, so I would like to know if there is a solution to this problem. – Bielik Dec 15 '15 at 10:01
  • @Bielik - The solution to threading is often not to use threading. Often it will actually make things slower or far too complicated. The better alternative is to use abstractions like TPL or Reactive Extensions to give you multi-threading but without the headaches. – Enigmativity Dec 15 '15 at 11:26
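
As one possible reading of the TPL suggestion in the last comment, here is a hedged sketch that wraps the `DocumentCompleted` event in a `Task`. `WebBrowserTaskSketch` and `NavigateAsync` are hypothetical names, not part of the `WebBrowser` API, and the sketch assumes the browser lives on a UI thread that already has a message loop (for example, a control on a form).

```csharp
using System.Threading.Tasks;
using System.Windows.Forms;

static class WebBrowserTaskSketch
{
    // Hypothetical helper: starts a navigation and returns a task that
    // completes with the document text after DocumentCompleted is raised.
    public static Task<string> NavigateAsync(WebBrowser wb, string url)
    {
        var tcs = new TaskCompletionSource<string>();

        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // Unsubscribe so that later navigations do not touch this task.
            wb.DocumentCompleted -= handler;
            tcs.TrySetResult(wb.DocumentText);
        };

        wb.DocumentCompleted += handler;
        wb.Navigate(url);
        return tcs.Task;
    }
}
```

Inside an `async` method on the UI thread, such as a button click handler, this could be used as `string html = await WebBrowserTaskSketch.NavigateAsync(wb, adres);` followed by `MessageBox.Show("after");`. The `await` returns control to the message loop instead of blocking it, so the event can still be raised. It cannot be awaited directly in a constructor like `DownloadContent()`, because constructors cannot be `async`.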