0

I'm using a Timer to determine whether a page loaded using AJAX is ready, and I want to return from the function only when it is (that is, once the page, including its AJAX content, has finished loading). I tried the code below, but I found that Application.DoEvents() only processes pending Windows message loop items, so it runs into an infinite loop: Application.DoEvents() doesn't raise any of the WebBrowser's events (only Windows ones, as mentioned), so ReadyState is never updated.

My question is: is there any way to force the WebBrowser's events to be processed, the way Application.DoEvents() does for Windows messages?

static bool done = false;

int[] foo()
{
    int[] data = null; // initialized so that "return data" compiles
    timer1.Interval = 1000;
    timer1.Tick += new EventHandler(delegate(object o, EventArgs ea)
    {
        if (browser.ReadyState == WebBrowserReadyState.Complete)
        {
            timer1.Stop();
            data = extract_data();
            done = true;
        }
    });
    timer1.Start();

    while (!done) /* return from function only if timer1 isn't running anymore */
    {
        Application.DoEvents();
        Thread.Sleep(1000);
    }

    return data;
}

I'm aware of the Application.DoEvents() "issues", but I can't find any other way to do this. A different approach to solving it is very welcome too.

Jack

2 Answers

1

If you are using .NET 4.5 or newer (or 4.0, if you are willing to use the Microsoft.Bcl.Async library), this can easily be done with a TaskCompletionSource and an await:

async Task<int[]> foo()
{
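    //Create the completion source and the callback delegate.
    //TrySetResult is used because DocumentCompleted can fire more than once
    //(for example, once per frame) and SetResult would throw on the second call.
    var tcs = new TaskCompletionSource<object>();
    WebBrowserDocumentCompletedEventHandler callback = (sender, args) => tcs.TrySetResult(null);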

    //Subscribe to the Document completed event and run the callback.
    browser.DocumentCompleted += callback;

    try
    {
        //We may already be in the complete state, in which case the event will never fire.
        //Therefore, if we are already in the completed state, we can skip the await.
        if (browser.ReadyState != WebBrowserReadyState.Complete)
        {
            //Wait here for the completed event to fire.
            await tcs.Task;
        }
    }
    finally
    {
        //Unsubscribe the callback, which is no longer needed.
        browser.DocumentCompleted -= callback;
    }

    //Process the data from the completed document.
    var data = extract_data();
    return data;
}

This code subscribes to the DocumentCompleted event and then, if the document has not finished loading yet, waits for it to finish. While it is waiting, it returns control to the caller (the same effect as your DoEvents loop, but much better). Once the event fires, it processes the data and returns the result.

However, if possible, an even better solution is to rewrite your code so that it never calls foo at all: just subscribe to the DocumentCompleted event and have the handler push the data to where it needs to go, instead of pulling it.
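A push-based version might look roughly like the sketch below. The names ScraperForm, StartScrape, DataReady and ExtractData are hypothetical and only stand in for your own form, entry point, consumer and extract_data(); the top-level URL check is one common way to ignore per-frame DocumentCompleted events and may need adjusting for pages that redirect.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form: all names here are illustrative assumptions.
public class ScraperForm : Form
{
    private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };

    // Consumers subscribe here and receive the data when it is ready.
    public event Action<int[]> DataReady;

    public ScraperForm()
    {
        Controls.Add(browser);
        browser.DocumentCompleted += OnDocumentCompleted;
    }

    // Starts navigation and returns immediately; nothing waits or polls.
    public void StartScrape(string url)
    {
        browser.Navigate(url);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted can fire once per frame; react only to the
        // top-level document (assumption: no redirect changed the URL).
        if (e.Url != browser.Url)
        {
            return;
        }

        int[] data = ExtractData(browser.Document);

        // Push the result to whoever subscribed.
        Action<int[]> handler = DataReady;
        if (handler != null)
        {
            handler(data);
        }
    }

    // Placeholder for the question's extract_data(); real parsing goes here.
    private static int[] ExtractData(HtmlDocument document)
    {
        return new int[0];
    }
}
```

With this shape, the caller subscribes to DataReady once and calls StartScrape, so no method ever has to block until the page is ready.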

Scott Chamberlain
  • First, thanks! This is something I've been working on for a while, and it really worked the way I was looking for. But somehow, unlike when I call extract_data() from the Timer's Tick event, the expected div (which is created after the AJAX call) is null, as if the load wasn't complete. As I mentioned, it works from the Timer's Tick event (as in the code in my question). Any idea why it behaves that way? Does it have to do with threads? – Jack Nov 17 '14 at 01:24
  • What I mean by the expected div being `null`: consider `var p = browser.Document.GetElementById("foo");` (declared in `extract_data()`). Inside the Timer's Tick event, `p` **isn't null** (so I can extract the values), but with your solution `p` **is null**. I have *no idea* why. – Jack Nov 17 '14 at 01:33
  • 1
    Forget all my previous comments. I just combined that approach with http://stackoverflow.com/questions/20930414/how-to-dynamically-generate-html-code-using-nets-webbrowser-or-mshtml-htmldocu/20934538#20934538 and it is working fine – Jack Nov 17 '14 at 02:15
1

In Visual Studio, double-click the WebBrowser control. That will create an event handler for the DocumentCompleted event. You can use whatever other mechanism you like to create the DocumentCompleted event handler, but that event is the important part. See my article Introduction to Web Site Scraping for a sample.

Please do not use Application.DoEvents(), ReadyState or Thread.Sleep for this.

The problem can be complicated to solve if the web page uses a script to generate portions of the page. If that happens, I would do everything I could to avoid using Thread.Sleep, but you might have to.
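If polling for script-generated content turns out to be unavoidable, one option is to poll without blocking the UI thread. The following is only a minimal sketch, assuming .NET 4.5 or newer; the Polling class and WaitUntilAsync method are hypothetical helpers, not part of any framework API, and the element id "foo" in the usage comment is just the one mentioned in the comments on this page.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not a framework API.
public static class Polling
{
    // Re-checks `condition` every `interval` until it returns true or
    // `timeout` elapses. Returns true if the condition was met in time.
    public static async Task<bool> WaitUntilAsync(
        Func<bool> condition, TimeSpan interval, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (!condition())
        {
            if (DateTime.UtcNow >= deadline)
            {
                return false;
            }

            // Unlike Thread.Sleep, awaiting Task.Delay does not block the
            // calling thread, so on a UI thread the message loop keeps
            // running and the page's scripts can continue to execute.
            await Task.Delay(interval);
        }
        return true;
    }
}

// Possible usage from an async method on the UI thread, after the
// DocumentCompleted event has fired (assumes a WebBrowser named `browser`):
//
//   bool ready = await Polling.WaitUntilAsync(
//       () => browser.Document != null
//             && browser.Document.GetElementById("foo") != null,
//       TimeSpan.FromMilliseconds(500),
//       TimeSpan.FromSeconds(30));
```

The timeout is there so that a page which never produces the expected element does not leave the caller waiting forever.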

Sam Hobbs
  • That page isn't available. Using `DocumentCompleted` isn't going to work, because this event fires when the HTML load is done, but I need to know when the AJAX is done too. That's why I use the `Timer`. I'm aware of the `Application.DoEvents()` issues, but in this scenario I didn't find anything better. – Jack Nov 17 '14 at 01:29
  • You say "DocumentCompleted isn't going to work" but I hope you understand that if the other answer works then DocumentCompleted works. I hope you understand that. – Sam Hobbs Nov 17 '14 at 02:48
  • It doesn't really work the way I want, because I want to process the page when everything, including AJAX, is ready, and not only the HTML as with `DocumentCompleted`. I combined the answer with this approach and it's working fine: http://stackoverflow.com/questions/20930414/how-to-dynamically-generate-html-code-using-nets-webbrowser-or-mshtml-htmldocu/20934538#20934538 – Jack Nov 17 '14 at 15:36