20

I'm writing an application that uses the WebBrowser control to view web content that changes via AJAX, which adds new content and elements. I can't get at the new elements in any way I've tried. BrowserCtl.DocumentText doesn't contain the up-to-date page, and of course it's not in "view source" either.

Is there some way to get this new data using this control? :( Please help. Thanks!

For example:

Browser.Navigate("www.somewebpagewithAJAX.com");
//Code that waits for browser to finish...
...
//WebBrowser control has loaded content and AJAX has loaded new content
// (is visible at runtime on form) but can't see them in Browser.Document.All
// or Browser.DocumentText :(
Andy
aikeru

7 Answers

20

I solved this problem in my case.

The key is to attach a handler for the onpropertychange event of the div element that is populated by the AJAX call.

HtmlElement target = webBrowser.Document.GetElementById("div_populated_by_ajax");

if (target != null)
{
      target.AttachEventHandler("onpropertychange", handler);
}

and finally,

private void handler(Object sender, EventArgs e)
{
      HtmlElement div = webBrowser.Document.GetElementById("div_populated_by_ajax");
      if (div == null) return;
      String contentLoaded = div.InnerHtml; // get the content loaded via ajax
}
BlueRaja - Danny Pflughoeft
  • Hi, I have been trying to do this with YouTube comments, but I can't figure out how this will work. – Alok Nov 24 '13 at 05:41
5
using System;
using System.Windows.Forms;

namespace WebBrowserDemo
{
    class Program
    {
        public const string TestUrl = "http://www.w3schools.com/Ajax/tryit_view.asp?filename=tryajax_first";

        [STAThread]
        static void Main(string[] args)
        {
            WebBrowser wb = new WebBrowser();
            wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
            wb.Navigate(TestUrl);

            while (wb.ReadyState != WebBrowserReadyState.Complete)
            {
                Application.DoEvents();
            }

            Console.WriteLine("\nPress any key to continue...");
            Console.ReadKey(true);
        }

        static void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            WebBrowser wb = (WebBrowser)sender;

            HtmlElement document = wb.Document.GetElementsByTagName("html")[0];
            HtmlElement button = wb.Document.GetElementsByTagName("button")[0];

            Console.WriteLine(document.OuterHtml + "\n");

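            // Trigger the page's AJAX request by clicking its button.
            button.InvokeMember("Click");

            // Note: if the request is asynchronous, this may print the DOM before the response arrives.
            Console.WriteLine(document.OuterHtml);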
        }
    }
}
xian
3

You will need to use the DOM for this. Cast WebBrowser.Document.DomDocument to one of the IHTMLDocument interfaces (for example, IHTMLDocument2). You will have to import some COM interfaces or reference the Microsoft.mshtml assembly.

Have a look at http://msdn.microsoft.com/en-us/library/aa752641(VS.85).aspx for more details.
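
A minimal sketch of that cast, assuming the project references the Microsoft.mshtml interop assembly. The class and method names (LiveDomReader, GetLiveBodyHtml) and the webBrowser parameter are hypothetical and chosen only for illustration; this is not necessarily how the answer's author did it.

```csharp
using System.Windows.Forms;
using mshtml; // assumption: a reference to the Microsoft.mshtml interop assembly has been added

static class LiveDomReader
{
    // Hypothetical helper: returns the current markup of <body>,
    // or null if no document is loaded or the cast fails.
    public static string GetLiveBodyHtml(WebBrowser webBrowser)
    {
        if (webBrowser.Document == null) return null;

        // DomDocument exposes the underlying unmanaged document object.
        IHTMLDocument2 doc = webBrowser.Document.DomDocument as IHTMLDocument2;
        if (doc == null || doc.body == null) return null;

        // outerHTML is read from the DOM at call time, not from the downloaded source.
        return doc.body.outerHTML;
    }
}
```

Call it after the AJAX content has had time to arrive, for example from a timer tick or an event handler on the populated element, rather than immediately after navigation.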

Eugene Petrenko
  • Ouch! I'd like to avoid this if possible, I think. I'm fine working with HtmlElement.DomElement COM types but would the IHTMLDocument have the now-changed elements post-javascript? – aikeru Mar 31 '09 at 15:06
  • I am trying to do this too, did you ever find a solution? – TheGateKeeper Mar 13 '12 at 12:28
2

I assume that since you're reading content which is generated from Ajax requests that you require the user to progress the application to a point where the relevant data is loaded, at which point you run code to read the data.

If that's not the case, you'll need to automate this process by generating the click events that build out the DOM nodes you're interested in reading. I do this fairly often with the WebBrowser control and tend to write that layer of functionality in JavaScript and call it with .InvokeScript(). Another route is to find the nodes that fire the Ajax functionality from C# and manually trigger their click events:

HtmlElement content = webMain.Document.GetElementById("content");
content.RaiseEvent("onclick");

An important aspect to note in the snippet above is that you can interact with DOM nodes natively in C# if you accept and work around the limitations of the HtmlElement object type.
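
For the .InvokeScript() route mentioned above, a rough sketch might look like the following. It assumes the page's script engine exposes the global eval function, which is commonly the case but not guaranteed; the class and method names (ScriptDomReader, GetLiveHtml) are hypothetical.

```csharp
using System.Windows.Forms;

static class ScriptDomReader
{
    // Hypothetical helper: asks the page's own script engine for the current DOM markup.
    public static string GetLiveHtml(WebBrowser webMain)
    {
        if (webMain.Document == null) return null;

        // Assumption: "eval" is callable in the page; InvokeScript returns null if the call fails.
        object result = webMain.Document.InvokeScript(
            "eval", new object[] { "document.documentElement.outerHTML" });

        return result as string;
    }
}
```

Because the string is produced by script running inside the page, it reflects elements added after the initial load, provided the call is made once the Ajax response has been processed.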

John Lewin
  • Thanks for the informative post, unfortunately my problem lies in that once the JavaScript has already run, there are new elements on the page that I need to interact with or check a value from ... the Document.x doesn't seem to have these new elements post-javascript :( – aikeru Mar 31 '09 at 15:08
  • The .Document reference provides live access to the DOM, and elements created after the initial load are just as accessible as the original ones. Is there any chance that the generated elements exist in a frame? Can you share the page which presents the problem? – John Lewin Mar 31 '09 at 23:22
0

How about running JavaScript to capture the element and displaying it in a new window?

I haven't tested it out but it may work.

((WebBrowser)w).Navigate("javascript:document.getElementById('div').innerHTML;", true);

The true argument opens the result in a new window. (Or in a frame, or maybe you can find a better way.)

To capture the NewWindow event you'll need to reference the SHDocVw.dll which is in your Windows/System32 folder. Then you can cast your WebBrowser Control like this:

SHDocVw.WebBrowser_V1 browser = (SHDocVw.WebBrowser_V1)((WebBrowser)w).ActiveXInstance;

You can have it close right away after storing the response. Well good luck and let me know how it goes.

Proximo
0

Is the information not in the WebBrowser.DocumentText? http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.documenttext.aspx

Fraser
-1

Do you control the web page?

If so, here's a blog post that explains how: http://www.palladiumconsulting.com/blog/sebastian/2007/04/ultimate-intranet-toy.html

If not, there's probably a solution but I can't help you, sorry.

Chloraphil
  • http://web.archive.org/web/20071009143308/http://www.palladiumconsulting.com/blog/sebastian/2007/04/ultimate-intranet-toy.html – Chloraphil Jun 25 '14 at 13:30