
I'm new to automating webpage access, so forgive what is probably a remedial question. I'm using C# with Windows.Forms in a console app. I need to programmatically set the value of an input on a webpage that I cannot modify and that runs JavaScript. I have successfully opened the page (triggering WebBrowser.DocumentCompleted). I set the browser emulation mode to IE11 in the registry, so scripts run without errors. When DocumentCompleted fires, I cannot access the document elements unless I first display the document content via MessageBox.Show(), which is clearly not acceptable for my unattended app.

What do I need to do so that my document elements are accessible in an unattended session (so I can remove MessageBox.Show() from the code below)? Details below. Thank you.

The input HTML is:

<input class="input-class" on-keyup="handleKeyPress($key)" type="password">

My DocumentCompleted event handler is:

    private static void LoginPageCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        WebBrowser wb = ((WebBrowser)sender);

        var document = wb.Document;

        // I'm trying to eliminate these 3 lines
        // (documentElement is exposed by IHTMLDocument3, not IHTMLDocument)
        var documentAsIHtmlDocument = (mshtml.IHTMLDocument3)document.DomDocument;
        var content = documentAsIHtmlDocument.documentElement.innerHTML;
        MessageBox.Show(content);

        String classname = null;
        foreach (HtmlElement input in document.GetElementsByTagName("input"))
        {
            classname = input.GetAttribute("className");

            if (classname == "input-class")
            {
                input.SetAttribute("value", password);
                break;
            }
        }
    }
Tony
  • It is possible that when you try to access the document, its completion is only partial. The `DocumentCompleted()` event can be triggered by IFrames, for example. Try adding a check: `if (webBrowser1.ReadyState == WebBrowserReadyState.Complete) { (...) };` before attempting to parse it. – Jimi Mar 17 '18 at 03:54
  • @Jimi - Thank you very much for your suggestion. Your suggested check confirms that the page `ReadyState` is `Complete`; however, the document elements are still not accessible until after the MessageBox.Show() statement. – Tony Mar 17 '18 at 14:04
  • Try it with the code in the example. I just tested it with a WebForm login page, and the password is correctly inserted into the input element. – Jimi Mar 17 '18 at 16:24
  • @Jimi - I'm trying to do this in a ConsoleApp. Is it possible that WebBrowser controls don't work properly in a ConsoleApp? If so, that would explain a bit. – Tony Mar 17 '18 at 21:13
  • I'm going to test this in a Console project. I don't recall any difference in behaviour, but a fresh double check doesn't hurt. – Jimi Mar 17 '18 at 21:29

2 Answers


The problem for me was that the page I'm accessing is built by JavaScript. Even though the DocumentCompleted event was firing, the page was not yet fully rendered. I have successfully processed the first page by checking whether the document elements are available and, if they are not, calling Application.DoEvents() in a loop until they are, so I now know that I'm on the right track.

This SO Question helped me: c# WebBrowser- How can I wait for javascript to finish running that runs when the document has finished loading?

Note that the DocumentCompleted event does not reliably indicate that the document elements are available on a page generated by JavaScript. I needed to keep checking for the elements and calling Application.DoEvents() until they became available (after the JavaScript generated them).
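For anyone who wants the loop spelled out, here is a minimal sketch of the "keep checking and pump messages" idea, with a timeout added so an unattended app cannot spin forever. The names `DomPolling`, `WaitUntil` and `FindPasswordInput`, the 10 ms sleep and the 30-second timeout are my own assumptions for illustration; they are not taken from the linked question or from my final code.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class DomPolling
{
    // Calls `pump` repeatedly until `ready` returns true or `timeout` elapses.
    // Returns true if the condition was met, false if the wait timed out.
    public static bool WaitUntil(Func<bool> ready, Action pump, TimeSpan timeout)
    {
        if (ready == null) throw new ArgumentNullException(nameof(ready));
        if (pump == null) throw new ArgumentNullException(nameof(pump));

        Stopwatch clock = Stopwatch.StartNew();
        while (!ready())
        {
            if (clock.Elapsed >= timeout)
                return false;

            pump();            // in a WinForms host this would be Application.DoEvents
            Thread.Sleep(10);  // short pause so the loop does not peg a CPU core
        }
        return true;
    }
}

// Hypothetical use inside a DocumentCompleted handler (needs System.Windows.Forms):
//
//   bool found = DomPolling.WaitUntil(
//       () => FindPasswordInput(browser.Document) != null,  // FindPasswordInput: your own lookup
//       Application.DoEvents,
//       TimeSpan.FromSeconds(30));
//   if (!found) { /* log and give up instead of hanging */ }
```

Passing the pump in as a delegate keeps the helper free of any Windows.Forms dependency, so the waiting logic can be exercised without a browser.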

Tony
  • My completed solution using the lessons learned in this thread is posted here for further discussion: https://stackoverflow.com/questions/49350908/automatically-rebooting-a-verizon-fios-quantum-router-via-c-webbrowser-consolea – Tony Mar 18 '18 at 17:40

If the problem comes from creating the STA thread needed to instantiate the underlying ActiveX component of the WebBrowser control, this is a modified version of Hans Passant's code from the SO question you linked.

Tested in a Console project.

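using System;
using System.Threading;
using System.Windows.Forms;

class Program
{
    static void Main(string[] args)
    {
        // NavigateURI takes only the target address; the password comes from the SomePassword field.
        NavigateURI(new Uri("[SomeUri]", UriKind.Absolute));
        Console.ReadLine();
    }

    private static string SomePassword = "SomePassword";

    private static void NavigateURI(Uri url)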
    {
        Thread thread = new Thread(() => {
            WebBrowser browser = new WebBrowser();
            browser.DocumentCompleted += browser_DocumentCompleted;
            browser.Navigate(url);
            Application.Run();
        });
        thread.SetApartmentState(ApartmentState.STA);
        thread.Start();
    }

    protected static void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        WebBrowser browser = ((WebBrowser)sender);
        if (browser.Url == e.Url)
        {
            while (browser.ReadyState != WebBrowserReadyState.Complete)
            { Application.DoEvents(); }

            HtmlDocument Doc = browser.Document;
            if (Doc != null)
            {
                foreach (HtmlElement input in Doc.GetElementsByTagName("input"))
                {
                    if (input.GetAttribute("type") == "password")
                    {
                        input.InnerText = SomePassword;
                        //Or
                        //input.SetAttribute("value", SomePassword);
                        break;
                    }
                }
            }
            Application.ExitThread();
        }
    }
}
Jimi
  • Thank you very much for your help. This has gotten more interesting. My code works when I call `MessageBox.Show()` with any string (doesn't need to be content). My responses to your suggested steps are as follows: – Tony Mar 17 '18 at 20:57
  • My code is virtually identical to yours with the exception of the MessageBox.Show() and a change in the way I set the password value (I found that I needed to set the InnerText rather than the "value" Attribute). Let me confirm what I have working and post my code. Stay tuned... – Tony Mar 17 '18 at 21:09
  • @Tony I've posted this code so that, if you use it, we're on the same page on what's going on and it's possible to proceed from there. Setting `InnerText` or `Value` is the same thing. – Jimi Mar 17 '18 at 21:16
  • @Tony I have updated the code. There was a silly leftover. See `if (Doc != null)`. It doesn't change the results on my side, but it was wrong anyway. – Jimi Mar 17 '18 at 21:25
  • I appreciate your help with this. Now that I know enough about this to be dangerous, I also know what to look for. It appears that I have some work to do to get this to work in a consoleApp: https://stackoverflow.com/questions/4269800/webbrowser-control-in-a-new-thread/4271581#4271581 – Tony Mar 17 '18 at 21:40
  • Well. I thought that your problem was parsing/accessing the document, not raising the event. That's why I didn't set up a console project for this. A Console thread that hosts an ActiveX component must be set to `[STAThread]` one way or another, that's for sure. Is your problem related to the threading model or to document handling? If your procedure can raise the event, the threading model issue should have been already dealt with. Or not? – Jimi Mar 17 '18 at 21:51
  • @Tony This is now a modified version of the code you linked. If you give it a try, let me know how it goes. If you think you won't need it, I'll delete this answer, because it's not clear what the problem really is. – Jimi Mar 17 '18 at 22:42
  • Thank you for your patience and persistence! Your code is now virtually identical to my first attempt. When I referenced the other SO question, I was thinking that Noseratio's `async/await` solution would work, but that did not work either. Whether I use [STAThread] (my original attempt) or async/await, I still don't see the document elements until after I show a MessageBox (any MessageBox will do). I now suspect that this has something to do with the fact that the page I'm accessing is created by JavaScript. How do I post my complete code here for review? Do I post it as an answer? – Tony Mar 18 '18 at 14:01
  • The code you see now, like the previous version, is working (when I wrote "Tested in a Console project" I meant it). You could share the link to the problematic Html page, so it can be analyzed. If you think it could be useful, edit your question and post the relevant code that better describes the problem, stating what's expected and what happens instead, in a defined context. Should you come up with a working solution, post the answer to your own question. – Jimi Mar 18 '18 at 14:21
  • Thank you very much for your help. Talking this out with you helped me to find what was ultimately the solution (and the answer I posted). – Tony Mar 18 '18 at 14:46
  • @Tony Well, I'm glad you made it :) – Jimi Mar 18 '18 at 14:48