
Can somebody tell me how to use the Optimus (headless browser) NuGet package with C# to get the response from a URL? I also want the JavaScript on the page to be executed automatically, as PhantomJS does.

Veverke
Puneet Pant

1 Answer


Quite a simple bit of kit:

  1. Create an Engine component first (common for dynamic and static pages):

    Engine engine = new Engine();

  2. Open the URL of the HTML document you want to retrieve:

    a) Not waiting for any elements added in with javascript:

    engine.OpenUrl("http://google.com").Wait();

    b) Waiting for any elements added in with javascript:

    engine.OpenUrl("http://google.com");

    and then either:

    • engine.WaitDesappearingOfId("some-id")
    • engine.WaitId("some-id")
    • engine.WaitDocumentLoad()
    • engine.WaitSelector("#some-id")
    • engine.WaitSelector(".some-class")

To recap, there are two ways to open a URL: load the document only (before any JavaScript has run), or load it and then wait for the elements that JavaScript adds.

More complete examples:

public static string dynamicLoadingPage()
{
    var engine = new Engine();
    engine.OpenUrl("https://html5test.com");
    // Wait for the element that the page's JavaScript adds; null if nothing matched
    var tagWithValue = engine.WaitSelector("#score strong").FirstOrDefault();
    var score = tagWithValue?.InnerHTML;
    System.Console.WriteLine("Score: " + score);
    return score;
}

Otherwise, for a page that does not depend on JavaScript:

// Needs using System.Linq for OfType<T>() and First().
static string staticLoadingPage()
{
   var engine = new Engine();
   // Wait() blocks until the document has loaded; it does not wait for script-added elements
   engine.OpenUrl("http://google.com").Wait();
   Console.WriteLine("The first document child node is: " + engine.Document.FirstChild);
   Console.WriteLine("The first document body child node is: " + engine.Document.Body.FirstChild);
   Console.WriteLine("The first element tag name is: " + engine.Document.ChildNodes.OfType<HtmlElement>().First().TagName);
   Console.WriteLine("Whole document innerHTML length is: " + engine.Document.DocumentElement.InnerHTML.Length);
   return engine.Document.DocumentElement.InnerHTML;
}
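
To run either method you need a surrounding program. Here is a minimal sketch of a console app for the dynamic case. It uses only the calls shown above. The `using` directives are assumptions about the package layout (`Knyaz.Optimus` for `Engine`, `Knyaz.Optimus.TestingTools` for the wait helpers) and may differ between Optimus versions, so check them against the version you installed.

```csharp
// Sketch only. The two Optimus namespaces below are assumptions;
// adjust them to match the installed package version.
using System.Linq;
using Knyaz.Optimus;
using Knyaz.Optimus.TestingTools;

class Program
{
    static void Main()
    {
        var engine = new Engine();

        // Start loading the page; the wait below takes care of script-added content.
        engine.OpenUrl("https://html5test.com");

        // FirstOrDefault yields null when no element matched the selector.
        var score = engine.WaitSelector("#score strong").FirstOrDefault();

        // System.Console is fully qualified, as in the example above.
        System.Console.WriteLine(score == null
            ? "Score element not found"
            : "Score: " + score.InnerHTML);
    }
}
```

The null check matters because `FirstOrDefault()` returns null rather than throwing when nothing matched, so reading `InnerHTML` without the check would fail with a `NullReferenceException`.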
SagarScript
  • How do you get the HTTP status code of the request? Would be ideal to verify the page loaded correctly before attempting to access dom elements – Dan Hastings Jul 31 '18 at 08:50
  • I've managed to get the page loaded but can't find any documentation on how to interact with the page, e.g. click an anchor that I've got using `QuerySelectorAll.Single()` - it just gives me something of type `IElement` but there's nothing clickable there, as far as I can see. – Konrad Viltersten Sep 08 '19 at 09:50
  • Oh, I forgot to add - I need to actually **click** the button, not only read the *href* and navigate to it. The JS executed on click adds something to the session/headers, so a simple navigation doesn't work. – Konrad Viltersten Sep 08 '19 at 09:51