
I am using Optimus as a headless browser in vb.net, specifically to locate HTML elements on a page.

The Optimus engine starts and retrieves a page fine - I can see the full HTML (a Google search result, as it happens). However:

Dim LClasses As ITokenList = _engine.Document.ClassList returns a zero-length list (even though I can see classes in the page source), and Dim LNodes As List(Of Node) = _engine.Document.ChildNodes returns only 2 nodes, html and head, even though there are of course dozens of other nodes.

Is vb.net supported? Am I doing something stupid?

(I have posted this same question on the Optimus Github page - apologies if duplicating it is bad form)

Taslim Oseni
nick_b
  • On further investigation it looks as if I have to build some kind of recursive function to do this - the engine just returns child nodes and classes for the current element. – nick_b Sep 22 '19 at 20:51
  • The developer has said: 1. Document.ClassList should be empty; in real browsers this property is undefined. 2. ChildNodes returns children, not descendants. Most often there are only two such nodes: DocType and html. 3. I cannot guarantee that vb.net will work, but I see no reason why it should not. You can also use one of the approaches suggested here: https://stackoverflow.com/questions/26325278/how-can-i-get-all-descendant-elements-for-parent-container – nick_b Sep 23 '19 at 11:16
  • Ok, here is how I got it to work:
    ...
    Imports Knyaz.Optimus
    Imports Knyaz.Optimus.Dom.Elements
    Imports System.Collections.ObjectModel
    Imports Knyaz.Optimus.Dom.Interfaces
    ... initiate the Optimus engine (here called "_engine") and get a webpage ...
    Dim coll As ReadOnlyCollection(Of IElement)
    coll = _engine.Document.Body.QuerySelectorAll("*") 'this returns a collection of all the elements in the webpage - woohoo!
    – nick_b Sep 28 '19 at 09:54
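
To illustrate the children-versus-descendants point from the comments, here is a minimal recursive sketch in VB.NET. It assumes nothing about the Optimus API: SimpleNode, CollectDescendants and GetDescendants are hypothetical names made up for this illustration, standing in for any tree whose nodes expose a ChildNodes list. With Optimus itself, the QuerySelectorAll("*") call reported above is what the asker found to work.

```vbnet
Imports System
Imports System.Collections.Generic

Module DescendantsSketch

    ' Hypothetical stand-in for a DOM node; this is NOT an Optimus type.
    Public Class SimpleNode
        Public ReadOnly Name As String
        Public ReadOnly ChildNodes As New List(Of SimpleNode)

        Public Sub New(nodeName As String, ParamArray children As SimpleNode())
            Name = nodeName
            ChildNodes.AddRange(children)
        End Sub
    End Class

    ' Depth-first walk: adds every descendant of node (not node itself) to result.
    Sub CollectDescendants(node As SimpleNode, result As List(Of SimpleNode))
        For Each child As SimpleNode In node.ChildNodes
            result.Add(child)
            CollectDescendants(child, result)
        Next
    End Sub

    ' Convenience wrapper that returns all descendants as a new list.
    Function GetDescendants(root As SimpleNode) As List(Of SimpleNode)
        Dim result As New List(Of SimpleNode)
        CollectDescendants(root, result)
        Return result
    End Function

    Sub Main()
        ' A tiny hand-built tree: #document > html > (head > title, body > div > p)
        Dim doc As New SimpleNode("#document",
            New SimpleNode("html",
                New SimpleNode("head", New SimpleNode("title")),
                New SimpleNode("body", New SimpleNode("div", New SimpleNode("p")))))

        Console.WriteLine(doc.ChildNodes.Count)       ' direct children only: 1
        Console.WriteLine(GetDescendants(doc).Count)  ' all descendants: 6
    End Sub
End Module
```

The direct-children count stays small however deep the tree is, which matches what ChildNodes returned in the question. The recursive walk (or a "*" selector query) is what reaches every element.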
