I have been trying off and on for the past couple of days to get the text from an HTML element into my ListBox. The URL is https://www.roblox.com/users/88510187/favorites#!/places and I am trying to scrape the user's favorite places; in this instance the first would be "B is for build". Here is what I have most recently tried, and it returns nothing.

    Private Sub test3()
        Dim divs = WebBrowser1.Document.Body.GetElementsByTagName("div")
        For Each d As HtmlElement In divs
            If d.GetAttribute("className") = "text-overflow item-card-name ng-binding" Then
                ListBox1.Items.Add(d.OuterText)
            End If
        Next
    End Sub

I've changed the sub around many times, playing with inner and outer text and HTML, but I cannot seem to get just the plain text. What on earth am I missing?

King96
  • Not sure, but I think you should use `d.InnerHtml` instead of `d.OuterText`. By the way, you can read more about `OuterText` here: https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/outerText – nelek Oct 18 '18 at 15:14
  • Tried that one too. It acts as if there is no class with that name, but there is. I listed them all out with `Dim PageElement As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("div")` and `For Each CurElement As HtmlElement In PageElement : ListBox1.Items.Add(CurElement.GetAttribute("className").ToString()) : Next` – King96 Oct 18 '18 at 15:17
  • You'd get it with both `InnerHtml` and `InnerText`, once the document is completed. You need to put that code in the `WebBrowser.DocumentCompleted` event. – Jimi Oct 18 '18 at 15:18
  • That's when the sub is fired, on document completed. Maybe it's not totally done loading when it fires? – King96 Oct 18 '18 at 15:19
  • Yep, you can see a similar solution here: https://stackoverflow.com/a/9205751/3279496 – nelek Oct 18 '18 at 15:21
  • Yeah, it wasn't loaded all the way. I added a button and pressed it after the page loaded, and it worked... 2 days trying and it was just that simple... – King96 Oct 18 '18 at 15:23
  • The `DocumentCompleted` event can be raised multiple times. You have to inspect the `WebBrowser1.ReadyState` property. You'll see that, possibly more than once, it will be `WebBrowserReadyState.Interactive` instead of `WebBrowserReadyState.Complete`. And even then, sometimes you have to re-parse. – Jimi Oct 18 '18 at 15:36
  • Is there a way to know when the WebBrowser control has finished loading or stopped changing the HTML? I tried PaigeWaiter() and it still doesn't load in time. – King96 Oct 18 '18 at 16:13
  • In the `DocumentCompleted` event, check the URL of the event that fired it: if `e.Url = WebBrowser1.Url`, your HTML should be loaded and ready to scrape. – CruleD Oct 18 '18 at 20:41
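
The suggestions in the comments above (check `ReadyState`, compare `e.Url` with `WebBrowser1.Url`, and re-parse if needed) could be combined roughly as follows. This is only a sketch under some assumptions: it is meant as the code-behind of a designer-generated `Form1` that already declares `WebBrowser1` and `ListBox1` as in the question; the favorites are assumed to be filled in by script after the document loads (the `ng-binding` class hints at that, but it is not confirmed here); and `pollTimer`, `attempts`, `MaxAttempts`, and `GetFavoriteNames` are hypothetical names introduced for illustration. It also matches on the single class token `item-card-name` rather than the full `className` string, which is a design choice of this sketch, not something from the thread.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms

Public Class Form1
    ' Hypothetical helpers for this sketch; WebBrowser1 and ListBox1 are
    ' assumed to come from the designer file, as in the question.
    Private WithEvents pollTimer As New System.Windows.Forms.Timer()
    Private attempts As Integer
    Private Const MaxAttempts As Integer = 20

    Private Sub WebBrowser1_DocumentCompleted(sender As Object,
            e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' DocumentCompleted can fire several times (frames, Interactive state),
        ' so only react to the final event for the main document.
        If WebBrowser1.ReadyState <> WebBrowserReadyState.Complete Then Return
        If Not e.Url.Equals(WebBrowser1.Url) Then Return

        ' Script may still be filling in the list, so poll instead of parsing once.
        attempts = 0
        pollTimer.Interval = 500 ' milliseconds between parse attempts
        pollTimer.Start()
    End Sub

    Private Sub pollTimer_Tick(sender As Object, e As EventArgs) Handles pollTimer.Tick
        attempts += 1
        Dim names As List(Of String) = GetFavoriteNames()

        ' Stop once something was found, or give up after MaxAttempts tries.
        If names.Count > 0 OrElse attempts >= MaxAttempts Then
            pollTimer.Stop()
            ListBox1.Items.Clear()
            For Each name As String In names
                ListBox1.Items.Add(name)
            Next
        End If
    End Sub

    Private Function GetFavoriteNames() As List(Of String)
        Dim result As New List(Of String)
        If WebBrowser1.Document Is Nothing Then Return result

        For Each d As HtmlElement In WebBrowser1.Document.GetElementsByTagName("div")
            ' Match one class token so extra classes or a different order do not matter.
            Dim classes As String() = d.GetAttribute("className").Split(
                {" "c}, StringSplitOptions.RemoveEmptyEntries)
            If Array.IndexOf(classes, "item-card-name") >= 0 AndAlso
               Not String.IsNullOrWhiteSpace(d.InnerText) Then
                result.Add(d.InnerText.Trim())
            End If
        Next
        Return result
    End Function
End Class
```

Polling with a WinForms `Timer` keeps the parsing on the UI thread, which is where the `WebBrowser` document has to be accessed. The 500 ms interval and the limit of 20 attempts are arbitrary values that would need tuning for the actual page.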

0 Answers