1

I am trying to copy specific data from a weather forecast table on https://www.wunderground.com. To be more exact, I am trying to get the Time and the Cloud Cover in Excel in a tabular format (but for now any format will do) from a link like https://www.wunderground.com/hourly/ro/mizil/45.00,26.44/date/2020-04-15. So far I have tried many ways to get that specific data but I am falling short (I am new to web scraping with VBA). The concepts and commands are quite clear to me and they have worked for other sites, but for this site I am at wits' end. Currently, I am using:

Sub WeatherScrap()

Range("A1").Select

Dim mainlink As String Dim http As New XMLHTTP60, html As New HTMLDocument Dim CloudCover As Object

mainlink = "https://www.wunderground.com/hourly/ro/mizil/45.00,26.44/date/2020-04-15"

    With http
        .Open "GET", mainlink, False
        .send
        html.body.innerHTML = .responseText
    End With

    For Each CloudCover In html.getElementsByClassName("wu-value wu-value-to")
        ActiveCell.Value = CloudCover.innerText
        ActiveCell.Offset(1, 0).Select
    Next CloudCover

End Sub

I am obviously not referencing the right classes, tags or IDs on the html (I have tried with many so far, but none retrieve the desired data). The html element on the site is:

<lib-display-unit _ngcontent-app-root-c213="" _nghost-app-root-c122="" class="ng-star-inserted"><span _ngcontent-app-root-c122="" class="test- wu-unit wu-unit-chance ng-star-inserted"><!----><!----><!----><span _ngcontent-app-root-c122="" class="wu-value wu-value-to">100</span>&nbsp;<span _ngcontent-app-root-c122="" class="wu-label"><span _ngcontent-app-root-c122="" class="ng-star-inserted">%</span>

For now, just understanding how to fetch the Cloud Cover percentage from the table would suffice. Can anyone help? Thanks a lot!

BigBen
  • 46,229
  • 7
  • 24
  • 40
vbeu
  • 13
  • 1
  • 4
  • It looks right, did you check that your `http.responseText` actually contains the HTML you expect, and not some redirect or a CloudFlare CAPTCHA page or something like that? – CherryDT Apr 15 '20 at 12:50
  • On second glance, it looks weird that you are assigning the HTML string to the _body_ contents of the `HTMLDocument`. You'd end up with a document like this `
    ...` if you know what I mean, and I'm not sure this is handled correctly. Can't you use `CreateDocumentFromUrl` directly, or at least `Write` into the document instead of assigning `InnerHTML`? https://stackoverflow.com/questions/9995257/mshtml-createdocumentfromstring-instead-of-createdocumentfromurl
    – CherryDT Apr 15 '20 at 12:53
  • Why not try link using Power Query? – Dean Apr 15 '20 at 13:00
  • 1
    @CherryDT This is not the issue. The website supplied uses a JavaScript API to obtain the weather information it displays. You will have to use this API as well if needed. E.G. https://api.weather.com/v2/pws/observations/current?apiKey=6532d6454b8aa370768e63d6ba5a832e&units=e&stationId=IPLOIETI7&format=json or perhaps https://api.weather.com/v3/wx/forecast/daily/15day?apiKey=6532d6454b8aa370768e63d6ba5a832e&geocode=45%2C26.441&language=en-US&units=e&format=json depending which information is required. In general though you should get your own API Key... – Sancarn Apr 15 '20 at 13:04
  • OK, right, I didn't look at the page in question, I assumed the OP knew what HTML they were dealing with. Thank you, this is the way to go – CherryDT Apr 15 '20 at 13:07
  • Thanks a lot, @Sancarn!. Is there any way you can point me to the right documentation? as I was saying I am fairly new in this field and I would appreciate any more precise indication on how to get this data to my Excel. How do I know the API they are using and how do I get my own API Key? Thanks so much... – vbeu Apr 15 '20 at 16:08
  • @vbeu I see someone has since answered the question, if you are still interested I got the information I needed from google chrome's network tab. https://developers.google.com/web/tools/chrome-devtools/network So when you log into the page, you inspect the HTTP calls made by the website, and thus inspect which calls you need to make. As for getting your own API key, create a wunderground.com account and then go to https://www.wunderground.com/weather/api/ Assume this will have the required documentation you need to get a key :) – Sancarn Apr 16 '20 at 12:05

1 Answers1

0

I didn't read your whole question, but I'm guessing you want something like this (which is a pretty common way of interacting with a web-based table).

Sub Web_Table()
    Dim HTMLDoc As New HTMLDocument
    Dim objTable As Object
    Dim lRow As Long
    Dim lngTable As Long
    Dim lngRow As Long
    Dim lngCol As Long
    Dim ActRw As Long
    Dim objIE As InternetExplorer
    Set objIE = New InternetExplorer
    objIE.Navigate "https://www.wunderground.com/hourly/ro/mizil/45.00,26.44/date/2020-04-15"

    Do Until objIE.ReadyState = 4 And Not objIE.Busy
        DoEvents
    Loop
    Application.Wait (Now + TimeValue("0:00:03")) 'wait for java script to load
    HTMLDoc.body.innerHTML = objIE.Document.body.innerHTML
    With HTMLDoc.body
        Set objTable = .getElementsByTagName("table")
        For lngTable = 0 To objTable.Length - 1
            For lngRow = 0 To objTable(lngTable).Rows.Length - 1
                For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
                    ThisWorkbook.Sheets("Sheet1").Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
                Next lngCol
            Next lngRow
            ActRw = ActRw + objTable(lngTable).Rows.Length + 1
        Next lngTable
    End With
    objIE.Quit
End Sub

Result:

enter image description here

ASH
  • 20,759
  • 19
  • 87
  • 200
  • - Great thanks! I just had to increase the wait time as for me the java script is loading slower. Thanks a lot! – vbeu Apr 16 '20 at 07:58
  • Do Until objIE.ReadyState = 4 And Not objIE.Busy is supposed to handle that automatically. Maybe something got mangled when you tried to run it. Anyway, I'm glad t worked out for you!! – ASH Apr 16 '20 at 13:00