I'm building my first VBA code to scrape data from a website. I can open the site and click a button to get the correct data on screen, but I'm having difficulty referencing the correct table to loop through. I want to access an embedded table, 'Activity'. To do this, I've taken the answer from here on how to cycle through a table and extract information, and embedded it in my code. Below are the three areas that throw errors.
Are these linked (particularly Queries B and C), and does anyone have any ideas?
Thanks so much!
------Solution Code (from QHarr answer below)-------------
Note: requires references (VBE > Tools > References, then add): Microsoft Internet Controls and Microsoft HTML Object Library.
Public Sub GetTable()
    Dim IE As InternetExplorer, ele As Object, clipboard As Object, hTable As htmlTable, t As Date, ws As Worksheet
    Set ws = ThisWorkbook.Worksheets("Sheet1")
    Const MAX_WAIT_SEC As Long = 20
    Set IE = New InternetExplorer
    With IE
        .Visible = True
        .navigate "https://na3.docusign.net/Member/EmailStart.aspx?a=59595fcb-34be-4375-b880-a0be581d0f37&r=f6d28b49-e66d-4fa4-a7e9-69c2c741fde5"
        While .Busy Or .readyState < 4: DoEvents: Wend
        t = Timer
        Do
            DoEvents
            On Error Resume Next
            Set ele = .Document.querySelector("[data-qa='show-history']")
            'On Error GoTo 0 'I removed this line as it was throwing an error as soon as the 'Show-history' element loaded.
            If Timer - t > MAX_WAIT_SEC Then Exit Do
        Loop While ele Is Nothing
        If ele Is Nothing Then Exit Sub
        ele.Click
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set hTable = .Document.querySelector("[data-qa='history-dialog-audit-logs']")
        ''**********************************************************************
        '' Loop table and write out method. This method uses the sub WriteTable
        Application.ScreenUpdating = False
        WriteTable hTable, 1, ws
        Application.ScreenUpdating = True
        ''**********************************************************************
        .Quit
    End With
End Sub
Public Sub WriteTable(ByVal hTable As htmlTable, Optional ByVal startRow As Long = 1, Optional ByVal ws As Worksheet)
    If ws Is Nothing Then Set ws = ActiveSheet
    Dim tRow As Object, tCell As Object, tr As Object, td As Object, r As Long, c As Long
    r = startRow
    With ws
        Set tRow = hTable.getElementsByTagName("tr")
        For Each tr In tRow
            Set tCell = tr.getElementsByTagName("td")
            c = 1
            For Each td In tCell
                .Cells(r, c).Value = td.innerText
                c = c + 1
            Next td
            r = r + 1
        Next tr
    End With
End Sub
------Original Query Code-------
Query A: I get an 'Object required' error while the page is loading, which goes away if I continue with the script, so I believe it is an issue with handling the loading time. It occurs after the 'Loop' code finishes:
With objIE
    .Visible = True
    .navigate WebSite
    Do While .Busy Or .readyState <> 4
        DoEvents
    Loop
    .document.querySelector("[data-qa='show-history']").Click
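One possible cause for Query A is that the 'show-history' element is not yet in the document when the Busy/readyState loop ends, for example if the page builds it with script after loading. A minimal sketch of a timed wait follows, along the lines of the loop in the solution code above. The function name WaitForElement and its parameters are hypothetical, not part of any library, and doc is assumed to be the late-bound objIE.document object.

```vba
' Hypothetical helper (not a library function): poll the document until a
' CSS selector matches or maxWaitSec seconds have elapsed.
' Note: Timer resets at midnight, so this sketch is not safe across midnight.
Public Function WaitForElement(ByVal doc As Object, ByVal selector As String, _
                               Optional ByVal maxWaitSec As Double = 20) As Object
    Dim startTime As Double
    Dim found As Object

    startTime = Timer
    Do
        DoEvents
        On Error Resume Next      ' the lookup may raise an error while no match exists yet
        Set found = doc.querySelector(selector)
        On Error GoTo 0           ' restore normal error handling straight away
        If Not found Is Nothing Then Exit Do
    Loop While Timer - startTime < maxWaitSec

    Set WaitForElement = found    ' Nothing if the timeout was reached
End Function
```

It could be called as Set ele = WaitForElement(objIE.document, "[data-qa='show-history']"), followed by If Not ele Is Nothing Then ele.Click, so the click only happens once the element has actually been found.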
Query B: I get another object required error on this line, which I can also continue past:
For Each ele In objIE.document.getElementById("activity").getElementsByTagName("tr")
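An 'Object required' error on this line would be consistent with getElementById("activity") finding nothing, either because the table has not loaded yet or because no element on the page has that id (the solution code above locates the table through its data-qa attribute instead). A rough sketch that separates the lookup from the loop, so the failing step becomes visible, is below. The procedure name PrintActivityRows is hypothetical, and objIE is assumed to be the InternetExplorer object from the question.

```vba
' Hypothetical diagnostic procedure: look the table up first, then loop
' only if it was found.
Public Sub PrintActivityRows(ByVal objIE As Object)
    Dim tbl As Object
    Dim tr As Object

    Set tbl = objIE.document.getElementById("activity")
    If tbl Is Nothing Then
        ' Either the element is not loaded yet or the id is not 'activity'.
        Debug.Print "No element with id 'activity' was found."
        Exit Sub
    End If

    For Each tr In tbl.getElementsByTagName("tr")
        Debug.Print tr.innerText
    Next tr
End Sub
```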
Query C: I get a 'Subscript out of range' error on the following line and can progress no further:
Sheets("Sheet1").Range("A" & y).Value = ele.Children(0).textContent
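On a line like this, 'Subscript out of range' (run-time error 9) is typically raised by Sheets("Sheet1") when the active workbook has no sheet with that exact name, since an unqualified Sheets refers to the active workbook. That is an assumption about this case rather than a confirmed diagnosis. A small sketch for checking it follows; the function name WorksheetExists is hypothetical.

```vba
' Hypothetical helper: returns True if wb (default: the workbook holding
' this code) contains a worksheet with the given name.
Public Function WorksheetExists(ByVal sheetName As String, _
                                Optional ByVal wb As Workbook) As Boolean
    Dim ws As Worksheet

    If wb Is Nothing Then Set wb = ThisWorkbook
    On Error Resume Next          ' a missing sheet raises error 9 here
    Set ws = wb.Worksheets(sheetName)
    On Error GoTo 0
    WorksheetExists = Not ws Is Nothing
End Function
```

Running Debug.Print WorksheetExists("Sheet1") in the Immediate window shows whether the name matches. Qualifying the reference, as the solution code does with ThisWorkbook.Worksheets("Sheet1"), also avoids depending on whichever workbook happens to be active.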
Sub googlesearch3()
    Set objIE = CreateObject("InternetExplorer.Application")
    WebSite = "websiteurl"
    With objIE
        .Visible = True
        .navigate WebSite
        Do While .Busy Or .readyState <> 4
            DoEvents
        Loop
        .document.querySelector("[data-qa='show-history']").Click
    End With
    'within the 'history-dialog-audit-logs' table, loop and extract data
    'we will output data to excel, starting on row 1
    y = 1
    'look at all the 'tr' elements in the 'table' with id 'activity',
    'and evaluate each, one at a time, using 'ele' variable
    For Each ele In objIE.document.getElementById("activity").getElementsByTagName("tr")
        'show the text content of 'tr' element being looked at
        Debug.Print ele.textContent
        'each 'tr' (table row) element contains 4 children ('td') elements
        'put text of 1st 'td' in col A
        Sheets("Sheet1").Range("A" & y).Value = ele.Children(0).textContent
        'put text of 2nd 'td' in col B
        Sheets("Sheet1").Range("B" & y).Value = ele.Children(1).textContent
        'put text of 3rd 'td' in col C
        Sheets("Sheet1").Range("C" & y).Value = ele.Children(2).textContent
        'put text of 4th 'td' in col D
        Sheets("Sheet1").Range("D" & y).Value = ele.Children(3).textContent
        'increment row counter by 1
        y = y + 1
        'repeat until last ele has been evaluated
    Next
    'check if word 'completed' is mentioned anywhere, if so update 'Status' to 'Completed' and search for text.
    'Find "signed the envelope" and show all text before this until you find <td>. Stop after one occurrence
    'store text in 'LastSigned' string
    'find "sent an invitation to" and show all text before this until you find <td>. Stop after one occurrence
    'store text in 'CurrentlyWith' string
    'release the browser object (the variable is objIE, not IE)
    Set objIE = Nothing
End Sub
Additional: I've tried the answer here, but the Dim statements didn't work...