1

I am trying to build a macro to web scrape the status of a Cargo Shipment based on the shipment number. I am using the XML-HTTP method but I am new to VBA web scraping. I have tried to get the value by using the GetValuebyID,Tag, Class with no success.

The highlighted line is the one I need the value extracted from. [Need to Extract the 10 of 10 Delivered Value][1]

This is how far I have gotten with the code.

Sub FlightStat()

Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim AllTables As IHTMLElementCollection
Dim MainTable As IHTMLTable


XMLReq.Open "GET", "https://www.unitedcargo.com/OurNetwork/TrackingCargo1512/Tracking.jsp?id=10205436&pfx=016", False

XMLReq.send

If XMLReq.Status <> 200 Then
    MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Exit Sub
End If

HTMLDoc.body.innerHTML = XMLReq.responseText

Set AllTables = HTMLDoc.getElementsByTagID("dispTable0")

  

End Sub

I would be grateful if someone could help me get the "10 of 10 Delivered" value extracted [1]: https://i.stack.imgur.com/xcOAZ.png

Achal Desai
  • 93
  • 1
  • 8
  • When you say "**with no success**", it doesn't tell us anything. Please explain what happens when you run your code.. is there an error thrown? if so, what is the error? if not, what happens? – Zac Sep 04 '20 at 10:38
  • The table you need will be load as dynamic content. So you can't use xhr (as far as I know) But you can do it with the IE. You must wait to load the dynamic content after the IE reports page load is complete. Furthermore there is no method called `getElementsByTagID()`. What you want is `getElementByID()`. – Zwenn Sep 04 '20 at 11:26
  • @Zac, Sorry for that. That was a typo. I was trying to type ByTag, ByID – Achal Desai Sep 04 '20 at 12:34

2 Answers2

1

Ok, like I wrote in my comment. You can scrape the status with the IE.

Please note: The following code has no timeout built in if the dynamic content cannot be loaded. There is also no check whether the number passed in the URL is correct.

Sub FlightStat()

Dim url As String
Dim ie As Object
Dim nodeTable As Object

  'You can handle the parameters id and pfx in a loop to scrape dynamic numbers
  url = "https://www.unitedcargo.com/OurNetwork/TrackingCargo1512/Tracking.jsp?id=10205436&pfx=016"

  'Initialize Internet Explorer, set visibility,
  'call URL and wait until page is fully loaded
  Set ie = CreateObject("InternetExplorer.Application")
  ie.Visible = False
  ie.navigate url
  Do Until ie.readyState = 4: DoEvents: Loop
  
  'Wait to load dynamic content after IE reports it's ready
  'We can do that in a loop to match the point the information is available
  Do
    On Error Resume Next
    Set nodeTable = ie.document.getElementByID("dispTable0")
    On Error GoTo 0
  Loop Until Not nodeTable Is Nothing
  
  'Get the status from the table
  MsgBox Trim(nodeTable.getElementsByTagName("li")(2).innertext)
  
  'Clean up
  ie.Quit
  Set ie = Nothing
  Set nodeTable = Nothing
End Sub
Zwenn
  • 2,147
  • 2
  • 8
  • 14
0

You can absolutely do it with xmlhttp. You just need the right endpoint to query. As it returns json you really should use a json parser, or Instr/InstrRev (for small simple string extraction). However, as I didn't want to import an external dependency (other than ticking the add reference in VBE), and the response format is standard, I went with regex. The 10 of 10 is calculated as the number of items delivered over the number of items received (start and end pieces), along with the first statusDescription; the latest tracking info always comes first in the string.

This will be much quicker than using a browser.

Option Explicit

Public Sub FlightStat()

    Dim XMLReq As New MSXML2.XMLHTTP60, re As VBScript_RegExp_55.RegExp 'required reference Microsoft VBScript Regular Expressions
    
    Set re = New VBScript_RegExp_55.RegExp
    
    With XMLReq
    
        .Open "GET", "https://www.unitedcargo.com/TrackingServlet?BranchCode=&CompanyName=Test&DocumentNumbers=016-10205436&UserName=&_=" & toUnix(Now()), False
        .send

        If .status <> 200 Then
            MsgBox "Problem" & vbNewLine & .status & " - " & .statusText
            Exit Sub
        End If
    
        Dim s As String, output As String, matches As VBScript_RegExp_55.MatchCollection
        
        s = .responseText
        
    End With

    With re
    
        .Pattern = """Pieces"":""(.*?)"""
        .Global = True
        
        Set matches = .Execute(s)
        
        Dim status As String
        
        .Pattern = "StatusDescription"":""(.*?)"""
        .Global = False
        status = .Execute(s)(0).SubMatches(0)
        output = matches.Item(0).SubMatches(0) & " of " & matches.Item(matches.Count - 1).SubMatches(0) & Chr$(32) & status
        
        Debug.Print output
        
    End With
End Sub

Public Function toUnix(ByVal dt As Variant) As Long
    '@TimWilliams https://stackoverflow.com/a/12326121
    toUnix = DateDiff("s", "1/1/1970", dt)
End Function
QHarr
  • 83,427
  • 12
  • 54
  • 101