1

I have the following HTML code on 17track.net:

 <div class="tools"> <button type="button" class="btn btn-sm icon fa-copy-files-o waves- 
effect" data-toggle="tooltip" data-placement="top" data-original-title="Copy detailed 
 tracking results for all numbers." id="cl-details" data-clipboard-text="Number: 
 LW571320552CN
 Package status: Delivered (26 Days)
 Country: China -> United States
 Destination:
 2020-04-02 13:03, CATAWBA, SC 29704, Delivered, In/At Mailbox -> Your item was delivered in 
 or at the mailbox at 1:03 pm on April 2, 2020 in CATAWBA, SC 29704.
 2020-04-02 08:13, CATAWBA, SC 29704, Out for Delivery
 2020-04-02 08:02, CATAWBA, SC 29704, Arrived at Unit
 2020-04-01 10:59, MID CAROLINA-CHARLOTTE NC DISTRIBUTION CENTER, Arrived at USPS Regional 
 Destination Facility
 2020-03-31 00:00, In Transit to Next Facility
 2020-03-30 10:02, ALBANY NY DISTRIBUTION CENTER, Arrived at USPS Regional Facility
 2020-03-28 09:54, ISC NEW YORK NY(USPS), Processed Through Facility
 2020-03-28 09:54, Origin Post is Preparing Shipment

Effectively, I simply need to get the package status. I can also be OK with this just pasting the entire text into excel and I can then parse through it with VBA. For some reason, my VBA code is not working to retrieve this through a simple scrape. I am new to HTML so thank you for your help.

 Sub TrackData()

 Dim element As IHTMLElement
 Dim elements As IHTMLElementCollection
 Dim ie As InternetExplorer

 Dim html As HTMLDocument



 Set ie = New InternetExplorer

 ie.Visible = True
 ie.navigate "https://t.17track.net/en#nums=LW572098229CN"

 'Wait until IE has loaded the web page

 Do While ie.readyState <> READYSTATE_COMPLETE


 DoEvents

 Loop

 Set html = ie.document

 Set elements = html.getElementsByClassName("tools")

 Dim count As Long
 Dim erow As Long
 count = 0
 For Each element In elements
 If element.Children = "Package Status:" Then
 Sheets("Stage").Cells(1, 1) = "Found It"
 End If
 Next element



 End Sub
QHarr
  • 83,427
  • 12
  • 54
  • 101
Michael R
  • 93
  • 1
  • 8

1 Answers1

2

The value you want to search for is part of an attribute value; meaning, you would need to check either the .OuterHTML or the specific attribute data-clipboard-text value for that string. You don't need a loop over multiple nodes. You can use the id to target that node specifically. I would also add in the attribute name to the node selection to ensure that attribute is present without needing a hasAttribute check on the node. I would also add a small pause to ensure page has time to dynamically retrieve content and check that the node is actually found.

Option Explicit

Public Sub CheckForPackageStatus()

    Dim ie As New SHDocVw.InternetExplorer

    Set ie = New SHDocVw.InternetExplorer

    With ie

        .Visible = True
        .Navigate2 "https://t.17track.net/en#nums=LW572098229CN"

        Do: DoEvents: Loop While .Busy Or .ReadyState <> READYSTATE_COMPLETE

        Application.Wait Now + TimeSerial(0, 0, 1)

        Dim node As Object

        Set node = .Document.querySelector("#cl-details[data-clipboard-text]")

        If Not node Is Nothing Then
                If InStr(node.getattribute("data-clipboard-text"), "Package status:") > 0 Then
                    ActiveSheet.Cells(1, 1) = "Found it"
                End If
        End If
    End With
End Sub
QHarr
  • 83,427
  • 12
  • 54
  • 101
  • Fantastic! Thank you. Any way to do this without opening IE? – Michael R Apr 10 '20 at 11:38
  • So I found how to get the responseText without opening IE, and it is seemingly much faster, but the responseText doesn't seem to contain the attributes/elements I need. Why would that be? Is there any way to pull the element info without opening IE in a similar way to this? https://stackoverflow.com/questions/47592151/how-do-i-parse-html-without-creating-an-object-of-internet-explorer-in-vba – Michael R Apr 10 '20 at 12:10
  • Check the data isn't stored elsewhere. However, I think that if you wish to use xhr you first need to grab the appropriate cookies from the initial url then see if those will be accepted by the rest API call (assuming you are calling https://t.17track.net/restapi/track) Does look like the API is intended for commercial use though: https://help.17track.net/hc/en-us/articles/235261928-Does-17TRACK-have-API- – QHarr Apr 10 '20 at 12:29
  • I'm unsure what you mean by "stored elsewhere". Isn't the responstText the full HTML code from the site? – Michael R Apr 10 '20 at 12:32
  • The response text is the static initial content. Many modern webpages run javascript in the browser which updates content - you won't get that by using the initial url. You can use the network tab via dev tools to see if you can capture any calls the webpage does to update content. That is where I saw the API call which returns the content you are after. So, I think you need that endpoint to get the data but looks like it is not intended for public use. – QHarr Apr 10 '20 at 12:35
  • By stored elsewhere I mean that sometimes the response text has the content you want stored in a different attribute or in a script tag where it is accessed by the browser when running scripts and loaded into the place where you manually see it after rendering engine and javascript engine have done there thing on page load in the browser. – QHarr Apr 10 '20 at 12:36