Using getElementsByClassName with VBA

Question

I wrote the following code to retrieve data from an API and get the content of a specific element. I have written similar code that works with web pages (not an API), but in this example it breaks and I don't understand why.

Sub parseXML()
    Dim xmldoc As Object
    Dim obj As Object
    Dim MyRequest As Object

    Set MyRequest = CreateObject("MSXML2.XMLHTTP")
    MyRequest.Open "GET", _
        "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=ABL1%20Acute%20granulocytic%20leukemia"
    MyRequest.send
    While Not MyRequest.readyState = 4
        DoEvents
    Wend
    Set xmldoc = MyRequest.responseXML
    Set obj = xmldoc.DocumentElement.getElementsByClassName("ui-ncbihistogram-display-area")(0) 'Error: object doesn't support this property or method
End Sub

How should I write the last line to remove the error? I am trying to get the element with class="ui-ncbihistogram-display-area". It contains several <li> elements, and in the end I want to retrieve the inner text of each of these <li> elements.

What data do you want to get from that document? What specific element? Use https://stackoverflow.com/posts/57132271/edit to edit/update the question, and add the information about exactly what data/element you want to get from the document. — sideshowbarker, Jul 21 '19 at 11:00

QHarr · Accepted Answer · 2019-07-21T12:52:10.060

You are trying to use syntax that would work with an HTML parser on the PubMed search results page for the same term, i.e. you would be selecting the histogram on that web page.

However, the API call you are making returns an XML document whose fields are specified in the API documentation. With an XML parser, you would use XPath to specify what you want (this lets you be more specific about attributes and parent-child relationships). If you use an HTML parser instead, look at selecting by tag. The histogram element present in the web page is not present in the API response, and neither is the information it shows. The closest match is the total result count for the query, i.e. 248, which the web page histogram breaks down by year but the XML API response reports only as a total.

Sample comparison:

I would recommend using an XML parser for XML; you should then get the expected result. Compare the two parsers below: content is lost when the response is loaded into an HTMLDocument.

Public Sub DifferentParsing()
    ' Early binding of HTMLDocument requires a reference to the Microsoft HTML Object Library
    Dim xmlDoc As Object, htmlDoc As HTMLDocument, url As String

    Set xmlDoc = CreateObject("MSXML2.DOMDocument") 'New MSXML2.DOMDocument60
    Set htmlDoc = New HTMLDocument
    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term=ABL1%20Acute%20granulocytic%20leukemia"

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False 'synchronous request
        .send
        htmlDoc.body.innerHTML = .responseText 'HTML parser: part of the XML is lost
        xmlDoc.LoadXML .responseXML.XML        'XML parser: document kept intact
    End With
    Debug.Print xmlDoc.SelectSingleNode("//Count").Text            'total result count from the XML parser
    Debug.Print htmlDoc.getElementsByTagName("Count")(0).innerText 'does not return the count: content was lost
End Sub
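
As a further illustration of the XPath approach, here is a minimal sketch that parses a hard-coded sample shaped like an eSearch response, so it runs without a network call. The sample XML, its Id values, and the procedure name ListPubMedIds are assumptions made for illustration and are not part of the original answer; the real response contains more fields.

```vba
Public Sub ListPubMedIds()
    ' Assumption: a trimmed, made-up sample shaped like an eSearch response.
    ' The Id values are placeholders, not real PubMed IDs.
    Const SAMPLE_XML As String = _
        "<eSearchResult><Count>2</Count>" & _
        "<IdList><Id>111</Id><Id>222</Id></IdList></eSearchResult>"

    Dim xmlDoc As Object, node As Object

    Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False

    ' LoadXML returns False when the string is not well-formed XML
    If Not xmlDoc.LoadXML(SAMPLE_XML) Then
        Debug.Print "Parse error: " & xmlDoc.parseError.reason
        Exit Sub
    End If

    ' A single node selected by an absolute XPath
    Debug.Print "Count: " & xmlDoc.SelectSingleNode("/eSearchResult/Count").Text

    ' Every Id element under IdList, using the parent-child relationship in the path
    For Each node In xmlDoc.SelectNodes("/eSearchResult/IdList/Id")
        Debug.Print node.Text
    Next node
End Sub
```

To try the same paths against the live API, the LoadXML call could be fed the responseText of the request instead of the sample string.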

As for this line, `Debug.Print htmlDoc.getElementsByTagName("Count")(0).innerText` doesn't return 248 as the previous line does. I doubt that Count is a tag — YasserKhalil, Jul 21 '19 at 17:21
@YasserKhalil hi, look at my answer. The response loaded into the HTMLDocument is partial. The beginning is stripped and there is in fact no opening tag. If you print htmlDoc.body.innerHTML you will see. This was my point about using an XML parser in this instance. — QHarr, Jul 21 '19 at 17:23
Thanks @QHarr for the detailed answer. While I did not achieve my goal you made it clear that the API returns something completely different than what I assumed so at least the error makes sense now. If I want to return the data I am looking for I will need to modify the API search to begin with. — David912, Jul 22 '19 at 07:24
