1

I am trying to use a VBA macro to parse XML file. Given with the following structure:

  <bookstore>
  <book category="children">
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title>Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
  </bookstore>

How can I enumerate the output with element tags with its corresponding values as shown below?

book | category | children
title | harry potter
author | J K. Rowling
...

My code as follows:

Set xmlFile = CreateObject("Microsoft.XMLDOM")
xmlFile.Load (file)
Set qXML = xmlFile.SelectNodes("/bookstore")
For i = 0 To qXML.Length - 1
  Debug.Print CStr(qXML(i).Text)
Next i
T.M.
  • 9,436
  • 3
  • 33
  • 57
onlineCake
  • 13
  • 4
  • You edited your question after I answered it. Did you read it already? Should answer your question. – AHeyne Dec 14 '20 at 11:00
  • Thanks for your reply. Yes, slight change for clarity but no change to the code at all. I tried to run with the solution you provided and it shows runtime error '91', object variable or with block variable not set. – onlineCake Dec 14 '20 at 13:03
  • I expected you code was running so far. But it seems its not like this. I used early binding in my project and so different code to instantiate the object. I edit my answer now. – AHeyne Dec 14 '20 at 13:16

2 Answers2

2

How to get Tag Names

"What's the XPath syntax to get Tag Names?"

Strictly speaking it's (XML)DOM syntax to to get .Name and/or .NodeName properties; XMLDOM (Document Object Model) is a cross-platform and language-independent interface treating the document as a tree structure and allowing programmatic access to the tree.

You can use, however the special syntax of XPath expressions (like e.g. "/bookstore/book/title") to address any logical part in the hierarchical xml document structure.

So a solution close to your OP would be:

Option Explicit             ' declaration head of your code module

Sub ExampleCall()
    Dim file As String: file = ThisWorkbook.Path & "\xml\bookstore.xml"
    Dim xmlFile As Object
    Set xmlFile = CreateObject("Microsoft.XMLDOM")
    If xmlFile.Load(file) Then
        Dim qXML As Object
        Set qXML = xmlFile.DocumentElement.SelectNodes("book")
        Dim q As Object
        For Each q In qXML
            Dim cnt As Long: cnt = cnt + 1
            Debug.Print Format(cnt, "--- 000 ---")
            Debug.Print q.Attributes(0).Name, "|" & q.Attributes(0).Text
            Dim i As Long
            For i = 0 To q.ChildNodes.Length - 1
                Debug.Print q.ChildNodes(i).nodeName, "|" & q.ChildNodes(i).Text
            Next
        Next
    End If
End Sub

Results in VBE's immediate window

--- 01 ---
category      |children
title         |Harry Potter
author        |J K. Rowling
year          |2005
price         |29.99
--- 02 ---
category      |web
title         |Learning XML
author        |Erik T. Ray
year          |2003
price         |39.95

Side note

As Microsoft.XMLDOM has been deprecated for years, I'd prefer binding to ►MSXML2 in the most current xml version Microsoft XML,v6.0, e.g. via

I. LATE Binding (as in OP)

    Dim xDoc As Object
    Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")

II. EARLY Binding

    Dim xDoc As MSXML2.DOMDocument60     ' *) whereas MSXML2.DOMDocument (=old version 3.0)
    Set xDoc = New MSXML2.DOMDocument60  ' mind the missing point in digits

Side note: OP uses the object variable XMLFile instead of xDoc

Note that referencing DOMDocument without obvious versioning would bind internally to 3.0 by default (the last stable version before 6.0, any other versions are deprecated).

Further links

T.M.
  • 9,436
  • 3
  • 33
  • 57
1

I suggest to use early binding. So in your project in the VBE add a reference (menu tools/references) to Microsoft XML, v6.0.

To determine the attribute and the values you can use this:

Dim xmlFile As MSXML2.DOMDocument60
Set xmlFile = New MSXML2.DOMDocument60

xmlFile.Load file

Dim qXML As MSXML2.IXMLDOMNodeList
Set qXML = xmlFile.SelectNodes("/bookstore/book")

Dim index As Long
For index = 0 To qXML.Length - 1
    Debug.Print qXML(index).SelectSingleNode("@category").Text
    Debug.Print qXML(index).SelectSingleNode("title").Text
    Debug.Print qXML(index).SelectSingleNode("author").Text
Next index
AHeyne
  • 3,377
  • 2
  • 11
  • 16
  • Friendly hint: The unchanged XPath expression from OP tries to address a *single* element, i.e. the possible `documentElement` and not a *NodeList*: try `Set qXML = xmlFile.SelectNodes("//book")` instead :-) – T.M. Dec 14 '20 at 20:09
  • @T.M.: Thanks for you hint, but unfortunately I don't understand what you try to tell me. Would you be so kind and be more descriptive? – AHeyne Dec 15 '20 at 15:24
  • Look one hierarchy level deeper: Your code raises a `Run time error 91` *(Object variable or With block variable not set)* as your node selections loop through not existing path queries within the `bookstore` object `qXML` (being also the documentElement/top level) and instead through the `book` elements the OP wants to analyze. Therefore the `bookstore` element cannot display sub nodes/attributes like `@category`, `title` or `author`, only the subsequent `book` nodes :-) – T.M. Dec 15 '20 at 18:53
  • 1
    Oh, darn, I missed that one. That must have happened during editing the answer. My test code ran here. Thanks for the info. I corrected that now. – AHeyne Dec 15 '20 at 19:08