Parse XML node names and content
"But it is only printing the parent node and
with in which it is printing the child not value.
But I need the name of the child note (sic!) also."
The original post doesn't take into account the special xml node hierarchy:
- A node element can dispose or not of one or several childnodes.
- A node's childnode can be a text element or itself e.g. a node element.
- A node's
.Text
property alone displays a joined string of the text elements of any subordinated childnodes.
So each complete parse action over several hierarchy levels includes a check for child nodes (.HasChildNodes
property).
In order not to loose a clear view over nested levels I urgently recommend a recursive approach.
This will be demonstrated by the main function listChildNodes()
.
This function uses late binding, but could be changed to early binding, too
by modifying the object declarations to precise MSXML2 declaration types.
Note that early binding would also use a slightly different DOMDocument type identification:
'(Early binding)
Dim xDoc As MSXML2.DOMDocument60 ' (or MSXML2.DOMDocument for old version 3.0)
Set xDoc = New MSXML2.DOMDocument60 ' set instance to memory
'LATE Binding
Dim xDoc As Object
Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")
To allow other users to avoid writing and saving an external file, it would be possible to load a xlm content string directly via .LoadXML
(instead of .Load
)
Dim XMLContentString as String
XMLContentString = "<?xml version="1.0" encoding="UTF-8"?><note>...</note>"
If xDoc.LoadXML(XMLContentString) Then
' ...
End If
Example call (including declaration head)
As additional feature this flexible example call not only displays
- node names and
- text contents (including possible
<!-- comments -->
),
- but also outputs a chapter-like id in the first target column. So the subordinated childnodes to the
<Me>
parent node (id# 6
) will be marked by 6.1
and 6.2
.
To memorize hierarchy levels a user defined type gets defined in the code module's declaration head.
(Note that I used the original xml content not changing the possible typo "exmaple111" in node Example [@id='exmaple111']
).*
Of course the initial XPath search can be modified to any other subnode request.
Option Explicit ' declaration head of code module
Type TLevels ' user defined type
levels() As Long
oldies() As String
End Type
Dim mem As TLevels ' declare array container for u.d.type
Sub ExampleCall()
ReDim mem.levels(0 To 4) ' define current level count
ReDim mem.oldies(0 To 4) ' define level ids
Dim xFileName As String
xFileName = ThisWorkbook.Path & "\xml\hierarchy.xml" ' << change to your needs
Dim xDoc As Object
Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")
xDoc.async = False
xDoc.validateOnParse = False
If xDoc.Load(xFileName) Then
' [1] write xml info to array with exact or assumed items count
Dim data As Variant: ReDim data(1 To xDoc.SelectNodes("//*").Length, 1 To 3)
' start call of recursive function
listChildNodes xDoc.DocumentElement.SelectSingleNode("Example[@id='exmaple111']"), data ' call help function listChildNodes
' [2] write results to target sheet ' << change to project's sheet Code(name)
With Sheet1
Dim r As Long, c As Long
r = UBound(data): c = UBound(data, 2)
'write titles
.Range("A1").Resize(r, c) = "" ' clear result range
.Range("A1").Resize(1, c) = Split("ID,NodeName,Text", ",") ' titles
'write data field array to target
.Range("A2").Resize(r, c) = data ' get 2-dim data array
End With
Else
MsgBox "Load Error " & xFileName
End If
Set xDoc = Nothing
End Sub

Recursive main function listChildNodes()
Note that late binding XML doesn't allow the use of IXMLDOMNodeType enumeration constants
e.g. 1 ... `NODE_ELEMENT`, 2 ... `NODE_ATTRIBUTE`, 3 ... `NODE_TEXT` etc.,
so you have to take the numeric equivalents.
Function listChildNodes(curNode As Object, _
ByRef v As Variant, _
Optional ByRef i As Long = 1, _
Optional curLvl As Long = 0 _
) As Boolean
' Purpose: assign the complete node structure to a 1-based 2-dim array
' Author: https://stackoverflow.com/users/6460297/t-m
' Date: 2021-04-04
' Escape clause
If curNode Is Nothing Then Exit Function
If i < 1 Then i = 1 ' one based items Counter
' Increase array size .. if needed
If i >= UBound(v) Then ' change array size if needed
Dim tmp As Variant
tmp = Application.Transpose(v) ' change rows to columns
ReDim Preserve tmp(1 To 3, 1 To UBound(v) + 1000) ' increase row numbers
v = Application.Transpose(tmp) ' transpose back
Erase tmp
End If
' Declare variables
Dim child As Object ' late bound node object
Dim bDisplay As Boolean
Dim prevLvl As Long
' Distinguish between different node types
Select Case curNode.NodeType
Case 3 ' 3 ... NODE_TEXT
' ---------------------------------------------------------------------
' A. It's nothing but a TextNode (i.e. a parent node's firstChild!)
' ---------------------------------------------------------------------
' write pure text content (NODE_TEXT) of parent elements
v(i, 3) = curNode.Text ' nodeValue of text node
' return boolean (i.e. yes, I'v found no further child elements)
listChildNodes = True
Exit Function
Case 1 ' 1 ... NODE_ELEMENT
' --------------------------------------------------------------
' B.1 NODE_ELEMENT WITHOUT text node immediately below,
' a) i.e. node followed by another node element <..>,
' (i.e. FirstChild.NodeType MUST not be of type NODE_TEXT = 3)
' b) or node element without any child node
' Note: a text content (NODE_TEXT) actually is a child node(!) to an element node
' (see section A. getting the FirstChild of a NODE_ELEMENT)
' --------------------------------------------------------------
If curNode.HasChildNodes Then
' a) display element followed by other Element nodes
If Not curNode.FirstChild.NodeType = 3 Then ' <>3 ... not a NODE_TEXT
bDisplay = True
End If
Else ' empty NODE_ELEMENT
' b) always display an empty node element
bDisplay = True
End If
If bDisplay Then
'write id + nodename
v(i, 1) = getID(v, i, curLvl)
v(i, 2) = curNode.nodeName
v(i, 2) = v(i, 2) & " " & getAtts(curNode)
i = i + 1
End If
' --------------------------------------------------------------
' B.2 check child nodes via recursion
' --------------------------------------------------------------
For Each child In curNode.ChildNodes
' ~~~~~~~~~~~~~~~~~~~~
' >> recursive call <<
' ~~~~~~~~~~~~~~~~~~~~
bDisplay = listChildNodes(child, v, i, curLvl + 1)
If bDisplay Then
'write id + nodename
v(i, 1) = getID(v, i, curLvl)
v(i, 2) = curNode.nodeName
v(i, 2) = v(i, 2) & " " & getAtts(curNode)
i = i + 1 ' increment counter
End If
Next child
Case 8 ' 8 ... NODE_COMMENT
' --------------------------------------------------------------
' C. Comment
' --------------------------------------------------------------
v(i, 1) = getID(v, i, curLvl)
v(i, 2) = curNode.nodeName
v(i, 3) = "'<!-- " & curNode.NodeValue & "-->"
i = i + 1 ' increment counter
End Select
End Function
Help function getID()
Returns a chapter-like level numbering (here in target column A:A
)
Function getID(v, i, curLvl As Long) As String
'Purpose: return chapter-like level id
'Note : called by recursive function listChildNodes()
'Author : https://stackoverflow.com/users/6460297/t-m
'Date : 2021-04-04
'a) get previous level
Dim prevLvl As Long
If i > 1 Then prevLvl = UBound(Split(v(i - 1, 1), ".")) + 1
If curLvl Then
Dim lvl As Long
'b) reset previous levels
If curLvl < prevLvl Then
For lvl = curLvl + 1 To UBound(mem.levels)
mem.levels(lvl) = 0
Next
ElseIf curLvl > prevLvl Then
mem.levels(curLvl) = 0
End If
'c) increment counter
mem.levels(curLvl) = mem.levels(curLvl) + 1
'd) create id and remember old one
getID = "'" & Mid(mem.oldies(curLvl - 1), 2) & IIf(curLvl > 1, ".", "") & mem.levels(curLvl)
mem.oldies(curLvl) = getID
End If
End Function
Help function getAtts()
Additional feature returning attribute names and values (column B:B
):
Function getAtts(ByRef node As Object) As String
' Purpose: return attribute(s) string, e.g. 'id="example111"]'
' Note: called by recursive function listChildNodes()
' Author: https://stackoverflow.com/users/6460297/t-m
If node.nodeName = "#comment" Then Exit Function
Dim sAtts As String, ii As Long
If node.Attributes.Length > 0 Then
ii = 0: sAtts = ""
For ii = 0 To node.Attributes.Length - 1
sAtts = sAtts & "[@" & node.Attributes.Item(ii).nodeName & "='" & node.Attributes.Item(ii).NodeValue & "']"
Next ii
End If
' return function value
getAtts = sAtts
End Function