This function retrieves the XPaths of an XML document:

''' <summary>
''' Gets all the XPath expressions of an <see cref="Xml.XmlDocument"/> document.
''' </summary>
''' <param name="Document">Indicates the <see cref="Xml.XmlDocument"/> object.</param>
''' <returns>List(Of System.String).</returns>
Public Function GetXmlXPaths(ByVal Document As Xml.XmlDocument) As List(Of String)

    Dim XPathList As New List(Of String)

    Dim XPath As String = String.Empty

    For Each Child As Xml.XmlNode In Document.ChildNodes

        If Child.NodeType = Xml.XmlNodeType.Element Then
            GetXmlXPaths(Child, XPathList, XPath)
        End If

    Next ' child

    Return XPathList

End Function

''' <summary>
''' Gets all the XPath expressions of an <see cref="Xml.XmlNode"/>.
''' </summary>
''' <param name="Node">Indicates the <see cref="Xml.XmlNode"/>.</param>
''' <param name="XPathList">Indicates the XPath list, passed by reference, as a <see cref="List(Of String)"/>.</param>
''' <param name="XPath">Indicates the current XPath.</param>
Private Sub GetXmlXPaths(ByVal Node As Xml.XmlNode,
                         ByRef XPathList As List(Of String),
                         Optional ByVal XPath As String = Nothing)

    XPath &= "/" & Node.Name

    If Not XPathList.Contains(XPath) Then
        XPathList.Add(XPath)
    End If

    For Each Child As Xml.XmlNode In Node.ChildNodes

        If Child.NodeType = Xml.XmlNodeType.Element Then
            GetXmlXPaths(Child, XPathList, XPath)
        End If

    Next ' child

End Sub

This is a usage example:

Private Sub Test() 

    Dim xml As New Xml.XmlDocument
    xml.LoadXml((<?xml version="1.0" encoding="Windows-1252"?>
                 <!--XML Songs Database-->
                 <Songs>
                     <Song><Name>My Song 1.mp3</Name></Song>
                     <Song><Name>My Song 2.ogg</Name></Song>
                     <Song><Name>My Song 3.wav</Name></Song>
                 </Songs>).ToString)

    Dim xPathList As List(Of String) = GetXmlXPaths(xml)

    ListBox1.Items.AddRange(xPathList.ToArray)


End Sub

I would like to improve the function by removing the recursion and, at the same time, turning it into an Iterator, like this:

Public Iterator Function GetXmlXPaths(ByVal document As Xml.XmlDocument) _
As IEnumerable(Of String)

    Yield ...

End Function

How could I do it?

ElektroStudios
  • Any specific reason to why you want to remove the recursion? – Magnus Feb 17 '15 at 19:35
  • For two reasons. The first is purely aesthetic (it's very ugly to call a separate method). The second is more important: the iterator implementation should improve execution time on very large XML documents, since it remembers the index in the collection. Thanks for the comment. – ElektroStudios Feb 17 '15 at 19:38
  • You'd have to build an iterator from a document traversal of the xml tree. I don't see how you manage that without some hidden bookkeeping of the current root path, which essentially represents the same information as the recursion stack. Thus I wouldn't expect too much of a performance boost. If you don't depend on the document _order_ when collecting the xpaths, you could first process the nodes of each set of sibling elements and recurse in a second pass. That should avoid the need to remember the index into the collection. – collapsar Feb 17 '15 at 19:59
  • Why do you think that removing recursion would improve the function? – Michael Kay Feb 18 '15 at 06:09
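
The bookkeeping mentioned in the comments can be made explicit. Below is a minimal sketch, not a definitive implementation, that replaces the recursion with a `Stack` holding each pending node together with the path built so far. The module name `XPathStackSketch` and the function name `EnumerateXPaths` are hypothetical, and a `HashSet` is assumed as a stand-in for the `List.Contains` check in the question.

```vbnet
Imports System.Collections.Generic
Imports System.Xml

Public Module XPathStackSketch

    ' Hypothetical name; walks the tree with an explicit stack instead of recursion.
    Public Iterator Function EnumerateXPaths(ByVal document As XmlDocument) As IEnumerable(Of String)

        Dim seen As New HashSet(Of String)
        Dim pending As New Stack(Of KeyValuePair(Of XmlNode, String))

        ' The document node itself contributes an empty path prefix.
        pending.Push(New KeyValuePair(Of XmlNode, String)(document, String.Empty))

        While pending.Count > 0

            Dim current As KeyValuePair(Of XmlNode, String) = pending.Pop()

            ' Only elements produce a path; each distinct path is yielded once.
            If current.Key.NodeType = XmlNodeType.Element AndAlso seen.Add(current.Value) Then
                Yield current.Value
            End If

            ' Push children from last to first so they are popped in document order.
            Dim child As XmlNode = current.Key.LastChild
            While child IsNot Nothing
                If child.NodeType = XmlNodeType.Element Then
                    pending.Push(New KeyValuePair(Of XmlNode, String)(child, current.Value & "/" & child.Name))
                End If
                child = child.PreviousSibling
            End While

        End While

    End Function

End Module
```

The stack carries the same information the call stack carried in the recursive version, so this mainly changes the shape of the code; the paths are produced lazily, one per `For Each` step.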

1 Answer

You can try using the Read function of the XmlReader, which reads all nodes one by one. Then it's just a matter of storing the previous nodes in a list. I don't know how well this will work with bigger XML documents.

    Dim xml As New Xml.XmlDocument
    xml.LoadXml((<?xml version="1.0" encoding="Windows-1252"?>
                 <!--XML Songs Database-->
                 <Songs>
                     <Song><Name>My Song 1.mp3</Name></Song>
                     <Song><Name>My Song 2.ogg</Name></Song>
                     <Song><Name>My Song 3.wav</Name></Song>
                 </Songs>).ToString)

    Dim xPathList As New List(Of String)

    Dim reader As XmlReader = xml.CreateNavigator.ReadSubtree
    Dim curNodes As New List(Of String)
    Dim curPath As String

    While reader.Read()
        If reader.NodeType = XmlNodeType.Element Then
            If curNodes.Count <= reader.Depth Then
                curNodes.Add(reader.Name)
            Else
                curNodes(reader.Depth) = reader.Name
            End If

            ' Prefix "/" so the paths match the question's output (e.g. /Songs/Song).
            curPath = "/" & String.Join("/", curNodes.ToArray(), 0, reader.Depth + 1)

            If Not xPathList.Contains(curPath) Then
                xPathList.Add(curPath)
            End If
        End If
    End While

    For Each s As String In xPathList
        Console.WriteLine(s)
    Next
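
Since the question asks for an Iterator, the same reader-based idea can be wrapped in one. This is only a sketch under a few assumptions: the names `XPathReaderSketch` and `ReadXPaths` are hypothetical, the reader is assumed to be positioned at the start of the document so that the root element reports `Depth` 0, and a `HashSet` is swapped in for the `Contains` lookup.

```vbnet
Imports System.Collections.Generic
Imports System.Xml

Public Module XPathReaderSketch

    ' Hypothetical name; yields each distinct element path as the reader advances.
    Public Iterator Function ReadXPaths(ByVal reader As XmlReader) As IEnumerable(Of String)

        Dim seen As New HashSet(Of String)
        Dim curNodes As New List(Of String)

        While reader.Read()
            If reader.NodeType = XmlNodeType.Element Then

                ' Drop names at or below the current depth, then record this element.
                If curNodes.Count > reader.Depth Then
                    curNodes.RemoveRange(reader.Depth, curNodes.Count - reader.Depth)
                End If
                curNodes.Add(reader.Name)

                Dim curPath As String = "/" & String.Join("/", curNodes.ToArray())

                ' HashSet.Add returns False for a path that was already yielded.
                If seen.Add(curPath) Then
                    Yield curPath
                End If
            End If
        End While

    End Function

End Module
```

It could be consumed with `For Each s As String In ReadXPaths(reader)`, passing the reader created above. Nothing is read until the caller starts enumerating.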
the_lotus
  • I said what I said, but really I'm not very worried about performance for this solution; I prefer aesthetics here (sorry for my English). Thanks for the help. – ElektroStudios Feb 17 '15 at 20:17