
What is the easiest way for a Word macro to execute XPath expressions such as:

"string(/alpha/beta)" 

"not(string(/alpha/beta)='true')" 

which should return a string and a boolean respectively (as opposed to an XML node or node list)?

I want to avoid DLLs which won't already be present on a machine running Office 2007 or 2010.

Function selectSingleNode(queryString As String) returns an IXMLDOMNode, so that won't do.

In other words, is there something similar to .NET's XPathNavigator.Evaluate [1], which does this?

[1] http://msdn.microsoft.com/en-us/library/2c16b7x8.aspx

Dimitre Novatchev
JasonPlutext

1 Answer


You can use an XSL transform to evaluate XPath expressions, specifically xsl:value-of.

I wrote an Evaluate function that works on this principle. It creates an XSL stylesheet in memory containing a single template, sets the XPath expression as the select attribute of that template's xsl:value-of, and runs the transform to produce a new XML document with the result in a <result> node. It then checks that value-of returned something (raising an error if not) and converts the result text to one of the following data types: Long, Double, Boolean, or String.

Here are some tests I used to exercise the code. I used the books.xml file from the MSDN page you linked to (you'll have to change the path to books.xml if you want to run these tests).

Public Sub Test_Evaluate()

    Dim doc As New DOMDocument
    Dim value As Variant

    doc.async = False
    doc.Load "C:\Development\StackOverflow\XPath Evaluation\books.xml"

    Debug.Assert (doc.parseError.errorCode = 0)

    ' Sum of book prices should be a Double and equal to 30.97
    '
    value = Evaluate(doc, "sum(descendant::price)")
    Debug.Assert TypeName(value) = "Double"
    Debug.Assert value = 30.97

    ' Title of second book using text() selector should be "The Confidence Man"
    '
    value = Evaluate(doc, "descendant::book[2]/title/text()")
    Debug.Assert TypeName(value) = "String"
    Debug.Assert value = "The Confidence Man"

    ' Title of second book using string() function should be "The Confidence Man"
    '
    value = Evaluate(doc, "string(/bookstore/book[2]/title)")
    Debug.Assert TypeName(value) = "String"
    Debug.Assert value = "The Confidence Man"

    ' Total number of books should be 3
    '
    value = Evaluate(doc, "count(descendant::book)")
    Debug.Assert TypeName(value) = "Long"
    Debug.Assert value = 3

    ' Title of first book should not be "The Great Gatsby"
    '
    value = Evaluate(doc, "not(string(/bookstore/book[1]/title)='The Great Gatsby')")
    Debug.Assert TypeName(value) = "Boolean"
    Debug.Assert value = True

    ' Genre of second book should be "novel"
    '
    value = Evaluate(doc, "string(/bookstore/book[2]/attribute::genre)='novel'")
    Debug.Assert TypeName(value) = "Boolean"
    Debug.Assert value = True

    ' Selecting a non-existent node should generate an error
    '
    On Error Resume Next

    value = Evaluate(doc, "string(/bookstore/paperback[1])")
    Debug.Assert Err.Number = vbObjectError

    On Error GoTo 0

End Sub
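
For the two expressions in the question itself, a call might look like the sketch below. It is not self-contained: it relies on the Evaluate function shown in this answer, and the <alpha><beta> document and the name Demo_QuestionExpressions are made up for illustration. One thing it shows is that the type conversion works on the result text alone, so a string whose text happens to be "true" comes back as a Boolean rather than a String.

```vba
' Sketch only: assumes the Evaluate function from this answer is in the same project.
' The sample XML and the procedure name are hypothetical.
Public Sub Demo_QuestionExpressions()

    Dim doc As New DOMDocument
    Dim value As Variant

    doc.async = False
    doc.loadXML "<alpha><beta>true</beta></alpha>"
    Debug.Assert (doc.parseError.errorCode = 0)

    ' string() yields the text "true"; Evaluate converts that text to a Boolean
    '
    value = Evaluate(doc, "string(/alpha/beta)")
    Debug.Assert TypeName(value) = "Boolean"
    Debug.Assert value = True

    ' The comparison is true, so not(...) is false
    '
    value = Evaluate(doc, "not(string(/alpha/beta)='true')")
    Debug.Assert TypeName(value) = "Boolean"
    Debug.Assert value = False

End Sub
```

If you need the literal text back when it could be "true", "false", or numeric, you would have to skip the conversion step at the end of Evaluate.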

And here is the code for the Evaluate function (the IsLong function is a helper function to make the data type conversion code a little more readable):


Note: As barrowc mentions in the comments, you can be explicit about which version of MSXML you want to use by replacing DOMDocument with a version-specific class name, such as DOMDocument30 (MSXML3) or DOMDocument60 (MSXML6). The code as written will default to using MSXML3, which is currently more widely deployed, but MSXML6 has better performance and, being the latest version, is the one Microsoft currently recommends.

See the question Which version of MSXML should I use? for more information about the different versions of MSXML.


Public Function Evaluate(ByVal doc As DOMDocument, ByVal xpath As String) As Variant

    Static styleDoc As DOMDocument
    Dim valueOf As IXMLDOMElement
    Dim resultDoc As DOMDocument
    Dim result As Variant

    If styleDoc Is Nothing Then

        Set styleDoc = New DOMDocument

        styleDoc.loadXML _
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
                "<xsl:template match='/'>" & _
                    "<result>" & _
                        "<xsl:value-of />" & _
                    "</result>" & _
                "</xsl:template>" & _
            "</xsl:stylesheet>"

    End If

    Set valueOf = styleDoc.selectSingleNode("//xsl:value-of")
    valueOf.setAttribute "select", xpath

    Set resultDoc = New DOMDocument
    doc.transformNodeToObject styleDoc, resultDoc

    If resultDoc.documentElement.childNodes.length = 0 Then
        Err.Raise vbObjectError, , "Expression '" & xpath & "' returned no results."
    End If

    result = resultDoc.documentElement.Text

    If IsLong(result) Then
        result = CLng(result)
    ElseIf IsNumeric(result) Then
        result = CDbl(result)
    ElseIf result = "true" Or result = "false" Then
        result = CBool(result)
    End If

    Evaluate = result

End Function

Private Function IsLong(ByVal value As Variant) As Boolean

    Dim temp As Long

    If Not IsNumeric(value) Then
        Exit Function
    End If

    On Error Resume Next

    temp = CLng(value)

    If Err.Number = 0 Then
        IsLong = (temp = CDbl(value))
    End If

End Function
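
If you switch to MSXML6 as discussed in the note above, one detail changes: MSXML6 resolves namespace prefixes in selectSingleNode through the SelectionNamespaces property, so the xsl prefix has to be declared before "//xsl:value-of" will match. Here is a minimal sketch of the stylesheet setup under that assumption. NewStyleDoc60 is a hypothetical helper name, and it uses late binding so that no project reference is required.

```vba
' Sketch only: a late-bound MSXML6 version of the stylesheet setup.
' NewStyleDoc60 is a hypothetical helper, not part of the code above.
Private Function NewStyleDoc60() As Object

    Dim styleDoc As Object

    ' "MSXML2.DOMDocument.6.0" is the MSXML6 ProgID
    Set styleDoc = CreateObject("MSXML2.DOMDocument.6.0")
    styleDoc.async = False

    styleDoc.loadXML _
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
            "<xsl:template match='/'>" & _
                "<result>" & _
                    "<xsl:value-of />" & _
                "</result>" & _
            "</xsl:template>" & _
        "</xsl:stylesheet>"

    ' Declare the xsl prefix so XPath queries against the stylesheet can use it
    styleDoc.setProperty "SelectionNamespaces", _
        "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"

    Set NewStyleDoc60 = styleDoc

End Function
```

As far as I know, the source document and the result document passed to the transform should then be MSXML6 documents as well, so Evaluate's parameter and local variable types would need the same change.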
Mike Spross
  • That's exactly how I thought I might have to do it. Nice to have it confirmed so quickly, and a very pleasant surprise to have the implementation delivered on a platter :-) – JasonPlutext Nov 07 '10 at 10:37
  • @plutext: No problem. It was an interesting question, and something I might use in an upcoming project, so I thought I'd write some code to see if it was doable. :-) – Mike Spross Nov 08 '10 at 03:36
  • +1 but with a slight caution on using `DOMDocument` rather than `DOMDocument60`. In this context, `DOMDocument` almost certainly equals `DOMDocument30` - see http://msdn.microsoft.com/en-us/library/ms757837%28v=VS.85%29.aspx - and see also http://stackoverflow.com/questions/951804/which-version-of-msxml-should-i-use – barrowc Nov 09 '10 at 01:51
  • @barrowc: I assume there would be better performance with large XML documents with MSXML6, but isn't MSXML3 more widely deployed? – Mike Spross Nov 10 '10 at 06:01
  • @Mike Spross: MSXML3 is indeed more widely deployed (as standard from Win2K SP4 onwards) than MSXML6 (as standard from Vista onwards; can be downloaded for Win2K and later). You can explicitly specify `DOMDocument30` if you want to work with MSXML3 though. I think using `DOMDocument` can lead to confusion if people don't realise that this means `DOMDocument30` and does not mean `DOMDocumentxx` where xx is the most recent MSXML version installed – barrowc Nov 10 '10 at 23:29
  • @barrowc: Gotcha. I'll make a note in my answer. – Mike Spross Nov 11 '10 at 00:28