
I have the following XPath expression:

//a[@attribute='my-attribute']

When I have the following element in the HTML that XPath is searching, it matches as expected:

<a attribute="my-attribute">Some text</a>

But if there is an <svg> tag under that element, XPath returns no match:

<a attribute="my-attribute">
    <svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"
        viewBox="0 0 24 24" focusable="false"></svg>
</a>

Why doesn't XPath match in this case? Is there a way I can modify my expression to make it match?

EDIT:

Apparently it has to do with the namespace on the <svg> element. Using the local-name() function makes it match in the XPath tester I'm using:

//*[local-name()='a' and @attribute='my-attribute']

However, this still doesn't match when running through Selenium WebDriver. Any idea of how to get this working with Selenium?

kjhughes
Andrew Mairose
  • Is there any specific question for us to answer or you are looking for generic ideas? – undetected Selenium Mar 27 '19 at 21:16
  • What is the Selenium outcome with `//*[local-name()='a' and @attribute='my-attribute']` when you have only the `a` tag without the `svg`? – supputuri Mar 27 '19 at 21:22
  • It has nothing to do with *namespaces* or the *child `svg` element* as long as you're trying to select the anchor node – JaSON Mar 27 '19 at 21:40
  • @JaSON I would have thought the same thing, but removing the `xmlns` attribute from the `svg` tag causes the `//a[@attribute='my-attribute']` expression to match. – Andrew Mairose Mar 28 '19 at 13:07
  • As you can see, your problem cannot be reproduced in http://www.xpathtester.com/xpath/91e66f48ea100183e9e3b1958ceed7b5 – Alejandro Mar 28 '19 at 22:23

2 Answers


You may be confused by how the XPath hosting environment is presenting the selected a elements.

Adding an svg element to the a element will not affect what's selected by

//a[@attribute='my-attribute']

In the case of

<a attribute="my-attribute">Some text</a>

the a element has a string value consisting of more than just white space characters, but with

<a attribute="my-attribute">
    <svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"
        viewBox="0 0 24 24" focusable="false"></svg>
</a>

the a element has a string value that consists only of whitespace, so a tool that displays the text of the selected nodes would appear to show nothing selected, even though the a element is still matched.

If you evaluate count(//a[@attribute='my-attribute']), you'll likely see the same results for both cases.
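
As a rough way to check this outside a browser, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is an XML parser with a limited XPath subset, not Selenium or a browser DOM, so treat it only as an illustration. The wrapping `div` and the name `match_summary` are assumptions added for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree needs a relative path, hence the leading "." (an adaptation for this sketch).
XPATH = ".//a[@attribute='my-attribute']"

# The wrapping <div> is hypothetical; it only gives the parser a root element.
TEXT_ONLY = """<div><a attribute="my-attribute">Some text</a></div>"""

WITH_SVG = """<div><a attribute="my-attribute">
    <svg xmlns="http://www.w3.org/2000/svg" width="100%" height="100%"
        viewBox="0 0 24 24" focusable="false"></svg>
</a></div>"""


def match_summary(markup):
    """Return (number of matched elements, stripped string value of each match)."""
    root = ET.fromstring(markup)
    matches = root.findall(XPATH)
    values = ["".join(m.itertext()).strip() for m in matches]
    return len(matches), values


print(match_summary(TEXT_ONLY))
print(match_summary(WITH_SVG))
```

Both calls report one matched element; only the string value differs, which is empty in the `svg` case once whitespace is stripped.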

kjhughes
  • From a generic perspective this answer should address OP's concerns. – undetected Selenium Mar 28 '19 at 06:13
  • @kjhughes I'm not trying to select the text of the `a` element. I'm trying to select the `a` element itself. Adding the `svg` element on its own doesn't affect what is selected by the `//a[@attribute='my-attribute']`, but it is the `xmlns` attribute on the `svg` element. Removing the `xmlns` attribute causes the xpath expression to match the `a` tag, but with the `xmlns` on the `svg` element, the expression yields no results. – Andrew Mairose Mar 28 '19 at 13:03
  • Adding an `svg` element ***does*** affect what's selected by that expression if the `svg` element has a `xmlns` attribute. I have verified this in an XPath tester. That is the entire reason for me asking this question. – Andrew Mairose Mar 28 '19 at 13:11
  • **No,** that would not happen. You've omitted some important detail, and the burden is on you to provide a true [mcve] that exhibits the impossible results you're describing. You will likely find your mistake in the course of producing such an MCVE because you won't be able to construct a ***complete*** and verifiable example that behaves as you describe. – kjhughes Mar 28 '19 at 13:19
0

The following is a possible solution in VB.NET.

Imports System.Text.RegularExpressions
Imports System.Xml

Public Class XmlNodeListWithNamespace
    ' see https://stackoverflow.com/questions/55385520/xpath-doesnt-match-when-desired-element-contains-child-elements
    ' @JaSON I would have thought the same thing,
    ' but removing the xmlns attribute from the svg tag
    ' causes the //a[@attribute='my-attribute'] expression to match.
    ' – Andrew Mairose
    ' Mar 28 at 13:07 "Asked 5 months ago  Active 5 months ago" implies 2019-03-28 13:07.

    ' Therefore, I first considered deleting all occurrences of
    '       xmlns="" and xmlns="http://www.w3.org/1999/xhtml"
    ' I did this using the following Replacement.
    ' gstrHtml = Regex.Replace(
    '            input:=gstrHtml,
    '            pattern:=" *xmlns=""[^""]*""",
    '            replacement:="",
    '            options:=RegexOptions.IgnoreCase
    '        )

    ' However, the solution below retains the namespace, while avoiding unsightly xpath strings.

    ''' <summary>
    ''' For a given xpath, returns an XmlNodeList, taking account of the xmlns namespace.
    ''' </summary>
    ''' <param name="oXmlDocument">The current XML document.</param>
    ''' <param name="xpath">A normal xpath string, without any namespace qualifier.</param>
    ''' <returns>The XmlNodeList for the given xpath.</returns>
    Public Shared Function NodeList(
        oXmlDocument As XmlDocument,
        xpath As String
    ) As XmlNodeList

        Dim strXpath As String = xpath

        ' Insert Namespace Qualifier.  For example, 
        '    "//pre"                                            becomes "//x:pre"
        '    "/html/body/form/div/pre"                          becomes "/x:html/x:body/x:form/x:div/x:pre"
        '    "//div[@id='nv_bot_contents']/pre"                 becomes "//x:div[@id='nv_bot_contents']/x:pre"
        '    "//div[@id='nv_bot_contents']/pre[@data-xxx='X2']" becomes "//x:div[@id='nv_bot_contents']/x:pre[@data-xxx='X2']"
        '    "//div[@id='nv_bot_contents']/pre[@data-xxx]"      becomes "//x:div[@id='nv_bot_contents']/x:pre[@data-xxx]"
        '    "//pre[@data-xxx]"                                 becomes "//x:pre[@data-xxx]"
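        ' The negative lookahead skips node tests such as text() and names that
        ' are already prefixed or are axis names (e.g. child::pre).
        strXpath = Regex.Replace(
                        input:=strXpath,
                        pattern:="(/)(\w+)(?![\w(:])",
                        replacement:="$1x:$2"
                    )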

        ' See https://stackoverflow.com/questions/40796231/how-does-xpath-deal-with-xml-namespaces/40796315#40796315
        Dim oXmlNamespaceManager As New XmlNamespaceManager(nameTable:=oXmlDocument.NameTable)
        oXmlNamespaceManager.AddNamespace("x", "http://www.w3.org/1999/xhtml")

        Dim oXmlNodeList As XmlNodeList = oXmlDocument.SelectNodes(
            xpath:=strXpath,
            nsmgr:=oXmlNamespaceManager
        )

        Return oXmlNodeList

    End Function

End Class

Sample invocation:

Dim oXmlNodeList As XmlNodeList =
            XmlNodeListWithNamespace.NodeList(
                oXmlDocument:=oXmlDocument,
                xpath:="//pre"
            )
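
For comparison, the same idea can be sketched in Python with the standard library. This is not a port of the class above, just an illustration under some assumptions: the sample document, the names `qualify` and `node_list`, and the lookahead that skips node tests such as `text()` are all made up for the example. ElementTree also only accepts relative paths, so the sketch handles `//...` expressions only.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample document whose elements live in the XHTML namespace.
DOC = f"""<html xmlns="{XHTML_NS}"><body>
  <div id="nv_bot_contents"><pre data-xxx="X2">hello</pre></div>
</body></html>"""


def qualify(xpath, prefix="x"):
    """Prefix each element name that directly follows a '/' with 'prefix:'."""
    # The lookahead leaves text(), axis names, and already-prefixed names alone.
    return re.sub(r"(/)(\w+)(?![\w(:])", rf"\1{prefix}:\2", xpath)


def node_list(root, xpath):
    """Evaluate an unprefixed '//...' path against namespaced elements."""
    return root.findall("." + qualify(xpath), namespaces={"x": XHTML_NS})


root = ET.fromstring(DOC)
print(qualify("//div[@id='nv_bot_contents']/pre"))
print(len(root.findall(".//pre")))       # unprefixed path, no namespace mapping
print(len(node_list(root, "//pre")))     # prefixed path with namespace mapping
```

The unprefixed path finds nothing because `pre` is in the XHTML namespace, while the rewritten path with the `x` prefix mapping finds it.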
Paul Margus