2

I need an XPath expression that finds text containing HTML line breaks (<br/>). For example:

<ul>
   <li>ABC<br/>DEF</li>
   <li>XYZ<br/>NOP</li>
</ul>

Let's say I'm trying to find the li that contains ABC<br/>DEF. I've tried the following:

$x("//li[normalize-space(.)='ABC DEF']")
$x("//li[text() ='ABC<br/>DEF']")
$x("//li[contains(., 'ABC DEF')]")

But they return nothing. I saw the answer to "XPath contains(text(),'some string') doesn't work when used with node with more than one Text subnode", but I couldn't figure out how to apply it to my case.

Ruan Mendes
2 Answers

2

The following expression will get you close:

li[br[preceding-sibling::node()[1] = 'ABC']
     [starts-with(following-sibling::node()[1], 'DEF')]]

If you need to match items where the text before the <br/> merely ends with ABC (rather than equals it), the expression gets a little longer, since XPath 1.0 has no ends-with() function.

The following transform will select the first matching li:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" />

    <xsl:template match="/">
        <matches>
            <xsl:copy-of select="(//li[br[preceding-sibling::node()[1] = 'ABC']
                                      [starts-with(following-sibling::node()[1], 'DEF')]])[1]" />
        </matches>
    </xsl:template>

</xsl:stylesheet>

Input:

<ul>
   <li>ABC<br/>DEF</li>
   <li>XYZ<br/>NOP</li>
   <li><p>XYZ<br/>NOP</p></li>
   <li>ABC<br/>DEF</li>
   <li>DEF GHI</li>
   <li>ABC<![CDATA[<br/>]]>DEF</li>
</ul>

Output:

<?xml version="1.0" encoding="utf-8"?>
<matches>
    <li>ABC<br />DEF</li>
</matches>
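
As a cross-check outside XSLT, here is a minimal Python sketch of the same test, using only the standard library. It is an approximation, not the answer's method: `xml.etree.ElementTree` does not support the sibling axes, so the predicate is re-implemented by hand in a hypothetical helper named `matches`, and it only handles plain text on either side of the `<br/>`. It also shows why comparing the whole string value against 'ABC DEF' fails: the two text nodes concatenate without a space.

```python
import xml.etree.ElementTree as ET

# Sample input copied from this answer.
SAMPLE = """<ul>
   <li>ABC<br/>DEF</li>
   <li>XYZ<br/>NOP</li>
   <li><p>XYZ<br/>NOP</p></li>
   <li>ABC<br/>DEF</li>
   <li>DEF GHI</li>
   <li>ABC<![CDATA[<br/>]]>DEF</li>
</ul>"""


def matches(li, before, after):
    """Hypothetical helper: True if a direct <br/> child of `li` is
    immediately preceded by text equal to `before` and immediately
    followed by text starting with `after`.

    Simplification: only text neighbours are considered; an element
    sitting directly next to the <br/> is treated as empty text.
    """
    prev_text = li.text or ""  # text before the first child element
    for child in li:
        if (child.tag == "br"
                and prev_text == before
                and (child.tail or "").startswith(after)):
            return True
        prev_text = child.tail or ""  # text between this child and the next
    return False


root = ET.fromstring(SAMPLE)
items = list(root.iter("li"))

# Zero-based positions of the matching <li> elements.
hits = [i for i, li in enumerate(items) if matches(li, "ABC", "DEF")]
print(hits)  # [0, 3]

# The string value of the first <li>: no space where the <br/> was.
print("".join(items[0].itertext()))  # ABCDEF
```

The CDATA item is not matched because its `<br/>` is literal text, not an element, which agrees with the XPath behaviour.
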
harpo
  • Nice one. So it is just any element that contains br as a direct child. The only assumption is that there is always text around it, or that this is not really important for the OP. – Harald Jan 03 '14 at 22:00
  • This returns any element with a line break, but that is not good enough for me. In my example, I only want to select the first `li`, not based on index, but because it contains the text `ABC DEF` – Ruan Mendes Jan 03 '14 at 22:05
  • This is closer to what I want. I have a cucumber step that is looking for a button with the label `Button 1` (as it looks on the web page), but the HTML is actually `Button<br/>1`. I think the only way to do this will be for my step definition to be something like `And I click on the 'Button|1' button`, and I will be forced to use that pipe as the special character that indicates a line break. Let me give it a few tries and I will mark this as correct. – Ruan Mendes Jan 03 '14 at 22:18
  • I've marked this as the correct answer, but I decided to find the button based on a CSS class name instead, since the XPath would be hard for others to maintain (if they know as little XPath as I do). Thanks a lot for a good solution, though. – Ruan Mendes Jan 03 '14 at 22:30
-2
//li[br]

This should work. It means: select all li elements that have a br child.
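
To see what this expression actually selects, here is a small sketch in Python using the standard library's `xml.etree.ElementTree`, which supports the `[tag]` child predicate. The sample markup is the list from the question with a few extra items added as an assumption for illustration. The expression returns every li with a direct br child, regardless of the surrounding text, so it does not single out the ABC/DEF item.

```python
import xml.etree.ElementTree as ET

# Question's list plus a few extra items (added here for illustration).
SAMPLE = """<ul>
   <li>ABC<br/>DEF</li>
   <li>XYZ<br/>NOP</li>
   <li><p>XYZ<br/>NOP</p></li>
   <li>ABC<br/>DEF</li>
   <li>DEF GHI</li>
</ul>"""

root = ET.fromstring(SAMPLE)

# ElementTree's limited XPath: li elements that have a direct br child.
with_br = root.findall(".//li[br]")

# Concatenated text of each selected item.
texts = ["".join(li.itertext()) for li in with_br]
print(texts)  # ['ABCDEF', 'XYZNOP', 'ABCDEF']
```

The item whose br sits inside a p is skipped, because the predicate only looks at direct children.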

Michał Tabor