I can match sometext and othertext in

<br>
sometext
<br>
othertext

using the XPath selector '//br/following-sibling::text()'

but if there is only whitespace after the <br> element

<br>

<br>
othertext

only the second match occurs. Is it possible to match whitespace as well?

I tried

//br/following-sibling::matches(., "\s+")

to attempt to match whitespace without success.

jela

1 Answer

`matches` tests a string against a regular expression; it is a function, not a node test, so it cannot follow an axis specifier. You can use it as a predicate instead:

//br/following-sibling::text()[matches(., "\s+")]

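Note that `\s+` is unanchored, so the expression above also matches text nodes that merely contain whitespace; use `^\s+$` to require whitespace only. Without regexes (which might be faster, depending on the implementation), you can check that the node is all whitespace and not the empty string: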

//br/following-sibling::text()[(normalize-space(.) = "") and (. != "")]
BeniBela
  • I'm having some trouble getting this to work. I tried using `//br/following-sibling::text()[matches(., "\s+")]` on `
    sdfs

    sdfd

    ` without result. I noticed that if I tried `//br/following-sibling::text()[matches(., "\s+")] | //br/following-sibling::text()` this also fails to produce a match, even though `//br/following-sibling::text()` does find a match on `sdfd`, suggesting there is perhaps some syntax problem with my use of matches(). `//br/following-sibling::text()[(normalize-space(.) eq "") and (. ne "")]` likewise gave no match.
    – jela Nov 21 '12 at 18:40
  • Which program are you using it in? `matches` only exists in XPath 2. Also, your example is not valid XML, so it needs some kind of HTML parser. – BeniBela Nov 21 '12 at 20:06
  • I'm using the PHP 5.3.8 version of XPath. It seems to work without valid XML, but [this question](http://stackoverflow.com/questions/11023160/how-do-i-update-the-version-of-xpath-in-php) suggests that I'm using 1.0, and there is no way to fix this. Is there an XPath 1 way to solve my problem? – jela Nov 22 '12 at 21:35
  • Do you want to find text nodes that are only whitespace or that contain whitespace? The normalize-space version should work for the former case in XPath 1. – BeniBela Nov 23 '12 at 11:27
  • I want to find text nodes containing only whitespace. However, `//br/following-sibling::text()[(normalize-space(.) eq "") and (. ne "")]` does not identify such a node. If I use `//br/following-sibling::text()` I get the expected results for non-whitespace-only nodes, but if I use the prior version to identify whitespace-only nodes, I get no result. I am testing with the text `
    sdfs

    sdfd

    `
    – jela Nov 24 '12 at 18:38
  • Perhaps you have to use = instead of eq, and != instead of ne in XPath 1 – BeniBela Nov 24 '12 at 22:25
  • I tried `//br/following-sibling::text()[(normalize-space(.) = "") and (. != "")]` and `//br/following-sibling::text()[(normalize-space(.) == "") and (. !== "")]` but these also produce no result. – jela Nov 25 '12 at 20:08
  • Well, the version with a single = works for me (e.g. in http://www.xpathtester.com/test, after changing the HTML into valid XML). Perhaps your HTML parser trims the text nodes. – BeniBela Nov 25 '12 at 23:09
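
To check whether a parser keeps whitespace-only text nodes at all, the XPath 1 predicate `[(normalize-space(.) = "") and (. != "")]` can be emulated by walking the DOM by hand. The following is only a rough sketch in Python using the standard-library `xml.dom.minidom`, not the PHP setup from the question. The `SAMPLE` string and both function names are made up for illustration, and the sample assumes the question's markup has been rewritten as well-formed XML.

```python
import re
from xml.dom import minidom

# Hypothetical sample: the question's second snippet as well-formed XML.
SAMPLE = "<root><br/>\n\n<br/>\nothertext\n</root>"


def normalize_space(s):
    # Mirrors XPath normalize-space(): collapse runs of XML whitespace
    # (space, tab, CR, LF) to one space, then strip both ends.
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


def whitespace_only_text_after_br(xml_source):
    # Emulates //br/following-sibling::text()[normalize-space(.) = "" and . != ""]
    doc = minidom.parseString(xml_source)
    found = []
    seen = set()  # an XPath node-set holds each node at most once
    for br in doc.getElementsByTagName("br"):
        sibling = br.nextSibling
        while sibling is not None:
            if sibling.nodeType == sibling.TEXT_NODE and id(sibling) not in seen:
                text = sibling.data
                if normalize_space(text) == "" and text != "":
                    seen.add(id(sibling))
                    found.append(text)
            sibling = sibling.nextSibling
    return found


if __name__ == "__main__":
    # repr() makes the otherwise invisible whitespace visible
    print([repr(t) for t in whitespace_only_text_after_br(SAMPLE)])
```

If a parser that preserves whitespace reports the node here while the real toolchain reports nothing for the same input, that would point to the parser trimming or dropping whitespace-only text nodes rather than to the XPath expression.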