
Consider a simple XML document:

<html><body>
<table>
<tr><td>   Item 1</td></tr>
<tr><td>  Item 2</td></tr>
</table>
</body></html>

Using the XPath expression /html/body/table/tr/td/text(), we get

["   Item 1", "  Item 2"]

Is it possible to trim the whitespace, for example with the normalize-space() function, to get this instead?

["Item 1", "Item 2"]

normalize-space(/html/body/table/tr/td/text()) yields the trimmed content of only the first td element: ["Item 1"].

Shiplu Mokaddim
Victor Olex

2 Answers


Using XPath "/html/body/table/tr/td/text()" we will get [" Item 1", " Item 2"].

Is it possible to trim white space, for example using the normalize-space() function, to get ["Item 1", "Item 2"]?

Not in XPath 1.0.

In XPath 2.0 this is simple:

/html/body/table/tr/td/text()/normalize-space(.) 

In XPath 2.0, a step of a path expression may be a function call. The expression above uses this to produce a sequence of xs:string items, each of which is the result of applying normalize-space() to the context node (that is, to each node selected by the subexpression that precedes the last step).
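To illustrate the difference, here is a minimal sketch in plain Python that models the two behaviours on a list of strings. It is not an XPath engine: `normalize_space`, `xpath1_call` and `xpath2_step` are hypothetical names made up for this illustration, and the list stands in for the selected text nodes.

```python
import re

# XPath treats only space, tab, CR and LF as whitespace.
_XPATH_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(value):
    """Model of normalize-space(): collapse whitespace runs, trim both ends."""
    return _XPATH_WS.sub(" ", value).strip(" ")


def xpath1_call(nodes):
    """XPath 1.0 model: a node-set argument is reduced to the string value
    of its first node, so only one string comes back."""
    return normalize_space(nodes[0]) if nodes else ""


def xpath2_step(nodes):
    """XPath 2.0 model: the function is the last step, so it is evaluated
    once per selected node and the results form a sequence."""
    return [normalize_space(node) for node in nodes]


text_nodes = ["   Item 1", "  Item 2"]  # stand-in for the selected text() nodes
print(xpath1_call(text_nodes))
print(xpath2_step(text_nodes))
```

The first call mirrors what the question observed (only the first td is trimmed); the second mirrors the per-node evaluation described above.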

Dimitre Novatchev
  • Where to verify **XPath 1.0** or **XPath2.0**? – zionpi Apr 18 '18 at 11:29
  • @zionpi Read the documentation of your XPath API (or host provider, or language reference). Also, if the evaluation of the XPath 2.0 expression fails, then you'll know that the XPath processor you are using supports only XPath 1.0 – Dimitre Novatchev Apr 18 '18 at 15:59
  • Not necessarily: it's possible you might have simply mistyped the XPath 2.0 expression, and it's failing for that reason! It's always worth ruling out user error before asserting that it's the XSLT processor that's at fault. If the expression is copied verbatim from this answer, however, it's reasonable to assume that's not the case. – Flynn1179 Apr 23 '18 at 08:31
  • @Flynn1179 Yes, thanks for the additional clarification. In my comment, by "if the evaluation of the XPath 2.0 expression fails" I meant the expression in the answer. We can paraphrase this: if the evaluation of a *syntactically valid* XPath 2.0 expression fails, then you know that the XPath processor you are using doesn't support XPath 2.0 – Dimitre Novatchev Apr 23 '18 at 14:38

If you're using XPath 2.0 you can just use /html/body/table/tr/td/normalize-space(.).

If you're stuck with XPath 1.0, I don't believe this is possible. You'll just have to loop over the resulting strings and normalize them.
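A minimal sketch of that loop, assuming Python and its standard-library xml.etree.ElementTree (which implements only a small XPath subset, so the td elements are selected and their text is read in Python rather than through text()). The names `DOC` and `trimmed_cells` are made up for this example.

```python
import xml.etree.ElementTree as ET

# The document from the question.
DOC = """<html><body>
<table>
<tr><td>   Item 1</td></tr>
<tr><td>  Item 2</td></tr>
</table>
</body></html>"""


def trimmed_cells(xml_text):
    """Select every td and normalize its text in the host language."""
    root = ET.fromstring(xml_text)  # root is the <html> element
    cells = root.findall("./body/table/tr/td")
    # " ".join(s.split()) trims both ends and collapses inner whitespace runs,
    # much like normalize-space(). Note that str.split() also splits on other
    # Unicode whitespace, which is slightly broader than XPath's definition.
    return [" ".join((td.text or "").split()) for td in cells]


print(trimmed_cells(DOC))
```

Other XPath 1.0 hosts follow the same pattern: evaluate the path to get the nodes, then normalize each string with whatever the host language provides.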

porges