
I was writing an XPath expression, and I had a strange error which I fixed, but what is the difference between the following two XPath expressions?

"//td[starts-with(normalize-space(),'Posted Date:')]"

and

"//td[starts-with(normalize-space(text()),'Posted Date:')]"  

Mainly, what will the first XPath expression match? I was getting a lot of strange results. What difference does text() make to the matching? Also, is there a difference between normalize-space() and normalize-space(.)?

Peter Mortensen
Karim
  • 1
    From my own testing `normalize-space()` and `normalize-space(.)` have the same effect. – CJ7 Jan 03 '17 at 22:32

1 Answer


Well, the real question is: what's the difference between . and text()?

. is the current node. If you use it where a string is expected (e.g. as the parameter of normalize-space()), the engine automatically converts the node to its string value, which for an element is the concatenation of all the text nodes inside the element, at any depth. (I'm guessing the question is really about elements.)

text() on the other hand only selects text nodes that are the direct children of the current node.

So for example given the XML:

<a>Foo
    <b>Bar</b>
  lish
</a>

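and assuming <a> is your current node, normalize-space(.) will return Foo Bar lish. normalize-space(text()) depends on the XPath version, because text() returns a nodeset of two text nodes (Foo and lish). XPath 2.0 raises a type error, since normalize-space() accepts at most one item there. XPath 1.0 silently converts the nodeset to the string value of its first node and returns Foo.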
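The behaviour on this sample can be checked with a small Java sketch that uses the JDK's built-in javax.xml.xpath API, which implements XPath 1.0 only. The class name NormalizeSpaceDemo and the eval helper are made up for illustration. Under XPath 1.0 a nodeset passed where a string is expected is reduced to the string value of its first node, so the second call should return Foo rather than raise an error; an XPath 2.0 processor would report a type error instead.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NormalizeSpaceDemo {
    // The sample document from the answer, with the same line breaks and indentation.
    static final String XML = "<a>Foo\n    <b>Bar</b>\n  lish\n</a>";

    // Hypothetical helper: evaluates an XPath 1.0 expression as a string,
    // using the <a> element as the context node.
    static String eval(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node a = (Node) xpath.evaluate(
                "/a", new InputSource(new StringReader(XML)), XPathConstants.NODE);
            return xpath.evaluate(expression, a);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval("normalize-space(.)"));         // Foo Bar lish
        System.out.println(eval("normalize-space(text())"));    // Foo (first text node only)
        System.out.println(eval("normalize-space(text()[2])")); // lish
    }
}
```

Selecting one text node explicitly, as in text()[2], gives the same result under both XPath versions because the argument is then a single node.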

To cut a long story short, if you want to normalize all the text within an element, use . (the current node). If you want to select a specific text node, use text(), but always remember that despite its name, text() returns a nodeset. XPath 2.0 converts it to a string automatically only if it contains a single node; XPath 1.0 uses the first node and ignores the rest.
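Applied to the two expressions from the question, the difference can be seen on a few invented table cells. This is only a sketch: the markup, the dates, the id attributes, and the names PostedDateDemo and matchingIds are all made up for illustration, and the JDK's XPath 1.0 engine is assumed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PostedDateDemo {
    // Invented sample cells; the id attributes exist only to label the results.
    static final String XML =
        "<table><tr>"
        + "<td id='plain'>  Posted Date: 2011-08-25</td>"
        + "<td id='nested'><b>Posted Date:</b> 2011-08-25</td>"
        + "<td id='late'><i>Note</i> Posted Date: 2011-08-25</td>"
        + "</tr></table>";

    // Hypothetical helper: returns the id of every element selected
    // by an XPath 1.0 expression, in the order the engine returns them.
    static List<String> matchingIds(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(((Element) nodes.item(i)).getAttribute("id"));
            }
            return ids;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Tests the whole text of each cell, including text inside child elements.
        System.out.println(matchingIds("//td[starts-with(normalize-space(), 'Posted Date:')]"));
        // [plain, nested]

        // Tests only the first text node that is a direct child of each cell.
        System.out.println(matchingIds("//td[starts-with(normalize-space(text()), 'Posted Date:')]"));
        // [plain, late]
    }
}
```

The cell labelled nested matches only the first expression, because its label sits inside a child element. The cell labelled late matches only the second, because its first direct text node happens to start with the label even though the cell as a whole does not.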

biziclop
  • 1
    Actually `normalize-space(text())` will return an empty string, because it takes the text in the root. `normalize-space(//text())` will return _Foo_, because it transforms your NodeSet by taking the first node and converting it to a String and running `normalize-space` on that. – Matthijs Bierman Aug 25 '11 at 14:41
  • @Matthijs Bierman Have you tried it? I have and it works exactly as I said. (In XPath 2.0, I should add, and assuming that the context node is the `<a>` element.) – biziclop Aug 25 '11 at 14:55
  • Yes, I have (I wasn't sure). But I tried in XPath 1.0. Standard JAXP, but with Xerces 2.11.0 :). – Matthijs Bierman Aug 29 '11 at 13:59
  • @Matthijs Bierman Weird, I'll experiment a bit with that. I know that my answer definitely was for XPath 2.0 only. – biziclop Aug 29 '11 at 20:36
  • How should I write this? I tried `//text()/normalize-space(.)`, but my version is wrong! – Arup Rakshit Aug 31 '13 at 19:12
  • 7
    try text()[normalize-space()] – mgibas Apr 09 '14 at 05:45
  • 1
    @MatthijsBierman where did you try? which website? I am asking so that we can try it hands on for better understanding. – paul Oct 02 '15 at 09:27
  • 1
    I ended up with something like this normalize-space(s:td[2]/s:a[1]/text()[1]) And that worked like a charm – slott Aug 02 '16 at 17:09