Suppose you have this XML document:
<html>
<div id="test">some<b>bold</b> or <i>italic</i> text</div>
<div id="test">and again<b> bold text</b><i>and italic text</i></div>
</html>
Then just use:
string(/*/div[1])
The result of evaluating this XPath expression is:
somebold or italic text
Similarly:
string(/*/div[2])
when evaluated produces:
and again bold textand italic text
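To see what string() computes here, the following is a minimal Python sketch rather than a definitive implementation. It assumes the standard-library xml.etree.ElementTree, which supports only a subset of XPath and cannot evaluate string() itself, so "".join(itertext()) stands in for it. The helper name string_value is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<html>
<div id="test">some<b>bold</b> or <i>italic</i> text</div>
<div id="test">and again<b> bold text</b><i>and italic text</i></div>
</html>"""

def string_value(element):
    # Concatenate all descendant text nodes in document order,
    # which is how XPath defines the string value of an element.
    return "".join(element.itertext())

root = ET.fromstring(DOC)
divs = root.findall("div")  # div children of the root, like /*/div

first = string_value(divs[0])   # stands in for string(/*/div[1])
second = string_value(divs[1])  # stands in for string(/*/div[2])
print(first)
print(second)
```

With a library that implements full XPath 1.0, the two expressions above could be evaluated directly instead.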
If you want the text nodes separated by a space, this cannot be done with a single XPath 1.0 expression (it can be done with a single XPath 2.0 expression). Instead, evaluate:
/*/div[1]//text()
This selects (in a list or array structure, depending on your programming language) all text node descendants of /*/div[1]:
"some" "bold" " or " "italic" " text"
Similarly:
/*/div[2]//text()
selects (in a list or array structure, depending on your programming language) all text node descendants of /*/div[2]:
"and again" " bold text" "and italic text"
Now, using your programming language, concatenate these with a space between them to produce the final result.
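The concatenation step might look like the sketch below, again assuming Python's standard-library xml.etree.ElementTree, with list(itertext()) standing in for the //text() node list. The helper names text_nodes and join_spaced are hypothetical. The strip option is an extra assumption, not part of the approach above: it trims each node first so that nodes which already carry spaces do not produce doubled spaces.

```python
import xml.etree.ElementTree as ET

DOC = """<html>
<div id="test">some<b>bold</b> or <i>italic</i> text</div>
<div id="test">and again<b> bold text</b><i>and italic text</i></div>
</html>"""

def text_nodes(element):
    # Descendant text nodes in document order, like element//text().
    return list(element.itertext())

def join_spaced(nodes, strip=False):
    # Join the text nodes with one space between them.
    # With strip=True (an assumption), trim each node first and
    # drop nodes that are empty after trimming.
    if strip:
        nodes = [n.strip() for n in nodes if n.strip()]
    return " ".join(nodes)

root = ET.fromstring(DOC)
div1, div2 = root.findall("div")

nodes1 = text_nodes(div1)
nodes2 = text_nodes(div2)
raw1 = join_spaced(nodes1)
clean1 = join_spaced(nodes1, strip=True)
clean2 = join_spaced(nodes2, strip=True)
print(clean1)
print(clean2)
```

With an XPath 2.0 processor, the plain join should be expressible in one step as string-join(/*/div[1]//text(), ' ').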