Is this XPath technique reliable in all situations?

Question

I am developing an application that accepts user-defined XPath expressions and employs them as part of its runtime operation.

However, I would like to be able to infer some additional data by programmatically manipulating the expression, and I am curious to know whether there are any situations in which this approach might fail.

Given any user-defined XPath expression that returns a node set, is it safe to wrap it in the XPath count() function to determine the number of nodes in the set:

count(user_defined_expression)

Similarly, is it safe to append an array index to the expression to extract one of the nodes in the set:

user_defined_expression[1]

score 3 · Accepted Answer · answered Feb 15 '11 at 12:06


Well, an XPath expression (in XPath 1.0) can yield a node-set, a string, a number, or a boolean, and count(expression) only makes sense for an expression that yields a node-set; applied to any of the other three types it is an error.
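A minimal sketch of that type check in Python. It assumes, purely for illustration, that your XPath evaluator hands results back as plain Python values: a list for a node-set, and str, float, or bool for the other three types. The helper name count_nodes is hypothetical, not part of any library.

```python
def count_nodes(result):
    """Return the size of a node-set result, or raise for any other type.

    Assumption: node-sets arrive as Python lists; strings, numbers and
    booleans arrive as str, float and bool respectively.
    """
    if isinstance(result, list):
        return len(result)
    raise TypeError(
        f"count() needs a node-set, got {type(result).__name__}"
    )
```

Checking the result type up front lets the application report a clear error for an expression such as string(/root/foo) instead of failing inside count().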

As for adding a positional predicate, I think you will want to put parentheses around your expression, i.e. change /root/foo/bar into (/root/foo/bar)[1]. That way you select the first bar element in the node-set selected by /root/foo/bar. Without the parentheses you would get /root/foo/bar[1], which selects the first bar child element of each foo child element of the root element, so it can return more than one node.
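The difference can be seen in a small Python sketch using the standard library's xml.etree.ElementTree. ElementTree supports only a subset of XPath (no count() and no parenthesized expressions), so the parenthesized form is emulated here by evaluating the whole path and taking the first item of the resulting list. The sample document and the helper names wrap_first and wrap_count are made up for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
  <foo><bar>a</bar><bar>b</bar></foo>
  <foo><bar>c</bar><bar>d</bar></foo>
</root>"""

root = ET.fromstring(DOC)

# Equivalent of /root/foo/bar[1]: the predicate belongs to the last step,
# so it picks the first bar child of EACH foo element.
per_parent = [e.text for e in root.findall("foo/bar[1]")]

# Equivalent of (/root/foo/bar)[1]: evaluate the whole path first,
# then take the first node of the combined node-set.
whole_set = root.findall("foo/bar")
first_overall = whole_set[0].text


def wrap_first(expr):
    """Build an expression selecting the first node of expr's node-set."""
    return f"({expr})[1]"


def wrap_count(expr):
    """Build an expression counting the nodes selected by expr."""
    return f"count({expr})"


print(per_parent)                    # ['a', 'c']
print(first_overall)                 # a
print(wrap_first("/root/foo/bar"))   # (/root/foo/bar)[1]
```

Always adding the parentheses also keeps the wrapper correct for union expressions: wrap_first("//a | //b") yields (//a | //b)[1], whereas a bare appended [1] would attach only to the //b operand.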

answered Feb 15 '11 at 12:06

Martin Honnen


Glad to see that I raise the same possible problems :) – Flack Feb 15 '11 at 12:12
thanks, that was exactly the kind of unexpected complication that I was trying to avoid. – Tim Coulter Feb 15 '11 at 17:13
I'm glad to come across this comment: I was trying to figure out how to use position()=N or [N] as a way to find the Nth *match* as opposed to the Nth item among its in-context DOM siblings, and somehow I tried everything except parentheses. – Darien Feb 22 '11 at 23:47

score 2 · Answer 2 · edited May 23 '17 at 12:20


Are you checking that such user-defined expressions always evaluate to a node-set?

If yes, the first expression is OK: the data type will be correct for fn:count.

The second one is a lot trickier. There are many situations where the predicate binds more tightly than the axis step, for example. Check this answer for a simple analysis. It will be difficult to say what the user really meant.


answered Feb 15 '11 at 11:59

Flack


thanks also. Your answer is very close to the answer I accepted (and it's a pity I can't accept both), but Martin's approach is likely to cover the scenarios that my app will face. – Tim Coulter Feb 15 '11 at 17:17

score 1 · Answer 3 · answered Feb 15 '11 at 15:37

A more robust approach would be to convert the XPath expression to XQueryX, which is an XML representation of the abstract syntax tree; you can then do XQuery or XSLT transformations on this XML representation, and then convert back to a modified XPath (or XQuery) for evaluation.

However, this will still only give you the syntactic structure of the expression; if you want semantic information, such as the inferred static type of the result, you will probably have to poke inside an XPath processor that exposes this information.

+1 Parsing the expression will yield more information. A small addition: a standard DOM Level 3 XPath implementation also provides the tools to find out, **after the evaluation**, the result data type, among other things (e.g. node-set length). – Feb 15 '11 at 23:44
