Recently I needed to evaluate an XQuery on a node of an HTML document. Basically, I needed to select all elements with an href attribute from the first child of the body element. Here is a small example to explain:
<html>
<body>
<a href="http://www.google.be"/>
</body>
</html>
In this case, the desired result is obviously:
<a href="http://www.google.be"/>
My first idea was to use //body/*[1]//*[@href] because:
//body matches the body element, wherever it is
/*[1] matches the first child of the body element
//*[@href] matches all descendants or self of the current element that have an href attribute
I figured that would work, but on the example provided the XQuery gives no results.
However, I read up a bit and found the following (source: http://www.keller.com/xslt/8/):
Alternate notation for "//": descendant-or-self::node()
So I changed my XQuery to //body/*[1]/descendant-or-self::node()[@href]
and this time, the results were correct.
My question: what is the difference between // and descendant-or-self::node()? What I found here (What's the difference between //node and /descendant::node in xpath?) and here (http://www.w3.org/TR/xpath/#axes) says:
// is short for /descendant-or-self::node()/. For example, //para is short for /descendant-or-self::node()/child::para.
Which leads me to conclude that // and /descendant-or-self::node() are not interchangeable (probably because of the terminating / at the end?), but then can someone tell me if there is a shorthand for /descendant-or-self::node()?
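For anyone who wants to reproduce the two results, here is a minimal sketch in Python using only the standard library. It is an approximation, not the XQuery engine I actually used: ElementTree supports only a subset of XPath and has no descendant-or-self axis, so I am assuming that iter() (which yields the element itself plus all of its descendants) is a fair stand-in for it. The variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document from the question.
HTML = """<html>
<body>
<a href="http://www.google.be"/>
</body>
</html>"""

root = ET.fromstring(HTML)
body = root.find("body")
first_child = body[0]  # plays the role of //body/*[1]

# Stand-in for //body/*[1]//*[@href].
# Per the quoted rule, "//" expands to /descendant-or-self::node()/ and the
# "*" after it is a separate child step, so the context element itself can
# never match. ElementTree's ".//*" likewise looks only below first_child.
via_double_slash = first_child.findall(".//*[@href]")

# Stand-in for //body/*[1]/descendant-or-self::node()[@href].
# Here the predicate is applied to the axis step directly, so first_child
# is a candidate too. iter() yields first_child and all of its descendants.
via_axis = [el for el in first_child.iter() if "href" in el.attrib]

print(len(via_double_slash))         # 0
print([el.tag for el in via_axis])   # ['a']
```

On the sample document the first query comes back empty and the second returns the a element, which matches what I observed.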