
Could you please help me to find all the elements b which have the child element c in the example below?

<a>
    <b name = "b1"></b>
    <b name = "b2"><c/></b>
    <b name = "b3"></b>
</a>

The XPath query must return the b2 element.

My second question: I want to combine two conditions, selecting the element that has name = "b2" and also has a child element c. But this syntax does not seem to work: //b[@name='b2' and c]

nam
  • What exactly does "seems not to work" mean? Please ask a new, separate question and provide a complete (as small as possible) source XML document, the XPath expression used, the wanted result, and the actual result you got. With the current XML document the XPath expression `//b[@name='b2' and c]` selects the second child of `a` -- exactly as it should. – Dimitre Novatchev Jun 11 '12 at 14:25

2 Answers

70

Whenever the structure of the XML document is known, it is better to avoid the // XPath pseudo-operator, because it can be very inefficient (it may cause a traversal of the whole document tree).

Therefore, I recommend this XPath expression for the provided XML document:

/*/b[c]

This selects any b element that is a child of the top element of the XML document and that has a child-element named c.
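For readers working in Python, as the asker mentions in the comments, the same child-only selection can be sketched with the standard-library `xml.etree.ElementTree` module. This is a minimal illustration rather than part of the original answer: ElementTree implements only a subset of XPath and does not accept absolute paths such as `/*/b[c]` on an element, so the sketch evaluates the relative path `b[c]` against the root element instead. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# The XML document from the question.
XML = """
<a>
    <b name = "b1"></b>
    <b name = "b2"><c/></b>
    <b name = "b3"></b>
</a>
"""

root = ET.fromstring(XML)  # root is the top element <a>

# "b[c]" is relative to root: b children of <a> that have a c child.
# Only the direct children of root are inspected, not the whole tree.
matches = root.findall("b[c]")
names = [b.get("name") for b in matches]
print(names)
```

Because the path starts at the known parent, only the children of `a` are examined, which is the point made above about avoiding `//`.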

UPDATE: The OP asked a second question just minutes ago:

My second question: I want to combine two conditions, selecting the element that has name = "b2" and also has a child element c. But this syntax does not seem to work: //b[@name='b2' and c]

The provided XPath expression does select exactly the wanted element.

Here is an XSLT-based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="/*">
     <xsl:copy-of select="//b[@name='b2' and c]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the provided XML document:

<a>
    <b name = "b1"></b>
    <b name = "b2"><c/></b>
    <b name = "b3"></b>
</a>

the XPath expression is evaluated and the correctly-selected element is copied to the output:

<b name="b2">
   <c/>
</b>
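The expression is valid XPath 1.0, so a failure usually points at the evaluating library rather than at the expression. As one hedged illustration (an assumption about the asker's environment, based on the Python usage reported in the comments): the standard-library `xml.etree.ElementTree` supports only a limited XPath subset, and `and` inside a predicate is not part of it. Chaining two predicates expresses the same condition within that subset. The variable names below are illustrative.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<a>
    <b name = "b1"></b>
    <b name = "b2"><c/></b>
    <b name = "b3"></b>
</a>
""")

# ElementTree's XPath subset has no "and" operator inside predicates,
# so this path is rejected with a SyntaxError.
try:
    root.findall("b[@name='b2' and c]")
    and_supported = True
except SyntaxError:
    and_supported = False

# Chained predicates are applied one after another, so an element
# must satisfy both: the name attribute test and having a c child.
combined = [b.get("name") for b in root.findall("b[@name='b2'][c]")]
print(and_supported, combined)
```

In full XPath 1.0, `b[@name='b2'][c]` and `b[@name='b2' and c]` select the same nodes here, because neither predicate depends on the context position.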
cletus
Dimitre Novatchev
  • I use Python to select the element. If I use root.findall("b[c]"), the result is what I want. But if I use root.findall("b[@name='b2' and c]"), I get the error "invalid predicate". Maybe I should open another question for Python? – nam Jun 11 '12 at 14:39
  • @HOAINAMNGUYEN: Yes, opening another question seems the right thing to do -- this seems to be a Python-related issue. – Dimitre Novatchev Jun 11 '12 at 16:14
  • i know that `//some-element-deep-within-the-dom` will search for **every** `//some-element-deep-within-the-dom`, but wouldn't it also be more flexible, less likely to break the script if, say, `//some-element-deep-within-the-dom` is moved somewhere else within the dom??? – oldboy Jul 24 '18 at 03:46
  • @Anthony, If there is a change in the XML document, then it is a good idea to revise all XPath expressions. Measures such as the proposed one are not always safe and sufficient. There is a way to write XPath expressions that will always select what was intended, *regardless* of the change in the document -- simply don't use any names in the expression. – Dimitre Novatchev Jul 24 '18 at 15:31
  • i don't really understand your response. why is it not safe and sufficient? yes, of course i would select elements by their ID, but in many cases elements don't have IDs, so... anyways, there's clearly no foolproof way of always selecting a particular element. for instance, even if you don't select an element by name, if the structure of the document changes, then your selector would break – oldboy Jul 24 '18 at 16:35
  • @Antony, If the structure of the document has been changed, this is no longer the same XML document. One shouldn't be confident at all, without verifying, that pre-existing XPath expressions still select what they were meant to. As the semantics of the XML document changes, so does the semantics of the XPath expressions. This comment is *in general*. There might be specific instances of XML documents that allow restricted class of changes, without any (or significant) change of semantics. Expressions that always do what was meant: `/`, `/*`, `//*[not(*)]`, etc – Dimitre Novatchev Jul 24 '18 at 18:03
28

It should be as simple as

//b[c]

i.e. find a b anywhere that has a c child.
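A small sketch of what "anywhere" means, using Python's standard-library `xml.etree.ElementTree`. Two assumptions to flag: ElementTree needs the relative form `.//b[c]` instead of `//b[c]`, and the nested document below is hypothetical, invented only to show the difference from a child-only search.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the matching b is nested one level deeper
# than in the question's XML.
doc = ET.fromstring(
    "<a><b name='b1'/><x><b name='b2'><c/></b></x></a>"
)

# Child-only search: looks at direct children of <a> and finds nothing.
shallow = [b.get("name") for b in doc.findall("b[c]")]

# Descendant search: looks at every level below <a>.
deep = [b.get("name") for b in doc.findall(".//b[c]")]

print(shallow, deep)
```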

choroba
  • Hello, now I want to combine two conditions: I want to get the element that has name = "b2" and also has a child element c. But this syntax does not seem to work: //b[@name='b2' and c] – nam Jun 11 '12 at 14:15
  • Works for me (using xsh in Perl). – choroba Jun 11 '12 at 20:16