
I'm looking for an XPath expression that filters out certain children. A child should be kept only if it contains a CCC node whose value is B.

Source:

<AAA>
    <BBB1>
        <CCC>A</CCC>
    </BBB1>       
    <BBB2>
        <CCC>A</CCC>
    </BBB2>
    <BBB3>
        <CCC>B</CCC>
    </BBB3>
    <BBB4>
        <CCC>B</CCC>
    </BBB4>
</AAA>

This should be the result:


<AAA>
    <BBB3>
        <CCC>B</CCC>
    </BBB3>
    <BBB4>
        <CCC>B</CCC>
    </BBB4>
</AAA>

Hopefully someone can help me.

Jos


4 Answers

XPath is a query language for XML documents. As such it can only select nodes from existing XML document(s) -- it cannot modify an XML document or create a new XML document.

Use XSLT in order to transform an XML document and create a new XML document from it.

In this particular case:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/*/*[not(CCC = 'B')]"/>
</xsl:stylesheet>

when this transformation is applied on the provided XML document:

<AAA>
    <BBB1>
        <CCC>A</CCC>
    </BBB1>
    <BBB2>
        <CCC>A</CCC>
    </BBB2>
    <BBB3>
        <CCC>B</CCC>
    </BBB3>
    <BBB4>
        <CCC>B</CCC>
    </BBB4>
</AAA>

the wanted, correct result is produced:

<AAA>
   <BBB3>
      <CCC>B</CCC>
   </BBB3>
   <BBB4>
      <CCC>B</CCC>
   </BBB4>
</AAA>
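
For readers without an XSLT processor at hand, the same "copy everything except the children that lack a CCC equal to B" idea can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is only an illustration and not part of the stylesheet above: the helper name `keep_children_with_b` is made up for this example, and it compares the `.text` of each CCC, which matches the XPath test `CCC = 'B'` only when CCC holds plain text as in the sample document.

```python
import xml.etree.ElementTree as ET

SOURCE = """<AAA>
    <BBB1><CCC>A</CCC></BBB1>
    <BBB2><CCC>A</CCC></BBB2>
    <BBB3><CCC>B</CCC></BBB3>
    <BBB4><CCC>B</CCC></BBB4>
</AAA>"""


def keep_children_with_b(xml_text):
    """Parse xml_text and drop every child of the root that has no CCC child with text 'B'."""
    root = ET.fromstring(xml_text)
    # Loop over a copy of the child list, because root is modified while looping.
    for child in list(root):
        if not any(ccc.text == "B" for ccc in child.findall("CCC")):
            root.remove(child)
    return root


result = keep_children_with_b(SOURCE)
print([child.tag for child in result])  # ['BBB3', 'BBB4']
```

As with the XSLT, the selection itself does not change the document; the new tree comes from the explicit removal step.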
Dimitre Novatchev

To select all of the desired element and text nodes, use this XPath:

//node()[.//CCC[.='B']
      or self::CCC[.='B']
      or self::text()[parent::CCC[.='B']]]

This could be achieved more simply by using XPath in a modified identity transform in XSLT:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" />

    <!--Empty template for the content we want to redact -->
    <xsl:template match="*[CCC[not(.='B')]]" />

    <!--By default, copy all content forward -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
Mads Hansen
  • +1 For XSLT solution. I would use `/*/*[not(CCC = 'B')]` as pattern. Meaning: _any element child of root element and not having a `CCC` element with `'B'` as string value_. –  Mar 03 '11 at 16:53

Try this:

         "//CCC[text() = 'B']"

It returns all CCC nodes whose text is B. Note that it selects the CCC elements themselves, not their BBB parents.
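
A quick way to check what this expression selects is Python's `xml.etree.ElementTree`. This is a rough sketch under some assumptions: ElementTree implements only a subset of XPath and has no `text()` test, so the closest supported form, `.//CCC[.='B']` (available from Python 3.7), is used instead of the exact expression above. The shortened sample document is also just for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<AAA>"
    "<BBB1><CCC>A</CCC></BBB1>"
    "<BBB3><CCC>B</CCC></BBB3>"
    "<BBB4><CCC>B</CCC></BBB4>"
    "</AAA>"
)

# ElementTree's nearest equivalent of //CCC[text() = 'B'].
ccc_nodes = root.findall(".//CCC[.='B']")

print([(e.tag, e.text) for e in ccc_nodes])  # [('CCC', 'B'), ('CCC', 'B')]
```

The result contains only CCC elements, so the enclosing BBB3 and BBB4 would still have to be reached from them.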

Furqan Hameedi

If you want to get AAA, BBB3 and BBB4, you can use the following:

//*[descendant::CCC[text()='B']]

If you want only BBB3 and BBB4, then use:

//*[CCC[text()='B']]
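
The difference between the two expressions (descendant versus direct child) can be illustrated with a small Python sketch. ElementTree does not support the `descendant::` axis or `text()` in predicates, so both tests are emulated here with plain Python; the helper names `has_b_descendant` and `has_b_child` are invented for this example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<AAA>
    <BBB1><CCC>A</CCC></BBB1>
    <BBB2><CCC>A</CCC></BBB2>
    <BBB3><CCC>B</CCC></BBB3>
    <BBB4><CCC>B</CCC></BBB4>
</AAA>""")


def has_b_descendant(elem):
    # Emulates [descendant::CCC[text()='B']]: a CCC anywhere below elem, excluding elem itself.
    return any(c.text == "B" for c in elem.iter("CCC") if c is not elem)


def has_b_child(elem):
    # Emulates [CCC[text()='B']]: a CCC that is a direct child of elem.
    return any(c.text == "B" for c in elem.findall("CCC"))


# root.iter() walks the root and all of its descendants in document order, like //*.
with_descendant = [e.tag for e in root.iter() if has_b_descendant(e)]
with_child = [e.tag for e in root.iter() if has_b_child(e)]

print(with_descendant)  # ['AAA', 'BBB3', 'BBB4']
print(with_child)       # ['BBB3', 'BBB4']
```

AAA appears only in the first list because its matching CCC elements are grandchildren, not children.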
alpha-mouse