In the XML example below, using a Java parser, how can I keep the content of the <parameter> element whose <parameterid> is center and remove everything else? There may be other <parameter> elements with a different <parameterid>, and those have to be discarded.

<xml>
    <A> 
        <B>
        .
        .
        .

            <parameter>
                <parameterid>center</parameterid>
                <name>Center</name>
                <keyframe>
                    <when>1</when>
                    <value>
                        <horiz>100</horiz>
                        <vert>100</vert>
                    </value>
                </keyframe>
                <keyframe>
                    <when>2</when>
                    <value>
                        <horiz>150</horiz>
                        <vert>150</vert>
                    </value>
                </keyframe>
            </parameter>
            <parameter>
                ...
            </parameter>
            <parameter>
                ...
            </parameter>
        .
        .
        .
        </B>
    </A>
</xml>

So the output will look like:

<parameter>
    <parameterid>center</parameterid>
    <name>Center</name>
    <keyframe>
        <when>1</when>
        <value>
            <horiz>100</horiz>
            <vert>100</vert>
        </value>
    </keyframe>
    <keyframe>
        <when>2</when>
        <value>
            <horiz>150</horiz>
            <vert>150</vert>
        </value>
    </keyframe>
</parameter>

Please advise. Thanks!

Cata Lin

2 Answers

You can use a Java regular expression to strip the unneeded content and then parse only the part you need, for example:

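// Requires java.util.regex.Pattern and java.util.regex.Matcher.
String sourceXML = readFileToString("source.xml"); // placeholder helper that loads the file
// Reluctant match from the <parameter> whose id is "center" up to its closing tag.
final Pattern pattern = Pattern.compile(
        "<parameter>\\s*<parameterid>center</parameterid>.*?</parameter>", Pattern.DOTALL);
Matcher matcher = pattern.matcher(sourceXML);
if (matcher.find()) {
    String xmlToParse = matcher.group(); // the matched <parameter> element only
    someDomOrSaxParser.parseFromString(xmlToParse); // placeholder for your DOM or SAX parser
} else {
    System.out.println("NO MATCH");
}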
Alexey Sviridov

This would be a good job for an XSLT stylesheet.

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="node()|@*">
    <xsl:apply-templates select="node()|@*"/>
  </xsl:template>

  <xsl:template match="parameter[parameterid='center']">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>

applied to the input in the question, produces the following output:

<parameter>
   <parameterid>center</parameterid>
   <name>Center</name>
   <keyframe>
      <when>1</when>
      <value>
         <horiz>100</horiz>
         <vert>100</vert>
      </value>
   </keyframe>
   <keyframe>
      <when>2</when>
      <value>
         <horiz>150</horiz>
         <vert>150</vert>
      </value>
   </keyframe>
</parameter>

If you have any questions on using XSLT in Java, please take a look at this question.
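
For reference, here is a minimal sketch of applying the stylesheet above from Java with the JDK's built-in javax.xml.transform API. The class name CenterParameterFilter and the method keepCenter are made up for illustration, the stylesheet is inlined as a string only to keep the example self-contained, and the XML declaration is suppressed so the result resembles the output shown above. Exact indentation may differ between XSLT processors.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; the name is not from the original answer.
class CenterParameterFilter {

    // The stylesheet from this answer, inlined as a string for the sketch.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output indent=\"yes\"/>"
        + "<xsl:strip-space elements=\"*\"/>"
        + "<xsl:template match=\"node()|@*\">"
        + "<xsl:apply-templates select=\"node()|@*\"/>"
        + "</xsl:template>"
        + "<xsl:template match=\"parameter[parameterid='center']\">"
        + "<xsl:copy-of select=\".\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet over the given XML and returns the transformed text.
    static String keepCenter(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            // Leave out the <?xml ...?> header so only the <parameter> element is returned.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<xml><A><B>"
                + "<parameter><parameterid>center</parameterid><name>Center</name></parameter>"
                + "<parameter><parameterid>other</parameterid></parameter>"
                + "</B></A></xml>";
        System.out.println(keepCenter(xml));
    }
}
```

In a real project the stylesheet and the input would more likely be read from files, for example by passing java.io.File objects to StreamSource instead of StringReader.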

Daniel Haley
  • This works great. One more thing: I want to cut out the excess words, so the output would need a further tweak to look like: " name = center , frame = 1 , horiz = 100, vert = 100 frame = 2, horiz = 150, vert = 150 " – Cata Lin Oct 24 '11 at 06:21
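
One possible way to get the flattened text asked for in the comment above is to skip XSLT and read the values with the JDK's XPath API. This is only a sketch under assumptions: the class CenterParameterSummary and the method summarize are hypothetical names, "name" is taken from <parameterid> (the sample shows lowercase "center"), and the separators are a normalized guess at the format in the comment.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of either answer.
class CenterParameterSummary {

    // Builds "name = ..., frame = ..., horiz = ..., vert = ..." for the center parameter.
    static String summarize(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Same selection as the stylesheet: the <parameter> whose <parameterid> is "center".
            Node parameter = (Node) xpath.evaluate(
                    "//parameter[parameterid='center']",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODE);
            if (parameter == null) {
                return ""; // no matching <parameter> in the input
            }
            StringBuilder sb = new StringBuilder("name = ")
                    .append(xpath.evaluate("parameterid", parameter));
            NodeList keyframes =
                    (NodeList) xpath.evaluate("keyframe", parameter, XPathConstants.NODESET);
            for (int i = 0; i < keyframes.getLength(); i++) {
                Node keyframe = keyframes.item(i);
                // Assumed layout: a comma before the first frame, a space between later frames.
                sb.append(i == 0 ? ", " : " ")
                  .append("frame = ").append(xpath.evaluate("when", keyframe))
                  .append(", horiz = ").append(xpath.evaluate("value/horiz", keyframe))
                  .append(", vert = ").append(xpath.evaluate("value/vert", keyframe));
            }
            return sb.toString();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<xml><A><B><parameter><parameterid>center</parameterid><name>Center</name>"
                + "<keyframe><when>1</when><value><horiz>100</horiz><vert>100</vert></value></keyframe>"
                + "<keyframe><when>2</when><value><horiz>150</horiz><vert>150</vert></value></keyframe>"
                + "</parameter></B></A></xml>";
        System.out.println(summarize(xml));
    }
}
```

The same result could also be produced by a stylesheet with text output; the XPath version is used here only because it keeps the formatting logic in plain Java.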