5

I have some XML (in a file, but it could also be a string) which I need to parse, e.g.:

var xmlDocument = new XmlDocument();
xmlDocument.LoadXml(xmlText);

Given the following XML:

<foo>
    <cat>...</cat>
    <cat>...</cat>
    <dog>...</dog>
    <cat>...</cat>
    <dog>...</dog>
</foo>

I'm not sure how to extract all the cat and dog elements and put them into output like the following:

<foo>
    <cat>...</cat>
    <cat>...</cat>
    ....
</foo>

and the same with dogs.

What's the trick to extracting those nodes and putting them into separate XmlDocuments?

abatishchev
Pure.Krome

3 Answers

8

Use LINQ to XML, as it has a much nicer API.

var doc = XElement.Parse(
@"<foo>
    <cat>...</cat>
    <cat>...</cat>
    <dog>...</dog>
    <cat>...</cat>
    <dog>...</dog>
</foo>");
doc.Descendants("dog").Remove();

doc now contains this:

<foo>
    <cat>...</cat>
    <cat>...</cat>
    <cat>...</cat>
</foo>

Edit:

While LINQ to XML itself provides a nice API for working with XML, the power of LINQ and its projection capabilities lets you shape your data as you see fit.

Consider this, for example. Here the descendant elements are grouped by name, and each group is projected into a new root element which is then wrapped in an XDocument. Note that this creates an enumerable of XDocument, one per element name.

var docs =
    from d in doc.Descendants()
    group d by d.Name into g
    select new XDocument(
        new XElement("root", g)
    );

docs now contains:

<root>
    <cat>...</cat>
    <cat>...</cat>
    <cat>...</cat>
</root>
---
<root>
    <dog>...</dog>
    <dog>...</dog>
</root> 

Oh, by the way: the Descendants method goes through all descendant elements at any depth. Use Elements if you only want the immediate child elements.
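To extract just one kind of element rather than removing the others, a minimal sketch along the same lines might look like this. The `ElementExtractor` class and its `Extract` method are hypothetical names used only for illustration, and the sketch assumes the wanted elements are immediate children of the root.

```csharp
using System;
using System.Xml.Linq;

static class ElementExtractor
{
    // Hypothetical helper: builds a new document whose root has the same
    // name as the source root and contains only the immediate children
    // with the given name.
    public static XDocument Extract(XElement root, string name)
    {
        // Elements that already have a parent are cloned when added to
        // the new tree, so the source element is left untouched.
        return new XDocument(new XElement(root.Name, root.Elements(name)));
    }
}
```

Calling `ElementExtractor.Extract(doc, "cat")` and then `ElementExtractor.Extract(doc, "dog")` should give one document per element name without modifying `doc`.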

Here are the Linq to XML docs on MSDN

Mikael Östberg
  • Could you use LINQ and flip it, so it only EXTRACTS the cat (or dog) elements? (I have multiple element types to extract) – Pure.Krome Oct 01 '13 at 07:06
  • 1
    @Pure.Krome some experimentation on your part would answer that question faster than you typed it. – Gusdor Oct 01 '13 at 07:09
2

The easiest way is to use XSLT and apply it to your XmlDocument. That way you don't modify your source, and you can produce as many outputs as you need.

The code for applying the transform is:

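    // doc is assumed to be the XmlDocument loaded from the source XML
    XslCompiledTransform xslTransform = new XslCompiledTransform();
    StringWriter writer = new StringWriter();
    xslTransform.Load("cat.xslt");
    // No XSLT arguments are needed here, so pass null
    xslTransform.Transform(doc.CreateNavigator(), null, writer);
    return writer.ToString();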

And the simple cat.xslt is:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="foo">
        <xsl:copy>
            <xsl:copy-of select = "cat" />
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
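If one stylesheet per element name becomes tedious, the element name could be passed in as an XSLT parameter instead. The following is only a sketch under that assumption: the `ElementFilter` class, the `elementName` parameter, and the inline stylesheet are illustrative and not part of the original answer.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class ElementFilter
{
    // Illustrative stylesheet: copies the root element and only those
    // children whose name matches the $elementName parameter.
    private const string Stylesheet = @"<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' indent='yes'/>
  <xsl:param name='elementName'/>
  <xsl:template match='/*'>
    <xsl:copy>
      <xsl:copy-of select='*[name() = $elementName]'/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>";

    // Hypothetical helper: returns the filtered XML as a string.
    public static string Extract(XmlDocument doc, string elementName)
    {
        var transform = new XslCompiledTransform();
        using (var reader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(reader);
        }

        // Pass the element name to the stylesheet as a global parameter.
        var args = new XsltArgumentList();
        args.AddParam("elementName", "", elementName);

        using (var writer = new StringWriter())
        {
            transform.Transform(doc, args, writer);
            return writer.ToString();
        }
    }
}
```

With this, `ElementFilter.Extract(doc, "cat")` and `ElementFilter.Extract(doc, "dog")` reuse the same stylesheet and leave the source document unchanged.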
Piotr Stapp
1

Since you are using XmlDocument: Load it twice from the same file and remove the unwanted nodes. Here is a link that shows you how: Removing nodes from an XmlDocument.

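var xmlDocument = new XmlDocument();
xmlDocument.LoadXml(xmlText);
XmlNode root = xmlDocument.DocumentElement;
XmlNodeList nodeList = root.SelectNodes("//cat");

// Reading Count evaluates the whole list up front; walking it backwards
// keeps removals from disturbing the remaining entries.
for (int i = nodeList.Count - 1; i >= 0; i--)
{
  XmlNode node = nodeList[i];
  // "//cat" matches at any depth, so remove each node from its own parent.
  node.ParentNode.RemoveChild(node);
}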
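As an alternative to loading twice and deleting, the wanted nodes could be copied into a fresh XmlDocument. This is a minimal sketch, not the approach described above: `XmlSplitter` and `ExtractChildren` are hypothetical names, and it assumes the wanted elements are immediate children of a root element that has no namespace.

```csharp
using System;
using System.Xml;

static class XmlSplitter
{
    // Hypothetical helper: creates a new document with a root of the same
    // name and deep copies of the matching child elements.
    public static XmlDocument ExtractChildren(XmlDocument source, string elementName)
    {
        var result = new XmlDocument();
        XmlNode newRoot = result.AppendChild(
            result.CreateElement(source.DocumentElement.Name));

        // A relative XPath such as "cat" selects immediate children only.
        foreach (XmlNode node in source.DocumentElement.SelectNodes(elementName))
        {
            // Nodes belong to the document that created them, so they
            // must be imported before they can be appended elsewhere.
            newRoot.AppendChild(result.ImportNode(node, true));
        }
        return result;
    }
}
```

Calling `XmlSplitter.ExtractChildren(xmlDocument, "cat")` and `XmlSplitter.ExtractChildren(xmlDocument, "dog")` should yield two documents from a single load, with the source left intact.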
Community
meilke