18

I have a huge bunch of XML files with the following structure:

<Stuff1>
  <Content>someContent</Content>
  <type>someType</type>
</Stuff1>
<Stuff2>
  <Content>someContent</Content>
  <type>someType</type>
</Stuff2>
<Stuff3>
  <Content>someContent</Content>
  <type>someType</type>
</Stuff3>
...
...

I need to rename each "Content" node to StuffxContent; that is, prepend the parent node's name to the Content node's name.

I planned to use the XmlDocument class and figure out a way, but thought I would ask whether there is a better way to do this.

Yi Jiang
  • 49,435
  • 16
  • 136
  • 136
sundeep
  • 4,048
  • 7
  • 25
  • 22

8 Answers

62

(1.) The [XmlElement / XmlNode].Name property is read-only.

(2.) The XML structure used in the question is crude and could be improved.

(3.) Regardless, here is a code solution to the given question:

String sampleXml =
  "<doc>"+
    "<Stuff1>"+
      "<Content>someContent</Content>"+
      "<type>someType</type>"+
    "</Stuff1>"+
    "<Stuff2>"+
      "<Content>someContent</Content>"+
      "<type>someType</type>"+
    "</Stuff2>"+
    "<Stuff3>"+
      "<Content>someContent</Content>"+
      "<type>someType</type>"+
    "</Stuff3>"+
  "</doc>";

XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(sampleXml);

XmlNodeList stuffNodeList = xmlDoc.SelectNodes("//*[starts-with(name(), 'Stuff')]");

foreach (XmlNode stuffNode in stuffNodeList)
{
    // get existing 'Content' node; skip elements that have none
    // (the renamed nodes also start with 'Stuff' and may be visited too)
    XmlNode contentNode = stuffNode.SelectSingleNode("Content");
    if (contentNode == null) continue;

    // create new (renamed) Content node: parent name + 'Content'
    XmlNode newNode = xmlDoc.CreateElement(stuffNode.Name + contentNode.Name);

    // copy existing Content children
    newNode.InnerXml = contentNode.InnerXml;

    // replace existing Content node with newly renamed Content node
    stuffNode.InsertBefore(newNode, contentNode);
    stuffNode.RemoveChild(contentNode);
}

//xmlDoc.Save

PS: I came here looking for a nicer way of renaming a node/element; I'm still looking.

juergen d
  • 201,996
  • 37
  • 293
  • 362
Omar
  • 1,493
  • 12
  • 14
  • 4
    It's a shame that someone with 51 rep understands this better than someone with 31k rep. +1 for you, even if it is a slightly more complex solution than I was hoping for. – Chris Jun 17 '11 at 18:06
  • 4
    It does not affect the asker's example, but for completeness your routine should not just copy across InnerXml, it should also copy any attributes: for (int i = contentNode.Attributes.Count - 1; i >= 0; i --) { newNode.Attributes.Prepend((XmlAttribute)contentNode.RemoveAttributeAt(i)); } – Oliver Bock Aug 25 '11 at 07:22
  • I'm guessing this will not work if you try to change the name of the documentElement. Which is what I'm looking for. – nl-x Apr 24 '15 at 08:57
  • If you're looking to rename the document element, something like this might work for you: XmlDocument oldDoc = new XmlDocument(); oldDoc.LoadXml(myOldXmlDoc); string strNewXml= "" + oldDoc.DocumentElement.InnerXml + ""; XmlDocument newDoc= new XmlDocument(); newDoc.LoadXml(strNewXml); – hobnob Aug 11 '15 at 09:59
3

I used this method to rename the node:

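/// <summary>
/// Renames a child element by replacing it with a new element that takes over
/// the old element's attributes and child nodes.
/// </summary>
/// <param name="parentnode">Parent of the element to rename.</param>
/// <param name="oldChildName">Current name of the child element.</param>
/// <param name="newChildName">New name for the child element.</param>
private static void RenameNode(XmlNode parentnode, string oldChildName, string newChildName)
{
    var oldNode = parentnode.SelectSingleNode(oldChildName);
    if (oldNode == null)
        return; // nothing to rename

    var newnode = parentnode.OwnerDocument.CreateNode(XmlNodeType.Element, newChildName, "");

    // Move attributes and children one at a time. Moving them inside a foreach
    // over the old collections would modify those collections mid-enumeration.
    while (oldNode.Attributes.Count > 0)
        newnode.Attributes.Append(oldNode.Attributes.RemoveAt(0));
    while (oldNode.HasChildNodes)
        newnode.AppendChild(oldNode.FirstChild);

    parentnode.ReplaceChild(newnode, oldNode);
}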
Acubo
  • 120
  • 6
2

The easiest way I found to rename a node is:

xmlNode.InnerXml = xmlNode.InnerXml.Replace("OldName>", "NewName>");

Don't include the opening < to ensure that the closing </OldName> tag is renamed as well.

rekire
  • 47,260
  • 30
  • 167
  • 264
Marc
  • 61
  • 1
  • 1
    Probably doesn't work with empty element tags, e.g. ````. But renames nested elements with the same name. And elements with the old name as a suffix, e.g ``..``. :-( – Chaquotay Jan 18 '13 at 13:05
  • 3
    There is not even space here to describe all the things wrong with this approach. Maybe read [this question](http://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg). – Nick Whaley Jul 24 '13 at 17:09
  • 1
    Maybe it is a "quick and dirty" approach, but it has just saved me a bunch of coding time – Asier Sánchez Rodríguez Aug 11 '17 at 11:27
1

I am not an expert in XML, and in my case I just needed to convert all tag names in an HTML file to upper case for further manipulation in XmlDocument with GetElementsByTagName. I needed upper case because XmlDocument treats tag names as case-sensitive (since it is XML), and I could not guarantee that my HTML file used consistent case in its tag names.

So I solved it like this: I used XDocument as an intermediate step, where you can rename elements (i.e. change the tag name), and then loaded the result into an XmlDocument. Here is my VB.NET code (the C# code will be very similar).

Dim x As XDocument = XDocument.Load("myFile.html")
For Each element In x.Descendants()
  element.Name = element.Name.LocalName.ToUpper()
Next
Dim x2 As XmlDocument = New XmlDocument()
x2.LoadXml(x.ToString())

For my purpose it worked fine, though I understand that in certain cases this might not be a solution if you are dealing with a pure XML-file.
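As a rough C# counterpart applied to the original question, the sketch below uses the settable `XElement.Name` property to prepend each parent's name to its `Content` child. The `<doc>` wrapper element and the class name are assumptions made for illustration; they are not part of the question.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class RenameWithXDocument
{
    static void Main()
    {
        // Sample input modeled on the question, wrapped in an assumed <doc> root.
        XDocument doc = XDocument.Parse(
            "<doc>" +
              "<Stuff1><Content>someContent</Content><type>someType</type></Stuff1>" +
              "<Stuff2><Content>someContent</Content><type>someType</type></Stuff2>" +
            "</doc>");

        // Snapshot the matches with ToList() so renaming cannot disturb the query.
        foreach (XElement content in doc.Descendants("Content").ToList())
        {
            XElement parent = content.Parent;
            if (parent == null) continue; // a root-level Content has no parent name to prepend

            // XElement.Name is settable; attributes and children stay attached.
            content.Name = parent.Name.LocalName + content.Name.LocalName;
        }

        Console.WriteLine(doc);
    }
}
```

Because the element is renamed in place, there is no need to copy attributes or child nodes, which is the main difference from the XmlDocument-based approaches.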

Magnus
  • 1,584
  • 19
  • 14
0

I'll answer the broader question: why are you trying to do this with XmlDocument?

I think the best way to accomplish what you want is a simple XSLT file
that matches each "Content" node and outputs a "StuffxContent" node in its place.

I don't see a reason to bring out such heavy guns.

Either way, if you still wish to do it C#-style,
use XmlReader + XmlWriter rather than XmlDocument, for memory and speed reasons. XmlDocument stores the entire XML in memory, which makes it very heavy for a single traversal.

XmlDocument is good if you access the elements many times (not the situation here).
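A minimal sketch of the XmlReader + XmlWriter idea, assuming the input has no namespaces or DTD and that every `Content` element should receive its parent's name as a prefix. The class and method names (`StreamingRename`, `Rename`) and the `<doc>` wrapper in the sample are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class StreamingRename
{
    // Copies XML from input to output, renaming each <Content> element to
    // <ParentName>Content. Namespaces, DTDs and processing instructions are
    // not handled in this sketch.
    public static void Rename(TextReader input, TextWriter output)
    {
        var openElements = new Stack<string>(); // original names of open ancestors
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlReader reader = XmlReader.Create(input))
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                    {
                        bool isEmpty = reader.IsEmptyElement; // read before touching attributes
                        string name = reader.LocalName;
                        string newName = (name == "Content" && openElements.Count > 0)
                            ? openElements.Peek() + name
                            : name;

                        writer.WriteStartElement(newName);
                        writer.WriteAttributes(reader, false);
                        if (isEmpty) writer.WriteEndElement();
                        else openElements.Push(name);
                        break;
                    }
                    case XmlNodeType.EndElement:
                        openElements.Pop();
                        writer.WriteFullEndElement();
                        break;
                    case XmlNodeType.Text:
                        writer.WriteString(reader.Value);
                        break;
                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        writer.WriteWhitespace(reader.Value);
                        break;
                    case XmlNodeType.CDATA:
                        writer.WriteCData(reader.Value);
                        break;
                    case XmlNodeType.Comment:
                        writer.WriteComment(reader.Value);
                        break;
                }
            }
        }
    }

    static void Main()
    {
        string xml = "<doc><Stuff1><Content>someContent</Content><type>someType</type></Stuff1></doc>";
        var result = new StringWriter();
        Rename(new StringReader(xml), result);
        Console.WriteLine(result.ToString());
    }
}
```

Only the stack of open element names is kept in memory, so the cost does not grow with file size; for a large batch of files, `Rename` could be called with a `StreamReader` and `StreamWriter` per file.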

Tomer W
  • 3,395
  • 2
  • 29
  • 44
0

Perhaps a better solution would be to iterate through each node, and write the information out to a new document. Obviously, this will depend on how you will be using the data in future, but I'd recommend the same reformatting as FlySwat suggested...

<stuff id="1">
    <content/>
</stuff>

I'd also suggest that using the XDocument that was recently added would be the best way to go about creating the new document.
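One way this could look with XDocument, as a sketch only: each `StuffN` element is written to a new document as `<stuff id="N">`, with its children copied unchanged. The `<doc>` root, the `Stuff` prefix handling and the class name are assumptions; elements whose names do not start with `Stuff` are simply left out here.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class RestructureToAttributes
{
    static void Main()
    {
        // Sample input modeled on the question, wrapped in an assumed <doc> root.
        XDocument oldDoc = XDocument.Parse(
            "<doc>" +
              "<Stuff1><Content>someContent</Content><type>someType</type></Stuff1>" +
              "<Stuff2><Content>someContent</Content><type>someType</type></Stuff2>" +
            "</doc>");

        const string prefix = "Stuff";

        // Build a new document: <StuffN> becomes <stuff id="N">.
        // Nodes and attributes that already have a parent are cloned when added.
        var newDoc = new XDocument(
            new XElement("doc",
                oldDoc.Root.Elements()
                    .Where(e => e.Name.LocalName.StartsWith(prefix, StringComparison.Ordinal))
                    .Select(e => new XElement("stuff",
                        new XAttribute("id", e.Name.LocalName.Substring(prefix.Length)),
                        e.Attributes(),
                        e.Nodes()))));

        Console.WriteLine(newDoc);
    }
}
```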

aemus
  • 713
  • 4
  • 14
ZombieSheep
  • 29,603
  • 12
  • 67
  • 114
-4

Load it in as a string and do a replace on the whole lot..

    String sampleXml =
  "<doc>"+
    "<Stuff1>"+
      "<Content>someContent</Content>"+
      "<type>someType</type>"+
    "</Stuff1>"+
    "<Stuff2>"+
      "<Content>someContent</Content>"+
      "<type>someType</type>"+
    "</Stuff2>"+
    "<Stuff3>"+
      "<Content>someContent</Content>"+
      "<type>someType</type>"+
    "</Stuff3>"+
  "</doc>"; 

    sampleXml = sampleXml.Replace("Content","StuffxContent");
timothy
  • 568
  • 8
  • 9
  • Bad solution, and the 'x' was clearly a placeholder in the initial question to refer to whichever numbered Stuff node was this Content node's parent. – nacitar sevaht May 10 '19 at 18:47
-33

The XML you have provided shows that someone completely misses the point of XML.

Instead of having

<stuff1>
   <content/>
</stuff1>

You should have:

<stuff id="1">
    <content/>
</stuff>

Now you would be able to traverse the document using XPath (e.g. //stuff[@id='1']/content). The names of nodes should not be used to establish identity; you use attributes for that.
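A small sketch of such an attribute-based lookup with XmlDocument. In XPath an attribute is addressed with `@`, so the expression is `//stuff[@id='1']/content`. The sample document and class name are made up for illustration.

```csharp
using System;
using System.Xml;

class AttributeLookup
{
    static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<doc>" +
              "<stuff id=\"1\"><content>a</content></stuff>" +
              "<stuff id=\"2\"><content>b</content></stuff>" +
            "</doc>");

        // Select the content element of the stuff element whose id attribute is 1.
        XmlNode content = doc.SelectSingleNode("//stuff[@id='1']/content");

        Console.WriteLine(content != null ? content.InnerText : "(not found)"); // prints "a"
    }
}
```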

To do what you asked, load the XML into an xml document, and simply iterate through the first level of child nodes renaming them.

PseudoCode:

foreach (XmlNode n in YourDoc.ChildNodes)
{        
    n.ChildNode[0].Name = n.Name + n.ChildNode[0].Name;
}

YourDoc.Save();

However, I'd strongly recommend you actually fix the XML so that it is useful, instead of wrecking it further.

FlySwat
  • 172,459
  • 74
  • 246
  • 311
  • Thanks for your answer! The XML has a very different (and more complicated) schema than what I showed in my question. I was trying to simplify it for the question :). – sundeep Jan 24 '09 at 02:56
  • 6
    I'm failing to understand why this is marked as correct, since as DeepBlue says below, the Name property is read only. Fascinating that it received 11 upvotes... – Matthew Talbert Nov 09 '09 at 07:33
  • I agree. Have you ever had to work with any XML from Apple? It's all like this and an utter pain to parse... – ZombieSheep Nov 23 '09 at 11:14
  • @ZombieSheep...that's not XML that's plist...p for property or pain...whatever you like more – Scoregraphic Nov 23 '09 at 11:18
  • 1
    @sundeep You should really reconsider to unmark this answer as correct. It just isn't – nl-x Apr 24 '15 at 08:52
  • The answer is incorrect, even though it claims it is only pseudo code. It needs a fix as Name is readonly – Nic Nov 12 '15 at 14:42
  • 1
    **If you have 20k, know the technology and know that the answer is essentially bad, please vote to delete.** – peterh Jun 16 '19 at 12:04