119

When I build XML up from scratch with XmlDocument, the OuterXml property already has everything nicely indented with line breaks. However, if I call LoadXml on some very "compressed" XML (no line breaks or indentation), then the output of OuterXml stays that way. So ...

What is the simplest way to get beautified XML output from an instance of XmlDocument?

Neil C. Obremski

12 Answers

228

Based on the other answers, I looked into XmlTextWriter and came up with the following helper method:

static public string Beautify(this XmlDocument doc)
{
    StringBuilder sb = new StringBuilder();
    XmlWriterSettings settings = new XmlWriterSettings
    {
        Indent = true,
        IndentChars = "  ",
        NewLineChars = "\r\n",
        NewLineHandling = NewLineHandling.Replace
    };
    using (XmlWriter writer = XmlWriter.Create(sb, settings)) {
        doc.Save(writer);
    }
    return sb.ToString();
}

It's a bit more code than I hoped for, but it works just peachy.

Bakudan
Neil C. Obremski
  • 6
    You might even consider creating your utility method as an extension method to the XmlDocument class. – Oppositional Dec 02 '08 at 15:30
  • 6
    Oddly enough, for me this does nothing except setting the xml header's encoding to UTF-16. Strangely enough, it does this even if I explicitly set `settings.Encoding = Encoding.UTF8;` – Nyerguds May 13 '13 at 13:43
  • 3
    The encoding problem can be solved by using a `MemoryStream` + `StreamWriter` with a specified encoding instead of the `StringBuilder`, and getting the text with `enc.GetString(memstream.GetBuffer(), 0, (int)memstream.Length);`. The end result is still in no way formatted, though. Could it be related that I'm starting from a read document which already has formatting? I just want my new nodes to be formatted as well. – Nyerguds May 13 '13 at 14:09
  • 2
    I'm tempted to change the `"\r\n"` to `Environment.NewLine`. – Pharap Dec 04 '15 at 06:34
  • You don't need the setting options about new line. The ones about indentation seem to be enough. Two spaces are also the default, so you'd simply need `Indent = true`. I prefer tabs though so I also need `IndentChars = "\t"`. (Tabs save space, too.) – ygoe Dec 04 '15 at 14:10
  • 3
    `doc.PreserveWhitespace` should not be set to true; otherwise the formatting fails if the document already contains partial indentation. – Master DJon Jan 30 '19 at 19:29
52

As adapted from Erika Ehrli's blog, this should do it:

XmlDocument doc = new XmlDocument();
doc.LoadXml("<item><name>wrench</name></item>");
// Save the document to a file and auto-indent the output.
using (XmlTextWriter writer = new XmlTextWriter("data.xml", null)) {
    writer.Formatting = Formatting.Indented;
    doc.Save(writer);
}
DocMax
  • 10
    the closing of the `using` statement will automatically close the writer when `Dispose()` is called. – Tyler Lee Sep 28 '15 at 17:27
  • 3
    For me, this only indents one line. I still have dozens of other lines that are not indented. – C.J. Nov 09 '16 at 16:06
43

Or even easier if you have access to Linq

try
{
    RequestPane.Text = System.Xml.Linq.XElement.Parse(RequestPane.Text).ToString();
}
catch (System.Xml.XmlException xex)
{
    displayException("Problem with formatting text in Request Pane: ", xex);
}
JFK
  • very nice! *thumbs up* advantage over accepted answer is that it won't produce an XML comment so works better for an XML fragment – Umar Farooq Khawaja Oct 13 '14 at 13:54
  • 3
    Oddly, this removes the `<?xml ...>` and the `<!DOCTYPE ...>` from the XML. OK for a fragment, but not desirable for a full document. – Jesse Chisholm Oct 06 '15 at 22:01
  • This is the only way which worked for me. All of the other methods using xmltextwriter, Formatting = Formatting.Indented and XmlWriterSettings does NOT reformat the text, but this method does. – kexx Oct 23 '16 at 19:21
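If you want the LINQ approach for a full document but still need the XML declaration, one option is to parse with XDocument and prepend the declaration yourself, since ToString() does not emit it. This is only a sketch: the class and method names (XDocumentFormatting, FormatKeepingDeclaration) are made up for illustration and are not from the answer above.

```csharp
using System;
using System.Xml.Linq;

public static class XDocumentFormatting
{
    // Hypothetical helper: pretty-prints a full document and keeps the
    // XML declaration when the input had one.
    public static string FormatKeepingDeclaration(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // ToString() indents by default but leaves out the declaration.
        string body = doc.ToString();

        // Declaration is null when the input had no <?xml ...?> line.
        if (doc.Declaration == null)
        {
            return body;
        }

        return doc.Declaration.ToString() + Environment.NewLine + body;
    }
}
```

Calling `XDocumentFormatting.FormatKeepingDeclaration(xml)` on a fragment without a declaration simply returns the indented fragment, so it should behave like the XElement version in that case.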
22

A shorter extension method version

public static string ToIndentedString( this XmlDocument doc )
{
    var stringWriter = new StringWriter(new StringBuilder());
    var xmlTextWriter = new XmlTextWriter(stringWriter) {Formatting = Formatting.Indented};
    doc.Save( xmlTextWriter );
    return stringWriter.ToString();
}
Uwe Keim
Jonathan Mitchem
14

If the above Beautify method is being called for an XmlDocument that already contains an XmlProcessingInstruction child node the following exception is thrown:

Cannot write XML declaration. WriteStartDocument method has already written it.

This is my modified version of the original one to get rid of the exception:

private static string beautify(
    XmlDocument doc)
{
    var sb = new StringBuilder();
    var settings =
        new XmlWriterSettings
            {
                Indent = true,
                IndentChars = @"    ",
                NewLineChars = Environment.NewLine,
                NewLineHandling = NewLineHandling.Replace,
            };

    using (var writer = XmlWriter.Create(sb, settings))
    {
        if (doc.ChildNodes[0] is XmlProcessingInstruction)
        {
            doc.RemoveChild(doc.ChildNodes[0]);
        }

        doc.Save(writer);
        return sb.ToString();
    }
}

It works for me now, although you would probably need to scan all child nodes for the XmlProcessingInstruction node, not just the first one.
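A possible way to scan all top-level nodes rather than only the first is sketched below. It assumes the offending node is a processing instruction whose target is "xml" and leaves other processing instructions (such as xml-stylesheet) alone; the names XmlBeautifier and BeautifyWithoutXmlPi are invented for this example. Like the version above, it modifies the document that is passed in.

```csharp
using System.Linq;
using System.Text;
using System.Xml;

public static class XmlBeautifier
{
    // Hypothetical variant of beautify(): removes every top-level
    // processing instruction with the target "xml" before saving.
    public static string BeautifyWithoutXmlPi(XmlDocument doc)
    {
        // Collect first, then remove, so the child list is not
        // modified while it is being enumerated.
        var xmlPis = doc.ChildNodes
            .OfType<XmlProcessingInstruction>()
            .Where(pi => pi.Target == "xml")
            .ToList();

        foreach (var pi in xmlPis)
        {
            doc.RemoveChild(pi);
        }

        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true };

        using (var writer = XmlWriter.Create(sb, settings))
        {
            doc.Save(writer);
        }

        return sb.ToString();
    }
}
```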


Update April 2015:

Since I had another case where the encoding was wrong, I searched for how to enforce UTF-8 without BOM. I found this blog post and created a function based on it:

private static string beautify(string xml)
{
    var doc = new XmlDocument();
    doc.LoadXml(xml);

    var settings = new XmlWriterSettings
    {
        Indent = true,
        IndentChars = "\t",
        NewLineChars = Environment.NewLine,
        NewLineHandling = NewLineHandling.Replace,
        Encoding = new UTF8Encoding(false)
    };

    using (var ms = new MemoryStream())
    using (var writer = XmlWriter.Create(ms, settings))
    {
        doc.Save(writer);
        var xmlString = Encoding.UTF8.GetString(ms.ToArray());
        return xmlString;
    }
}
Uwe Keim
  • it will not work if you put cdata section inside parent node and before child node – Sasha Bond May 01 '18 at 18:45
  • 2
    MemoryStream doesn't seem to be needed, at least on my side. In settings I set : `Encoding = Encoding.UTF8` and `OmitXmlDeclaration = true` – Master DJon Jan 31 '19 at 19:55
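Another way to get a UTF-8 declaration while still returning a string is to hand XmlWriter a StringWriter that reports UTF-8 as its encoding, because the writer takes the declared encoding from the TextWriter it is given. The sketch below assumes that is all you need; Utf8StringWriter and BeautifyAsUtf8 are hypothetical names, not part of any answer here.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper: a StringWriter that reports UTF-8 instead of UTF-16.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class Utf8XmlFormatting
{
    // Returns indented XML whose declaration says encoding="utf-8".
    public static string BeautifyAsUtf8(XmlDocument doc)
    {
        var settings = new XmlWriterSettings { Indent = true };

        using (var stringWriter = new Utf8StringWriter())
        {
            using (var writer = XmlWriter.Create(stringWriter, settings))
            {
                doc.Save(writer);
            }

            // Read the text only after the XmlWriter has been disposed.
            return stringWriter.ToString();
        }
    }
}
```

Note that the returned value is still an ordinary .NET string; only the declaration changes. If you write it to disk, you still have to pick UTF-8 as the file encoding yourself.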
7
XmlTextWriter xw = new XmlTextWriter(writer);
xw.Formatting = Formatting.Indented;
benPearce
6
    public static string FormatXml(string xml)
    {
        try
        {
            var doc = XDocument.Parse(xml);
            return doc.ToString();
        }
        catch (Exception)
        {
            return xml;
        }
    }
rewrew
  • The answer below could definitely do with some explanation however it worked for me and is much simpler than the other solutions. – CarlR Jan 19 '15 at 13:22
  • It seems you need to import the `System.Xml.Linq` assembly for this to work on PS 3. – CarlR Feb 13 '15 at 21:52
2

A simple way is to use:

writer.WriteRaw(space_char);

For example, this is the code I used to create a tree-view-like structure using XmlWriter:

private void generateXML(string filename)
{
    using (XmlWriter writer = XmlWriter.Create(filename))
    {
        writer.WriteStartDocument();
        // new line
        writer.WriteRaw("\n");
        writer.WriteStartElement("treeitems");
        // new line
        writer.WriteRaw("\n");
        foreach (RootItem root in roots)
        {
            // indent
            writer.WriteRaw("\t");
            writer.WriteStartElement("treeitem");
            writer.WriteAttributeString("name", root.name);
            writer.WriteAttributeString("uri", root.uri);
            writer.WriteAttributeString("fontsize", root.fontsize);
            writer.WriteAttributeString("icon", root.icon);
            if (root.children.Count != 0)
            {
                // iterate over this root's own children
                foreach (ChildItem child in root.children)
                {
                    // indent
                    writer.WriteRaw("\t");
                    writer.WriteStartElement("treeitem");
                    writer.WriteAttributeString("name", child.name);
                    writer.WriteAttributeString("uri", child.uri);
                    writer.WriteAttributeString("fontsize", child.fontsize);
                    writer.WriteAttributeString("icon", child.icon);
                    writer.WriteEndElement();
                    // new line
                    writer.WriteRaw("\n");
                }
            }
            writer.WriteEndElement();
            // new line
            writer.WriteRaw("\n");
        }

        writer.WriteEndElement();
        writer.WriteEndDocument();
    }
}

This way you can add tabs or line breaks the way you are used to, i.e. with \t or \n.

Munim
2

When implementing the suggestions posted here, I had trouble with the text encoding. It seems the encoding of the XmlWriterSettings is ignored and always overridden by the encoding of the stream. When using a StringBuilder, this is always the text encoding that .NET strings use internally, namely UTF-16.

So here's a version which supports other encodings as well.

IMPORTANT NOTE: The formatting is completely ignored if your XmlDocument object has its PreserveWhitespace property enabled when loading the document. This had me stumped for a while, so make sure not to enable that.

My final code:

public static void SaveFormattedXml(XmlDocument doc, String outputPath, Encoding encoding)
{
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.IndentChars = "\t";
    settings.NewLineChars = "\r\n";
    settings.NewLineHandling = NewLineHandling.Replace;

    using (MemoryStream memstream = new MemoryStream())
    using (StreamWriter sr = new StreamWriter(memstream, encoding))
    using (XmlWriter writer = XmlWriter.Create(sr, settings))
    using (FileStream fileWriter = new FileStream(outputPath, FileMode.Create))
    {
        if (doc.ChildNodes.Count > 0 && doc.ChildNodes[0] is XmlProcessingInstruction)
            doc.RemoveChild(doc.ChildNodes[0]);
        // save xml to XmlWriter made on encoding-specified text writer
        doc.Save(writer);
        // Flush the streams (not sure if this is really needed for pure mem operations)
        writer.Flush();
        // Write the underlying stream of the XmlWriter to file.
        fileWriter.Write(memstream.GetBuffer(), 0, (Int32)memstream.Length);
    }
}

This will save the formatted xml to disk, with the given text encoding.

Nyerguds
  • 1
    The fact that preserveWhitespace breaks the formatting functionality of XmlWriter is a crucial piece of information - this was tripping me up for quite a while. Thanks! – dwillis77 Apr 19 '21 at 20:26
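To see the PreserveWhitespace effect for yourself, a small comparison like the following may help. It is only a sketch with made-up names (PreserveWhitespaceDemo, LoadAndIndent); it loads the same partially formatted XML with the property on or off and saves it through an indenting XmlWriter.

```csharp
using System.Text;
using System.Xml;

public static class PreserveWhitespaceDemo
{
    // Hypothetical helper: loads the XML with the given PreserveWhitespace
    // setting and returns what an indenting XmlWriter produces for it.
    public static string LoadAndIndent(string xml, bool preserveWhitespace)
    {
        // The property must be set before LoadXml to affect parsing.
        var doc = new XmlDocument { PreserveWhitespace = preserveWhitespace };
        doc.LoadXml(xml);

        var sb = new StringBuilder();
        var settings = new XmlWriterSettings
        {
            Indent = true,
            NewLineChars = "\n",
            OmitXmlDeclaration = true
        };

        using (var writer = XmlWriter.Create(sb, settings))
        {
            doc.Save(writer);
        }

        return sb.ToString();
    }
}
```

Comparing `LoadAndIndent("<a>\n<b>x</b></a>", false)` with `LoadAndIndent("<a>\n<b>x</b></a>", true)` should show the difference: in the first call the stray line break is discarded and the writer indents `<b>`, while in the second the preserved whitespace node is written back as-is, which appears to be what stops the writer from adding its own indentation.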
1

If you have a string of XML, rather than a doc ready for use, you can do it this way:

var xmlString = "<xml>...</xml>"; // Your original XML string that needs indenting.
xmlString = this.PrettifyXml(xmlString);

private string PrettifyXml(string xmlString)
{
    var prettyXmlString = new StringBuilder();

    var xmlDoc = new XmlDocument();
    xmlDoc.LoadXml(xmlString);

    var xmlSettings = new XmlWriterSettings()
    {
        Indent = true,
        IndentChars = " ",
        NewLineChars = "\r\n",
        NewLineHandling = NewLineHandling.Replace
    };

    using (XmlWriter writer = XmlWriter.Create(prettyXmlString, xmlSettings))
    {
        xmlDoc.Save(writer);
    }

    return prettyXmlString.ToString();
}
MiniRagnarok
theJerm
1

A more simplified approach based on the accepted answer:

static public string Beautify(this XmlDocument doc) {
    StringBuilder sb = new StringBuilder();
    XmlWriterSettings settings = new XmlWriterSettings
    {
        Indent = true
    };

    using (XmlWriter writer = XmlWriter.Create(sb, settings)) {
        doc.Save(writer);
    }

    return sb.ToString(); 
}

Setting the new line characters is not necessary. The indent characters also default to two spaces, so I preferred not to set them either.

d.i.joe
0

Set PreserveWhitespace to true before Load.

var document = new XmlDocument();
document.PreserveWhitespace = true;
document.Load(filename);
cSharper