42

I want to be able to write XML to a String with the declaration and with UTF-8 encoding. This seems mighty tricky to accomplish.

I have read around a bit and tried some of the popular answers for this, but they all have issues. My current code correctly outputs UTF-8 but does not preserve the original formatting of the XDocument (i.e. indents / whitespace)!

Can anyone offer some advice please?

XDocument xml = new XDocument(new XDeclaration("1.0", "utf-8", "yes"), xelementXML);

MemoryStream ms = new MemoryStream();
using (XmlWriter xw = new XmlTextWriter(ms, Encoding.UTF8))
{
    xml.Save(xw);
    xw.Flush();

    StreamReader sr = new StreamReader(ms);
    ms.Seek(0, SeekOrigin.Begin);

    String xmlString = sr.ReadToEnd();
}

The formatting needs to be identical to what .ToString() would produce, i.e.

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<root>
    <node>blah</node>
</root>

What I'm currently seeing is:

<?xml version="1.0" encoding="utf-8" standalone="yes"?><root><node>blah</node></root>

Update: I have managed to get this to work by passing XmlWriterSettings to the writer... It seems VERY clunky though!

MemoryStream ms = new MemoryStream();
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = Encoding.UTF8;
settings.ConformanceLevel = ConformanceLevel.Document;
settings.Indent = true;
using (XmlWriter xw = XmlTextWriter.Create(ms, settings))
{
    xml.Save(xw);
    xw.Flush();

    StreamReader sr = new StreamReader(ms);
    ms.Seek(0, SeekOrigin.Begin);
    String blah = sr.ReadToEnd();
}
Peter
Chris
  • What 'formatting'? You haven't said anything about formatting! – AakashM Oct 06 '10 at 10:58
  • The usual whitespace / formatting that you get if you just do a `.ToString()` on an `XDocument` or `XElement` – Chris Oct 06 '10 at 10:59
  • Please give a sample input document so we can test answers. – Jon Skeet Oct 06 '10 at 11:03
  • @Jon - Done... It is just the whitespace formatting I am bothered about, as I later hash the XML and so need to be 100% sure the output is consistent. – Chris Oct 06 '10 at 11:07
  • I've provided a rather simpler way of doing it. – Jon Skeet Oct 06 '10 at 11:11
  • The title and accepted answer suggest this is about UTF-8 versus UTF-16, but his own solution shows it is about formatting / pretty-printing instead. Instead of the settings, he could simply do `xw.Formatting = Formatting.Indented`. – Roland Sep 14 '16 at 10:57
  • A related thread is [How to print `<?xml version="1.0"?>` using `XDocument`](https://stackoverflow.com/questions/957124/). – Jeppe Stig Nielsen Nov 16 '18 at 14:58

3 Answers

74

Try this:

using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

class Test
{
    static void Main()
    {
        XDocument doc = XDocument.Load("test.xml",
                                       LoadOptions.PreserveWhitespace);
        doc.Declaration = new XDeclaration("1.0", "utf-8", null);
        StringWriter writer = new Utf8StringWriter();
        doc.Save(writer, SaveOptions.None);
        Console.WriteLine(writer);
    }

    private class Utf8StringWriter : StringWriter
    {
        public override Encoding Encoding { get { return Encoding.UTF8; } }
    }
}

Of course, you haven't shown us how you're building the document, which makes it hard to test... I've just tried with a hand-constructed XDocument and that contains the relevant whitespace too.
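
For reference, here is a minimal sketch of the same approach using a hand-constructed document, so that it runs without a test.xml file. The `XmlStringDemo` class and `ToUtf8String` helper are names made up for this illustration; the root/node content is copied from the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class XmlStringDemo
{
    // A plain StringWriter reports UTF-16. Overriding Encoding only changes
    // what XDocument.Save writes into the XML declaration; the result is
    // still an ordinary .NET string.
    private sealed class Utf8StringWriter : StringWriter
    {
        public override Encoding Encoding { get { return Encoding.UTF8; } }
    }

    // Hypothetical helper: serializes the document, declaration included.
    public static string ToUtf8String(XDocument doc)
    {
        using (StringWriter writer = new Utf8StringWriter())
        {
            // SaveOptions.None keeps the default indented formatting.
            doc.Save(writer, SaveOptions.None);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        XDocument doc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement("root", new XElement("node", "blah")));
        Console.WriteLine(ToUtf8String(doc));
    }
}
```

The declaration in the returned string should read encoding="utf-8", with the elements indented on separate lines.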

Jon Skeet
  • Works a treat, thanks - is there no way to get the encoding sorted without inheriting from StringWriter? – Chris Oct 06 '10 at 11:12
  • @Chris: It's *possible* that there is some way of getting the TextWriter overload to ignore the encoding that the TextWriter advertises, but I've found this to be a really simple hack to get the job done. (You only need it in one place...) – Jon Skeet Oct 06 '10 at 11:13
  • 1
    Yeah I like it - it's FAR better than the method I came up with. Thanks – Chris Oct 06 '10 at 11:15
1

Try XmlWriterSettings:

XmlWriterSettings xws = new XmlWriterSettings();
xws.OmitXmlDeclaration = false;
xws.Indent = true;

Then pass it to the writer:

using (XmlWriter xw = XmlWriter.Create(sb, xws))
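
The snippet above does not show what `sb` is; assuming it is a StringBuilder, the declaration would likely report utf-16, because .NET strings are UTF-16. A rough sketch of the same settings-based idea that writes to a MemoryStream instead, so the declaration says utf-8, might look like this. The `XmlSettingsDemo` class and `ToIndentedUtf8String` helper are hypothetical names, not part of the answer.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class XmlSettingsDemo
{
    // Hypothetical helper: indented output with a utf-8 declaration.
    public static string ToIndentedUtf8String(XDocument doc)
    {
        XmlWriterSettings xws = new XmlWriterSettings();
        xws.OmitXmlDeclaration = false;
        xws.Indent = true;
        // UTF-8 without a byte order mark, so the decoded string
        // does not start with an invisible U+FEFF character.
        xws.Encoding = new UTF8Encoding(false);

        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter xw = XmlWriter.Create(ms, xws))
            {
                doc.Save(xw);
            } // disposing the writer flushes everything into the stream

            return xws.Encoding.GetString(ms.ToArray());
        }
    }
}
```

Leaving out the byte order mark matters if the string is hashed later, as the asker mentions in the comments, since a leading U+FEFF would change the hash.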
KMån
0

See also https://stackoverflow.com/a/3288376/1430535

return xdoc.Declaration.ToString() + Environment.NewLine + xdoc.ToString();
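
One caveat with this one-liner: `xdoc.Declaration` is null when the document was built without a declaration, so the call would throw. A small null-safe sketch, with `DeclarationDemo` and `WithDeclaration` as made-up names:

```csharp
using System;
using System.Xml.Linq;

public static class DeclarationDemo
{
    // Hypothetical helper: prepends the declaration only when one exists.
    public static string WithDeclaration(XDocument xdoc)
    {
        // XDocument.ToString() never includes the declaration itself.
        string body = xdoc.ToString();
        if (xdoc.Declaration == null)
        {
            return body;
        }
        return xdoc.Declaration.ToString() + Environment.NewLine + body;
    }
}
```

The encoding shown is simply whatever the XDeclaration was constructed with, so this approach leaves it to the caller to set "utf-8" there.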
Polluks