I have an XmlDocument and I am saving it with an XmlWriter, following this post. Although I set the Encoding to UTF-8 and the file is in fact saved with UTF-8 encoding, the XML declaration in the file has "utf-16" as the value of its encoding attribute.

I can't see where the error is in my code:

StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings
{
    Encoding=Encoding.UTF8
};
using (XmlWriter writer = XmlWriter.Create(sb, settings))
{
    xDoc.Save(writer);
}
using (
    StreamWriter sw = new StreamWriter(
        new FileStream(strXmlName, FileMode.Create, FileAccess.Write),
        Encoding.UTF8
    )
)
{
    sw.Write(sb.ToString());
}
ib11
  • Okay, after testing it myself: no, it totally ignores the UTF-8 setting on the writer and still uses UTF-16 encoding (you can verify that yourself with the saved files). There are a few workarounds in [this question](https://stackoverflow.com/questions/9459184/why-is-the-xmlwriter-always-outputting-utf-16-encoding) – Andrew Feb 18 '19 at 06:03
  • Use `Utf8StringWriter` from [this answer](https://stackoverflow.com/a/3862106) to [Serializing an object as UTF-8 XML in .NET](https://stackoverflow.com/q/3862063) and also the accepted answer to [Force XDocument to write to String with UTF-8 encoding](https://stackoverflow.com/a/3871822). – dbc Feb 18 '19 at 06:48
  • In fact I think this question is a duplicate of those two; agree? – dbc Feb 18 '19 at 06:54
  • The source XML file that you are reading in contains the utf-16 attribute, and nothing in your code changes it. So when you write the document you get the same attribute that you read. – jdweng Feb 18 '19 at 09:51
  • @jdweng the declaration is written by `XmlWriter`, and the value of its encoding will depend on its configuration. [The question @dbc links to](https://stackoverflow.com/questions/3862063/serializing-an-object-as-utf-8-xml-in-net) explains why this is always UTF-16 in this case. – Charles Mager Feb 18 '19 at 13:38

1 Answer

The reason for this is covered in the question @dbc links to in the comments: The overload of XmlWriter.Create that accepts a StringBuilder will create a StringWriter, which has its encoding set to UTF-16.
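
If you do need the XML as a string, the linked answers suggest a `StringWriter` subclass that reports UTF-8. The following is only a minimal sketch of that idea: the `Utf8StringWriter` name comes from the linked answer, while the `Demo` class, the `Serialize` helper and the `<root />` sample document are made up for illustration. I would expect the first call to produce a declaration with utf-16 and the second one with utf-8, but do check it against your own document.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// StringWriter always reports UTF-16; this subclass only overrides the
// reported encoding, which XmlWriter uses for the XML declaration.
public sealed class Utf8StringWriter : StringWriter
{
    public Utf8StringWriter(StringBuilder sb) : base(sb) { }

    public override Encoding Encoding => Encoding.UTF8;
}

public static class Demo
{
    // Hypothetical helper: saves the document through an XmlWriter that
    // wraps the given TextWriter and returns the text that was written.
    public static string Serialize(XmlDocument doc, TextWriter target)
    {
        using (XmlWriter writer = XmlWriter.Create(target))
        {
            doc.Save(writer);
        }
        return target.ToString();
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root />"); // sample content, not from the question

        // Plain StringWriter: the declaration follows StringWriter.Encoding.
        Console.WriteLine(Serialize(doc, new StringWriter()));

        // Subclass reporting UTF-8: the declaration follows the override.
        Console.WriteLine(Serialize(doc, new Utf8StringWriter(new StringBuilder())));
    }
}
```

Note that this only changes what the declaration says. The string in memory is still a .NET string, so the actual bytes are decided by whatever writes it to disk, which in the question is the `StreamWriter` with `Encoding.UTF8`.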

However, in this case it's not clear why you're using a StringBuilder when your goal is to write to a file. You could create an XmlWriter for the file directly:

var settings = new XmlWriterSettings
{
    Indent = true
};

using (var writer = XmlWriter.Create(strXmlName, settings))
{
    xDoc.WriteTo(writer);
}

The encoding here will default to UTF-8.

As an aside, I'd suggest you check out the much newer XDocument and friends; it's a much friendlier API than XmlDocument.
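
For comparison, a rough sketch of the same task with XDocument might look like the following. The element names, the `XDocumentDemo` class, the `SaveAndRead` helper and the temp-file path are all assumptions for illustration, not anything from the question.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class XDocumentDemo
{
    // Hypothetical helper: builds a small document, saves it straight to
    // the given path and returns the file's text for inspection.
    public static string SaveAndRead(string path)
    {
        var doc = new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("root",
                new XElement("item", "value")));

        // Save(string) writes directly to the file, no StringBuilder needed.
        doc.Save(path);
        return File.ReadAllText(path);
    }

    public static void Main()
    {
        // Assumed output location; substitute your own file name.
        string path = Path.Combine(Path.GetTempPath(), "xdocument-demo.xml");
        Console.WriteLine(SaveAndRead(path));
    }
}
```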

Charles Mager
  • Thank you. What is the actual difference between `xDoc.Save()` and `xDoc.WriteTo()`? – ib11 Feb 19 '19 at 00:58
  • @ib11 - see [this comment](https://stackoverflow.com/questions/750198/convert-xdocument-to-stream#comment2331887_750250). – dbc Feb 19 '19 at 09:47