Escaping of Unicode characters is not modeled or controlled by XmlDocument. Instead, XmlWriter will escape characters in character data and attribute values that are not supported by the current encoding, as specified by XmlWriterSettings.Encoding, at the time the document is written to a stream. If you want all "special characters" such as the En Dash to be escaped, choose a very restrictive encoding such as Encoding.ASCII.
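As a minimal sketch of that behavior, independent of XmlDocument, the snippet below writes one element straight through an XmlWriter configured for ASCII. The class name AsciiEscapeDemo, the method WriteSample, and the attribute name "note" are made up for illustration; only the XmlWriter and XmlWriterSettings calls come from the framework.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class AsciiEscapeDemo
{
    // Writes a single element whose attribute value and text both contain
    // an En Dash (U+2013), using an ASCII-encoded XmlWriter.
    public static string WriteSample()
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,   // restrictive encoding: anything non-ASCII must be escaped
            OmitXmlDeclaration = true,
        };
        using var stream = new MemoryStream();
        using (var writer = XmlWriter.Create(stream, settings))
        {
            writer.WriteStartElement("inf");
            writer.WriteAttributeString("note", "a\u2013b");
            writer.WriteString("\u2013CO\u2013OR");
            writer.WriteEndElement();
        }
        // The bytes are pure ASCII, so decoding them as ASCII is lossless.
        return Encoding.ASCII.GetString(stream.ToArray());
    }

    public static void Main()
    {
        Console.WriteLine(WriteSample());
    }
}
```

Both the attribute value and the element text should come out with the En Dash replaced by a numeric character reference, because those are the two contexts in which XmlWriter is allowed to escape.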
To do this easily, create the following extension methods:
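using System.IO;
using System.Text;
using System.Xml;

public static class XmlSerializationHelper
{
    // Serializes the node to a string, escaping characters the encoding cannot represent.
    public static string GetOuterXml(this XmlNode node, bool indent = false, Encoding encoding = null, bool omitXmlDeclaration = false)
    {
        if (node == null)
            return null;
        using var stream = new MemoryStream();
        // Keep the stream open so it can be rewound and read back below.
        node.Save(stream, indent : indent, encoding : encoding, omitXmlDeclaration : omitXmlDeclaration, closeOutput : false);
        stream.Position = 0;
        // Decode with the same encoding that was used for writing.
        using var reader = new StreamReader(stream, encoding ?? Encoding.UTF8);
        return reader.ReadToEnd();
    }

    public static void Save(this XmlNode node, Stream stream, bool indent = false, Encoding encoding = null, bool omitXmlDeclaration = false, bool closeOutput = true) =>
        node.Save(stream, new XmlWriterSettings
        {
            Indent = indent,
            // XmlWriterSettings defaults to UTF-8; fall back to it when no encoding is given.
            Encoding = encoding ?? Encoding.UTF8,
            OmitXmlDeclaration = omitXmlDeclaration,
            CloseOutput = closeOutput,
        });

    public static void Save(this XmlNode node, Stream stream, XmlWriterSettings settings)
    {
        // Disposing the writer flushes it; the stream is closed only if settings.CloseOutput is true.
        using (var xmlWriter = XmlWriter.Create(stream, settings))
        {
            node.WriteTo(xmlWriter);
        }
    }
}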
Now you can do the following to serialize an XmlDocument to a string with non-ASCII characters escaped:
// Construct your XmlDocument (not shown in the question)
var xmlDoc = new XmlDocument();
xmlDoc.LoadXml("<Root></Root>");
var eqnPartElm = xmlDoc.CreateElement("inf");
xmlDoc.DocumentElement.AppendChild(eqnPartElm);
// Add some non-ASCII text (here – is an En Dash character).
eqnPartElm.InnerText = "–CO–OR";
// Output to XML and escape all non-ASCII characters.
var xml = xmlDoc.GetOuterXml(indent : true, encoding : Encoding.ASCII, omitXmlDeclaration : true);
To serialize to a Stream, do:
// FileMode.Create truncates an existing file, so no stale bytes are left after the new content.
using (var stream = new FileStream(fileName, FileMode.Create))
{
    xmlDoc.Save(stream, indent : true, encoding : Encoding.ASCII, omitXmlDeclaration : true);
}
And the following XML will be created:
<Root>
  <inf>&#x2013;CO&#x2013;OR</inf>
</Root>
Notes:
You must use the new XmlWriter (as returned by XmlWriter.Create()), not the old XmlTextWriter, as the latter does not support replacing unsupported characters with escaped fallbacks.
Some parts of an XML document, including element and attribute names and comment text, do not support inclusion of character entities. If you attempt to write an unsupported character in such a situation, XmlWriter will throw an exception.
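The second note can be checked with a small sketch like the one below, which tries to write an element whose name contains a character that ASCII cannot represent. The class NameEscapeDemo and method TryWriteNonAsciiName are hypothetical names. The sketch deliberately catches the base Exception type and reports whatever was thrown, since the exact exception type and the point at which it surfaces (possibly only when the writer flushes on dispose) are not stated above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class NameEscapeDemo
{
    // Returns the type name of the exception raised while writing,
    // or null if the write unexpectedly succeeded.
    public static string TryWriteNonAsciiName()
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,
            OmitXmlDeclaration = true,
        };
        try
        {
            using var stream = new MemoryStream();
            using (var writer = XmlWriter.Create(stream, settings))
            {
                // "caf\u00E9" is a legal XML name, but U+00E9 has no ASCII
                // representation and a name cannot hold a character reference.
                writer.WriteStartElement("caf\u00E9");
                writer.WriteString("text");
                writer.WriteEndElement();
            } // the failure may only surface here, when the writer flushes
            return null;
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryWriteNonAsciiName() ?? "no exception");
    }
}
```

If you need non-ASCII names in the output, the practical options are to pick an encoding that can represent them or to rename the elements; escaping is not available in that position.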