I am using the following routine to format an XML file:
    // Requires: using System; using System.Text; using System.Xml; using System.Xml.Linq;
    public static string FormatXml(string xml, bool clean = true)
    {
        // Nothing to format (also guards against a null input).
        if (string.IsNullOrWhiteSpace(xml))
        {
            return "";
        }

        var stringBuilder = new StringBuilder();
        try
        {
            // CleanXml is my own helper (not shown here).
            string modifiedXml = clean ? CleanXml(xml) : xml;

            // Throws XmlException if the input is not well-formed.
            var element = XElement.Parse(modifiedXml);

            var settings = new XmlWriterSettings
            {
                OmitXmlDeclaration = true,
                Indent = true,
                NewLineOnAttributes = true
            };

            using (var xmlWriter = XmlWriter.Create(stringBuilder, settings))
            {
                element.Save(xmlWriter);
            }

            return stringBuilder.ToString();
        }
        catch (Exception)
        {
            // Formatting failed: fall back to the original, unformatted text.
            return xml;
        }
    }
But this routine fails on an XML file in which the ampersands in the name attribute are not encoded as &amp;, for example:
<process id="702fe4d7-f312-49b9-959e-5cc8a421d38a" name="108_CareAllies_18&23_DSA & HV_ServiceOpsReport_Weekly" xmlns="http://www.blueprism.co.uk/product/process">
I get this error:
"'\"' is an unexpected token. The expected token is ';'. Line X, position Y."
The reported position points at the ampersand. I don't have much experience parsing XML, and I expect I would spend a lot of time writing a routine that replaces these occurrences with their encoded equivalents before calling the routine above.
I am looking for an efficient way to format many large XML files. Is there an easy and fast way to format XML files that contain unescaped special characters like this?
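For illustration, the kind of pre-processing step described above could look something like the sketch below. It is a minimal sketch under stated assumptions, not a vetted solution: the class name XmlAmpersandFixer and the method EscapeBareAmpersands are hypothetical names made up for this example, and the regular expression assumes that the only legitimate uses of & in the input are the five predefined XML entities and numeric character references. It would also rewrite ampersands inside CDATA sections and comments, where a raw & is legal, and it would break references to custom entities declared in a DTD.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper, not part of the original routine.
public static class XmlAmpersandFixer
{
    // Matches an '&' that does NOT begin one of the predefined entities
    // (&amp; &lt; &gt; &quot; &apos;) or a numeric character reference
    // (&#38; or &#x26;). Assumption: no custom DTD entities are in use.
    private static readonly Regex BareAmpersand = new Regex(
        @"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9a-fA-F]+);)",
        RegexOptions.Compiled);

    // Returns the input with every bare '&' replaced by "&amp;".
    // Ampersands that are already part of a valid reference are left alone.
    public static string EscapeBareAmpersands(string xml)
    {
        return BareAmpersand.Replace(xml, "&amp;");
    }
}
```

With a helper like this, the call in FormatXml would become XElement.Parse(XmlAmpersandFixer.EscapeBareAmpersands(modifiedXml)). A single regex pass is linear in the input size, so it should stay cheap for large files, but it treats the document as plain text, so it is only safe when the assumptions above hold for the files being processed.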