(NB: the original question title was: What is the best way to load XML from a string with a document specification?)
I need to load the XML content of an ODT (OpenDocument, LibreOffice) file into an XmlDocument object. The ODT is a zip archive, and I managed to extract the content.xml part as a byte array. Converting it to a string seemed simple, but I was surprised to find that XmlDocument.LoadXml(string) does not accept the resulting string, which starts with an XML declaration line, like:
<?xml version="1.0" encoding="UTF-8"?>
<Offices id="0" enabled="false">
    <office />
</Offices>
The exception is: Data at the root level is invalid. Line 1, position 1.
Is there a library call that can read such a string?
For now I use the function below, which I improvised, but it feels unnecessarily complex to work at the character level when handling XML documents:
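using System;
using System.Xml;

/// <summary>
/// Converts an XML document held in a string, including its XML declaration line,
/// to an XmlDocument object.
/// </summary>
/// <param name="xmlString">The XML text, optionally starting with an XML declaration.</param>
/// <returns>The parsed document.</returns>
public static XmlDocument LoadXmlString(string xmlString)
{
    // Find the end of the first "?>" (the declaration comes first in the text).
    // If there is none, the whole string is parsed unchanged.
    int end = xmlString.IndexOf("?>", StringComparison.Ordinal);
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.LoadXml(end < 0 ? xmlString : xmlString.Substring(end + 2));
    return xmlDoc;
}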
Is there a better way?
NB: I refer to this earlier question
but that question addresses the problem of parsing a string, and its solution is to convert the string to a byte array. In my case I should not be parsing a string at all: rather than converting the byte array to a string to begin with, I can skip that step and parse the byte array directly after unzipping the ODT.
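A minimal sketch of that direct approach, assuming the unzipped content.xml bytes are already in a byte array. The class and method names (OdtXml, LoadXmlBytes) are hypothetical and only for illustration. XmlDocument.Load(Stream) works on the raw bytes and determines the encoding itself, so no string conversion is involved:

```csharp
using System.IO;
using System.Xml;

public static class OdtXml
{
    // Hypothetical helper: parse raw XML bytes (e.g. the unzipped content.xml)
    // without converting them to a string first.
    public static XmlDocument LoadXmlBytes(byte[] xmlBytes)
    {
        var doc = new XmlDocument();
        using (var stream = new MemoryStream(xmlBytes))
        {
            // The parser reads the bytes directly and detects the encoding
            // from a byte order mark or the XML declaration.
            doc.Load(stream);
        }
        return doc;
    }
}
```

It would be called as OdtXml.LoadXmlBytes(bytes), where bytes is whatever array the unzip step produced. The XML declaration is then handled by the parser, so no character-level trimming should be needed.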