I'm looking for a way to beautify incomplete XML documents. Ideally it should also handle large documents (e.g. 10 MB or even 100 MB).
Incomplete means that the document is truncated at an arbitrary position; up to that position the XML is syntactically valid. Beautify means adding line breaks and leading spaces (indentation) between the tags.
I need this to analyse aborted streams. Without line breaks and indentation the content is really hard for a human to read. I know there are editors that can beautify incomplete documents, but I want to integrate the beautifier into my own analysis tool.
Unfortunately I didn't find a discussion or solution for this case.
The NuGet package GuiLabs.Language.Xml by Kirill Osenkov (repository XmlParser) seems to be a useful candidate for building my own beautifier, because it is designed to be error tolerant. Unfortunately there is too little documentation for me to understand how to use this parser.
Example XML:
<?xml encoding="UTF-8"?><X><B><C>aa</C><B/><A.B><X>bb</X></A.B><A p="pp"/><nn:A>cc</nn:A><D><E>eee</
Expected result as string:
<?xml encoding="UTF-8"?>
<X>
  <B>
    <C>aa</C>
    <B/>
    <A.B>
      <X>bb</X>
    </A.B>
    <A p="pp"/>
    <nn:A>cc</nn:A>
    <D>
      <E>eee</
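
To make the expected behaviour concrete, here is a rough, language-neutral sketch in Python. It does not use GuiLabs.Language.Xml and is not meant as a finished solution. The function name `beautify`, the `indent` parameter and the token regex are my own assumptions for illustration. It assumes that attribute values contain no `>` and that there are no comments or CDATA sections with embedded markup, and it strips whitespace around text nodes.

```python
import re

# Assumption: a token is a complete tag, a truncated tag at the very end
# of the input, or a run of text between tags.
TOKEN = re.compile(r"<[^>]*>|<[^>]*$|[^<]+")


def beautify(xml: str, indent: str = "  ") -> str:
    """Indent a possibly truncated XML string without validating it."""
    lines = []
    depth = 0
    inline = False  # True while the last line is "<tag>" or "<tag>text"

    for match in TOKEN.finditer(xml):
        tok = match.group()

        if not tok.startswith("<"):
            text = tok.strip()
            if not text:
                continue  # ignore whitespace between tags
            if inline:
                lines[-1] += text  # keep "<C>aa" on one line
            else:
                lines.append(indent * depth + text)
            continue

        if tok.startswith("</"):
            # Closing tag (possibly truncated, e.g. "</" at end of input).
            depth = max(depth - 1, 0)
            if inline:
                lines[-1] += tok  # "<C>aa</C>" stays on one line
            else:
                lines.append(indent * depth + tok)
            inline = False
        elif tok.startswith(("<?", "<!")) or tok.endswith("/>"):
            # Declaration, comment/doctype or self-closing tag: no nesting.
            lines.append(indent * depth + tok)
            inline = False
        else:
            # Opening tag (or a truncated one at the end of the input).
            lines.append(indent * depth + tok)
            depth += 1
            inline = True

    return "\n".join(lines)
```

Applied to the example XML above, this sketch produces the line breaks of the expected result, with two spaces per nesting level. Because it only scans tokens and keeps a depth counter, it never needs the document to be complete; for 100 MB inputs the same loop would have to read from a stream instead of a single string.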