
I'm trying to load an XML file that comes from a web request. The response is Base64-encoded, so I have to decode it first.

    XmlDocument ResultXML = new XmlDocument();
    ....
    // encPayload is the string returned by the web request
    byte[] data = Convert.FromBase64String(encPayload);
    string decodedString = Encoding.UTF8.GetString(data);
    ResultXML.LoadXml(decodedString);

The decoded string contains the XML that I want to load, but some values contain illegal characters (e.g. '<', '>'), so I have to remove them before I can call XmlDocument's LoadXml method. The decoded string can reach about 60 to 80 MB, so if I try to use the Replace method I get an OutOfMemoryException. How can I fix this problem? Thanks
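To illustrate the memory side of the problem only: the snippet above holds the Base64 string, the decoded byte[] and the decoded string at the same time, and each Replace call allocates another full-size copy. A minimal sketch of decoding incrementally instead is shown here. It assumes the response can be read as a Stream rather than as one string; `Base64Streaming`, `CopyDecoded` and the `repairChunk` delegate are hypothetical names, not part of any library, and the sketch does not decide which '<' or '>' characters are illegal.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class Base64Streaming
{
    // Wraps a stream of Base64 text in a reader that yields the decoded
    // UTF-8 text incrementally, so the full byte[] and the full decoded
    // string are never allocated at once.
    public static TextReader OpenDecodedReader(Stream base64Stream)
    {
        var decoder = new CryptoStream(
            base64Stream, new FromBase64Transform(), CryptoStreamMode.Read);
        return new StreamReader(decoder, Encoding.UTF8);
    }

    // Copies the decoded text to 'output' in fixed-size chunks, passing each
    // chunk through a caller-supplied repair step (hypothetical).
    // Limitation: a pattern that straddles two chunks is not seen as a whole,
    // so a real repair step would need to carry state between chunks.
    public static void CopyDecoded(
        Stream base64Stream, TextWriter output, Func<string, string> repairChunk)
    {
        using (var reader = OpenDecodedReader(base64Stream))
        {
            var buffer = new char[8192];
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(repairChunk(new string(buffer, 0, read)));
            }
        }
    }
}
```

Writing the output to a file-backed StreamWriter rather than a StringWriter would keep the repaired text out of memory as well; the result could then be opened with XmlReader once it is well-formed.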

Rand Random
  • Is it feasible to change the source to simply give you valid XML to start with? In general, I don't trust "replacements" to sanitize XML - if they're producing invalid XML in one way today, they could be producing invalid XML in a different way tomorrow. – Jon Skeet Nov 21 '17 at 13:07
  • No, I can't change the source because it is a remote server response. –  Nov 21 '17 at 13:20
  • Your root problem is how to parse bad "XML". Even if your decoded data were small enough to do lexical search-and-replace, you still wouldn't know how to differentiate between a `<` in text and a `<` that starts markup without parsing. Therefore, I'm closing this as a duplicate of [**How to parse invalid (bad / not well-formed) XML?**](https://stackoverflow.com/q/44765194/290085) – kjhughes Nov 21 '17 at 13:37
  • My point is that you should get in touch with whatever organization is producing the server response, and let them know that it's invalid, assuming it's meant to be correct XML. – Jon Skeet Nov 21 '17 at 13:38

0 Answers