I have a large (2+ GB) XML file whose contents look like this:

<rootNode>
   <content>
      &lt;node&gt;test&lt;/node&gt;
   </content>
</rootNode>

The content node contains the entire XML structure I need, but as an entity-encoded XML string.

When I loop through this with an XmlReader, "&lt;node&gt;test&lt;/node&gt;" comes back as a single string, not as the individual XML elements that I want to iterate over.

Is there an efficient way of decoding the node, so that I only have to iterate through the file once?

Thanks, brian

B.McCarthy

1 Answer

Correct, you will have to iterate through the file at least once to decode the file.

I recommend HtmlDecode.

Lucas B
  • If I use HtmlDecode, I have to hold the entire encoded string in memory, and it could be very large. If I iterate through the file in pieces instead, is there a way to ensure I don't try to decode "&lt;test&g" and miss the "t;"? – B.McCarthy Jan 16 '13 at 22:15
  • +1. @B.McCarthy, unfortunately there is no way to ensure that you (or anyone) write code without bugs, if that is what you are asking. Otherwise, check the documentation on how the encoding works, read each character, and decide whether it is part of an encoded value or should be passed through as is. Optimize later if needed. – Alexei Levenkov Jan 16 '13 at 22:23
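
One way to avoid both problems raised in the comments (holding the whole string in memory, and cutting an entity such as "&lt;" in half) is to let the outer streaming parser do the decoding and pass its decoded text straight into a second, incremental parser. The sketch below is not .NET code: it uses Python's standard library (`xml.parsers.expat` for the outer file, `XMLPullParser` for the inner document) purely to illustrate the idea. The function name `iter_inner_elements`, the `wrapper_tag` and `chunk_size` parameters, and the assumption that the encoded document has no XML declaration after leading whitespace are all mine, not from the question. In .NET, `XmlReader.ReadValueChunk` looks like the analogous hook for reading a large text node in decoded pieces, but treat that mapping as an assumption to check against the documentation.

```python
import io
import xml.parsers.expat
from xml.etree.ElementTree import XMLPullParser


def iter_inner_elements(stream, wrapper_tag="content", chunk_size=64 * 1024):
    """Yield elements of the entity-encoded inner document in a single pass.

    stream is a binary file-like object holding the outer XML file.
    wrapper_tag is the outer element whose text is the encoded inner XML.
    """
    inner = XMLPullParser(events=("end",))  # incremental parser for the inner XML
    outer = xml.parsers.expat.ParserCreate()
    outer.buffer_text = True  # deliver text in larger pieces, fewer callbacks
    state = {"inside": False, "fed": False}

    def on_start(name, attrs):
        if name == wrapper_tag:
            state["inside"] = True

    def on_end(name):
        if name == wrapper_tag:
            state["inside"] = False

    def on_text(data):
        # The outer parser has already turned "&lt;" into "<" here, and it
        # never hands over half an entity, even when one spans two chunks.
        if state["inside"]:
            inner.feed(data)
            state["fed"] = True

    outer.StartElementHandler = on_start
    outer.EndElementHandler = on_end
    outer.CharacterDataHandler = on_text

    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        outer.Parse(chunk, False)
        # Hand out whatever inner elements are complete so far.
        for _event, elem in inner.read_events():
            yield elem

    outer.Parse(b"", True)  # tell the outer parser the input has ended
    if state["fed"]:
        inner.close()  # flush anything the inner parser was still holding
    for _event, elem in inner.read_events():
        yield elem


if __name__ == "__main__":
    sample = (
        b"<rootNode>\n   <content>\n"
        b"      &lt;node&gt;test&lt;/node&gt;\n"
        b"   </content>\n</rootNode>\n"
    )
    # A tiny chunk size forces entities to straddle chunk boundaries.
    for elem in iter_inner_elements(io.BytesIO(sample), chunk_size=7):
        print(elem.tag, elem.text)
```

For a multi-gigabyte file, the caller would probably also want to call `elem.clear()` on each element once it has been processed, since the inner parser otherwise keeps building the full tree in memory.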