I need to read an XML file that does not conform to the XML rules, so I have to fix it before I can parse it as XML. It contains characters such as "&" and "<" inside the element content:
<MAT>
<MATERIAL><MATNR>2286303</MATNR><BESTELTXT>Parts for something & something else</BESTELTXT><WERKS>Material exist out of<1 something</WERKS>
</MAT>
For now I read the file into a string and then do this:
text = Regex.Replace(text, @"\s&\s", " &amp; ");
text = Regex.Replace(text, @"[<]\d+", "&lt;");
After that I write the text back to a file and read that file in as XML.
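As a minimal, self-contained sketch of these steps: it uses an in-memory string instead of a file, and it assumes a `</MATERIAL>` closing tag on the sample so the result can be parsed. The class name, variable names and the use of `XDocument` are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        // Sample input; the closing </MATERIAL> tag is an assumption added for this sketch.
        string text =
            "<MAT>\n" +
            "<MATERIAL><MATNR>2286303</MATNR>" +
            "<BESTELTXT>Parts for something & something else</BESTELTXT>" +
            "<WERKS>Material exist out of<1 something</WERKS></MATERIAL>\n" +
            "</MAT>";

        // Escape a bare ampersand that has whitespace on both sides.
        text = Regex.Replace(text, @"\s&\s", " &amp; ");

        // Escape "<" followed by digits. The digits are part of the match,
        // so they are replaced together with the "<".
        text = Regex.Replace(text, @"[<]\d+", "&lt;");

        // The text is now well-formed and can be parsed.
        XDocument doc = XDocument.Parse(text);
        Console.WriteLine(doc.Root.Element("MATERIAL").Element("WERKS").Value);
        // prints "Material exist out of< something"
    }
}
```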
The problem with the "<" replacement is that it removes the number, which I need to keep. I also don't know whether this performs well. Will it work with very large files? And it only handles these two cases, so what if more cases turn up in the future? Isn't there a general way to convert these predefined entities to their XML form?
PS: I know this should be handled when the XML file is created, but it comes from a third party and they can't change it.