
I have to write a console app that parses an XML document twice. First it extracts some content and replaces every &lt; and &gt; with <; and >;. Why? Because my company uses its own language, which looks like some horror creature, and instead of < > it uses &lt; &gt; for divs and so on. This part is not a problem. Then I have to take the generated file and reverse the process: change < > back to &lt; &gt;. When I try to load the file to reverse the process, I receive an error saying that you can't start a line with ; (0x3B). Is there any way to override it?

This is how an example line looks at the beginning: &lt;img vE="|10416|2|3||" style="padding-top:153px;padding-left:266px;" imageIndex="312" /&gt;

This is the edited one: <;img vE="|10416|2|3||" style="padding-top:153px;padding-left:266px;" imageIndex="312" />;

I need to be able to reverse the process.
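A minimal sketch of the round trip described above, written in Python purely for illustration (the question does not say which language the console app uses). The helper names `decode_line` and `encode_line` are hypothetical. It assumes the entities are replaced with bare `<` and `>` rather than `<;` and `>;`, so the reverse step is a plain string replacement and no XML parser is involved.

```python
# Hypothetical helpers; these names do not come from the original post.
def decode_line(line: str) -> str:
    """Turn the escaped markers into real angle brackets."""
    return line.replace("&lt;", "<").replace("&gt;", ">")


def encode_line(line: str) -> str:
    """Reverse of decode_line: escape the angle brackets again."""
    return line.replace("<", "&lt;").replace(">", "&gt;")


# The sample line from the question.
original = (
    '&lt;img vE="|10416|2|3||" '
    'style="padding-top:153px;padding-left:266px;" imageIndex="312" /&gt;'
)

decoded = decode_line(original)   # starts with "<img", ends with "/>"
restored = encode_line(decoded)   # back to the &lt; ... &gt; form
```

The round trip is only lossless if the original file contains no bare `<` or `>` of its own; any such character would also be escaped on the way back.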

k1t3k
  • You are breaking an XML document just to ask us how to fix it? If they use < and > in XML, they should properly escape them so that it remains well-formed XML – Thomas Weller Aug 29 '22 at 10:21
  • So, after the edit it looks like they do it right (< and > being escaped to &lt; and &gt;) and *you* don't know how to process XML. – Thomas Weller Aug 29 '22 at 10:23
  • `&lt;` is the HTML/XML-encoded form of `<`, not `<;`, just as ` ` is the encoded form of ` `, a single space. – Panagiotis Kanavos Aug 29 '22 at 10:36
  • It's not normal XML. It's an XML doc written in an in-house "language" that is later used on an embedded system. It's pretty ugly: all lines start with &lt; and end with &gt; instead of < >, plus some other things that make it hard to work with. There are also a few lines that have to begin with &lt;, so I can't convert them to start with <, and some variables in the text are variable=false instead of variable="false", so the XML loader crashes on these if I use < > – k1t3k Aug 29 '22 at 10:36
  • The *snippet* you posted is *not* something custom. It's just an XML-encoded string. The duplicate shows how to decode it before parsing. What does the full document look like? Are you sure you aren't seeing one XML document included inside another? – Panagiotis Kanavos Aug 29 '22 at 10:38
  • It starts with a set of tags looking like this ```content``` so nothing special. Then come the lines themselves ```<linecontent>```. There are also sections divided with ``` ``` - where n is the section number. And then tags again at the end ```content``` – k1t3k Aug 29 '22 at 10:42
  • "This how example line looks at the beginning" - is that the first line of the "XML" document? If so, it's not XML. It looks more like escaped HTML. Why don't you discuss that in your company? They should be able to tell you exactly how this data needs to be processed or is processed in other systems. – Thomas Weller Aug 29 '22 at 11:28
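The comments' point that `&lt;` is just the encoded form of `<` can be sketched with standard-library calls instead of a manual replace. This is an illustration in Python under assumptions, not the asker's actual code: it decodes the sample line with `html.unescape`, parses it with `xml.etree.ElementTree`, and re-encodes it with `xml.sax.saxutils.escape`. The `is_well_formed` helper and the `<item ...>` fragments are hypothetical; they only show why a line with an unquoted value such as variable=false makes a parser fail.

```python
import html
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The sample line from the question, in its encoded form.
encoded = (
    '&lt;img vE="|10416|2|3||" '
    'style="padding-top:153px;padding-left:266px;" imageIndex="312" /&gt;'
)

# Decode the entities: "&lt;" becomes "<" and "&gt;" becomes ">".
decoded = html.unescape(encoded)

# This particular decoded line is well-formed XML, so it can be parsed.
element = ET.fromstring(decoded)
image_index = element.get("imageIndex")

# Re-encode for the target system; escape() handles &, < and >.
re_encoded = escape(decoded)


def is_well_formed(fragment: str) -> bool:
    """Hypothetical helper: report whether a fragment parses as XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments: a quoted attribute value parses, an unquoted one does not.
quoted_ok = is_well_formed('<item variable="false" />')
unquoted_ok = is_well_formed('<item variable=false />')
```

Lines that are not well-formed after decoding would need to be handled as plain text, or fixed up first, rather than passed to a parser.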
