0

I have input XML in the following format:

<Some_tag>
   <childTag>this is >25000</childTag>
</Some_tag>

The actual XML is large (over 200 KB). I am reading it in Java and receive it as a String.

How can I remove those `>` characters (special/escape characters) from the tag values?

CodeDCode
  • The above is invalid XML. `>` should be escaped to `&gt;`. And any XML parser will transform back the escaped `&gt;` to `>`, if that's your question. – JB Nizet Feb 04 '14 at 16:58
  • What exactly is the problem? That's well-formed XML as it stands (`<` must be escaped in character content but `>` is legal almost anywhere). – Ian Roberts Feb 04 '14 at 16:58
  • [documentation on >](http://www.w3.org/TR/REC-xml/#syntax) – McDowell Feb 04 '14 at 16:59
  • OK, let me state my problem again. End users manually update and upload this XML, and in some tags they add those special characters. I am reading this XML using Java, so I need a way to translate those `<` and `>` characters to `&lt;` and `&gt;`. I converted all the other elements with a regex replace, but did not find a solution for `<` and `>`. – CodeDCode Feb 04 '14 at 20:27
  • Thanks, I stand corrected. – JB Nizet Feb 04 '14 at 22:49
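
To illustrate the point made in the comments, here is a minimal sketch (not taken from the thread) that parses the sample with the JDK's built-in DOM parser. The class name `GtParseDemo` and the helper `childText` are hypothetical names chosen for this example. It assumes the input is otherwise well-formed; a raw `<` inside a value would still make the parser reject the document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class GtParseDemo {

    // Hypothetical helper: parse the XML string and return the text of the first <childTag>.
    static String childText(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("childTag").item(0).getTextContent();
        } catch (Exception e) {
            // Parser configuration, I/O, or well-formedness problems all end up here.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String raw = "<Some_tag>\n   <childTag>this is >25000</childTag>\n</Some_tag>";
        String escaped = "<Some_tag>\n   <childTag>this is &gt;25000</childTag>\n</Some_tag>";

        // Both forms are accepted and yield the same character data.
        System.out.println(childText(raw));
        System.out.println(childText(escaped));
    }
}
```

Both calls should print the same value, because the parser accepts a literal `>` in character content and decodes `&gt;` to `>`.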

2 Answers

0

Use XSLT; try something like this:

XSLT to remove chars

Or use JAXB: map the whole XML to beans and remove the unneeded characters from the String properties.
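
As a rough sketch of the XSLT route (this is not the stylesheet from the link above, whose content is not reproduced here), the following runs an identity transform through the JDK's `javax.xml.transform` API and uses the XPath `translate()` function to drop every `>` from text nodes. The class name `StripGtWithXslt`, the method `stripGt`, and the embedded stylesheet are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StripGtWithXslt {

    // Identity transform plus a higher-priority template that rewrites text nodes.
    private static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        // translate(., '>', '') deletes every '>' from the text node's value.
        + "<xsl:template match='text()' priority='1'>"
        + "<xsl:value-of select=\"translate(., '&gt;', '')\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical helper: returns the input XML with '>' removed from element text.
    static String stripGt(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Some_tag><childTag>this is >25000</childTag></Some_tag>";
        System.out.println(stripGt(xml));
    }
}
```

Only text nodes are touched; element names and attribute values pass through the identity template unchanged. The explicit `priority='1'` avoids a conflict between the `text()` and `node()` patterns, which otherwise have the same default priority.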

Koitoer
  • Let me test the XSLT way. The JAXB way is out of reach for the app I am working on: it is all legacy, and the complex schema failed several times when I tried to translate it to JAXB. – CodeDCode Feb 04 '14 at 20:24
0

The XML should look like this:

&lt;Some_tag&gt;
   &lt;childTag&gt;this is &gt;25000&lt;/childTag&gt;
&lt;/Some_tag&gt;

This is easy to parse: your parser will convert `&lt;` to `<` and `&gt;` to `>`.
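
If the goal is to have the special characters escaped consistently before further processing, one option (a sketch under assumptions, not something proposed in the thread) is to parse the uploaded XML and write it back out with the JDK's identity `Transformer`, leaving the escaping to the serializer instead of a regex. The class name `NormalizeXmlEscapes` and the method `reserialize` are hypothetical. This only works when the upload is well-formed, so a raw `<` inside a value would still be rejected.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NormalizeXmlEscapes {

    // Hypothetical helper: parse the XML and serialize it again with an identity transform.
    static String reserialize(String xml) {
        try {
            // newTransformer() without a stylesheet copies the source to the result.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String uploaded = "<Some_tag><childTag>this is >25000</childTag></Some_tag>";
        // Print the re-serialized form to see how the serializer writes the '>' in the value.
        System.out.println(reserialize(uploaded));
    }
}
```

Whether the serializer writes `>` or `&gt;` in the output, the result is well-formed and parses back to the same text, which is usually what downstream code needs.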
Girish