
Hi, I have a string that is not valid XML: "<samplexml> my text with & or < chars </samplexml>"

I want to convert it to valid XML by replacing the special characters in the text, so the result would be:

"<samplexml> my text with &amp ; or &lt ; chars </samplexml>"

Does anyone know of a Java library that already solves this problem?

thx

ChrisH
csviri
  • Possible duplicate of http://stackoverflow.com/q/4283351/668970 and http://stackoverflow.com/q/3438854/668970 – developer May 09 '11 at 14:14

1 Answer


Don't think of it as "not valid XML". Think of it as "not XML". If you're given some input text, you will have to write a parser for it. It's not XML, so you can't use an XML parser; you will have to write your own. Before you can do that, you need to define the syntax of the language that you want to parse. It wouldn't do any harm to do that by taking the grammar of XML and using it as a basis for writing a grammar for the non-XML language that you want to accept.
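
As a rough illustration of that idea, here is a minimal Java sketch under an assumed, deliberately small grammar: a `<` counts as markup only when it begins something shaped like a start or end tag (`<name ...>` or `</name>`), and an `&` counts as a reference only when it looks like `&name;`, `&#123;` or `&#x1F;`. Everything else is treated as text and escaped. The class name `LenientXmlEscaper` and the method `escapeStrayChars` are hypothetical names chosen for this example, not part of any library, and the grammar ignores comments, CDATA sections, processing instructions and attribute values containing `>`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LenientXmlEscaper {

    // Assumed grammar for the "almost XML" input (an illustration, not the XML spec):
    //   tag       := '<' '/'? name-start-char (anything except '<' or '>')* '>'
    //   reference := '&' (name | '#' digits | '#x' hex-digits) ';'
    private static final Pattern MARKUP = Pattern.compile(
            "</?[A-Za-z_][^<>]*>"
            + "|&(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);");

    // Copies recognised tags and references unchanged and escapes
    // every other '&' and '<' found in the text between them.
    static String escapeStrayChars(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = MARKUP.matcher(input);
        int pos = 0;
        while (m.find()) {
            appendEscaped(out, input, pos, m.start()); // text before the markup
            out.append(m.group());                     // markup kept as-is
            pos = m.end();
        }
        appendEscaped(out, input, pos, input.length()); // trailing text
        return out.toString();
    }

    private static void appendEscaped(StringBuilder out, String s, int from, int to) {
        for (int i = from; i < to; i++) {
            char c = s.charAt(i);
            if (c == '&') {
                out.append("&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else {
                out.append(c);
            }
        }
    }

    public static void main(String[] args) {
        String broken = "<samplexml> my text with & or < chars </samplexml>";
        // prints: <samplexml> my text with &amp; or &lt; chars </samplexml>
        System.out.println(escapeStrayChars(broken));
    }
}
```

The heuristic is only as good as the assumed grammar: text such as `a <b and c> d` would be mistaken for a tag, which is exactly why the syntax of the accepted language has to be defined before the repair can be trusted.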

Michael Kay
  • Yes, of course this cannot be parsed as XML; it is a very special case and has nothing to do with "valid" in the classical XML sense. However, it wouldn't be so hard to write a good-enough program that fixes this issue. My question is rather whether someone has already seen one, since sadly we need it very quickly. – csviri May 10 '11 at 11:01