
I'm taking over a previous developer's project. The code builds a large XML document a little at a time, appending new XML to the same variable as it goes along. The problem is that the data sometimes contains characters that are not valid in XML unless they are escaped.

For example:

$xml .= "<name>Jack & Jill</name>";
$xml .= "<street>123 Mary's Lake Road</street>";

I'm looking for a way to clean up the XML data all at once, like this:

$xml = safeXML($xml);

It seems like building such a function ought to be simple, but so far everything I've found either requires valid XML to start with, or else goofs up the XML tags themselves.

(I know it would be better to format the XML properly before building the document, but this is complex code and as usual the client wants it done yesterday. ;) )

Matthew
  • How do you determine what's the correct usage of `&` or `<`, etc. inside your XML strings? Since you're not providing the context (i.e. escaping that which should be escaped), I'm not sure what rules you want `safeXML` to apply to determine what should be escaped and what should not be escaped. – MatsLindh May 15 '20 at 19:01
  • All special characters that are not actually part of XML tags should be converted. In my example code, the `&` and `'` should be converted to `&amp;` and `&apos;` respectively. – Matthew May 15 '20 at 19:11
  • Does this answer your question? [PHP - Processing Invalid XML](https://stackoverflow.com/questions/2890120/php-processing-invalid-xml) – Progman May 15 '20 at 20:25
  • Also see https://stackoverflow.com/questions/44765194/how-to-parse-invalid-bad-not-well-formed-xml, but the real solution is to fix the problem at the source. – Progman May 15 '20 at 20:25
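
For illustration, here is a rough sketch of the kind of one-pass cleanup the question asks for. It is a heuristic, not a parser, and it rests on assumptions that are not in the original post: anything shaped like `<name ...>`, `</name>`, `<name/>`, a comment, a CDATA section, or a processing instruction is treated as markup and left untouched, and everything else is treated as text and escaped. The name `safeXML` comes from the question; the regular expressions are only one possible choice. Data that itself looks like a tag (for example a literal `<b>` inside a name) will be mistaken for markup, and bare ampersands inside attribute values are not fixed.

```php
<?php
// Hypothetical helper: a best-effort escape of text between tags.
// Assumption: real markup always matches $markup below; anything else is data.
function safeXML(string $xml): string
{
    // Processing instructions, comments, CDATA sections, and element tags.
    $markup = '~(<\?.*?\?>|<!--.*?-->|<!\[CDATA\[.*?\]\]>|</?[A-Za-z_][\w.:-]*(?:\s[^<>]*)?/?>)~s';

    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // even indexes are text, odd indexes are the captured markup.
    $parts = preg_split($markup, $xml, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            continue; // leave markup as it is
        }
        // Escape ampersands that do not already start a predefined or numeric entity.
        $part = preg_replace(
            '~&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9A-Fa-f]+);)~',
            '&amp;',
            $part
        );
        // Escape the remaining special characters in text.
        $parts[$i] = strtr($part, [
            '<'  => '&lt;',
            '>'  => '&gt;',
            "'"  => '&apos;',
            '"'  => '&quot;',
        ]);
    }

    return implode('', $parts);
}
```

Because existing entities are skipped, running the function twice on the same string should not double-escape it. As the comments above point out, the more reliable fix is still to escape each value at the point where it is appended, for example with `htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8')`.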
