Update: The context is MuleSoft. Could any libraries be used to solve scenarios like this?
I have an unusual requirement: I need to accept malformed ('incorrect') XML in an API implementation and correctly escape any XML special characters that appear unescaped where they are not allowed, i.e. in attribute values or in element text. They can occur anywhere in the document.
The escaping is needed first to prevent APIkit/schema validation errors, and then so that later DataWeave (DW) transforms, which expect valid XML, do not fail.
I have tried to portray a simple example below:
<CARS>
<CAR>
<MODEL ALIAS="City & Co">alpha city</MODEL>
<YEAR>1992</YEAR>
<MANAFACTURER>Penguin</MANAFACTURER>
<OTHER>Made in UK & US</OTHER>
</CAR>
<CAR>
<MODEL ALIAS="City & Co" MAKE="BMW">venturi city</MODEL>
<YEAR>1994</YEAR>
<MANAFACTURER>Penguin</MANAFACTURER>
<OTHER>BHP > 1000</OTHER>
</CAR>
</CARS>
Is there an easy way, in DW or with an external library, to parse XML like this and correctly escape special characters such as &, < and >?
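To make the ask concrete, here is a minimal Java sketch of the kind of pre-processing I have in mind, run on the raw payload before any validation. It is a regex heuristic, not a parser, and the class and method names (XmlSanitizer, sanitize) are hypothetical. It assumes the payload has no CDATA sections or comments containing literal & characters, and it only catches a stray < that is not immediately followed by something that could start a tag. Note that a bare > in element text (as in the BHP example above) is already legal XML, so the sketch leaves it alone.

```java
import java.util.regex.Pattern;

// Hypothetical helper: heuristic clean-up of almost-XML text before parsing.
class XmlSanitizer {
    // An '&' that does NOT begin a named (&amp;) or numeric (&#38; / &#x26;) reference.
    private static final Pattern BARE_AMP = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // A '<' that is NOT followed by a character that can start a tag,
    // closing tag, comment/CDATA/DOCTYPE, or processing instruction.
    private static final Pattern STRAY_LT = Pattern.compile("<(?![A-Za-z_:/!?])");

    static String sanitize(String xml) {
        // Escape ampersands first so the '&' in the inserted "&lt;" is never re-escaped.
        String out = BARE_AMP.matcher(xml).replaceAll("&amp;");
        return STRAY_LT.matcher(out).replaceAll("&lt;");
    }
}
```

With this, XmlSanitizer.sanitize applied to the MODEL line above would turn ALIAS="City & Co" into ALIAS="City &amp; Co" and leave already-escaped references untouched. Something like this could presumably be called from a flow through the Java module, but I would prefer a more robust, parser-aware option if one exists.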