I have a Java program that processes XML files in S1000D format, which is used for technical documentation. I need to update some metadata in these files, and I am using Saxon to do so.
But Saxon changes more than what my XSL stylesheet asks for:
- It collapses empty elements into self-closing tags.
- It resolves the HTML entities contained in the file.
Here is an extract of one of my input files:
<dmodule xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.s1000d.org/S1000D_4-1/xml_schema_flat/schedul.xsd">
...
<reqSpares>
<noSpares></noSpares>
</reqSpares>
<reqSafety>
<noSafety></noSafety>
</reqSafety>
...
<timeLimit>
<remarks>
<simplePara>Lorem ipsum</simplePara>
<simplePara>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Vestibulum pulvinar sapien at lacus lacinia,
eu maximus arcu vestibulum.</simplePara>
</remarks>
</timeLimit>
...
And here is the result of my transformation:
<dmodule xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.s1000d.org/S1000D_4-1/xml_schema_flat/schedul.xsd">
...
<reqSpares>
<noSpares/>
</reqSpares>
<reqSafety>
<noSafety/>
</reqSafety>
...
<timeLimit>
<remarks>
<simplePara>Lorem ipsum</simplePara>
<simplePara>Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Vestibulum pulvinar sapien at lacus lacinia,
eu maximus arcu vestibulum.</simplePara>
</remarks>
</timeLimit>
...
Even though my XSL does not touch those lines, they are still changed as shown.
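To illustrate why I think this happens before my stylesheet is even involved, here is a minimal sketch using only the JDK's built-in DOM parser (not Saxon, so this is an assumption about where the information is lost, not a statement about Saxon internals). The class name `ParserModelDemo`, its helper methods, and the `caf&#233;` sample text are made up for the illustration. It shows that, once parsed, `<noSpares></noSpares>` and `<noSpares/>` are indistinguishable and that a character reference is already resolved to the character itself.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ParserModelDemo {

    // Parses an XML string into a DOM document with the JDK's default parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the number of child nodes of the document element.
    static int childCount(String xml) throws Exception {
        return parse(xml).getDocumentElement().getChildNodes().getLength();
    }

    // Returns the text content of the document element as the parser reports it.
    static String text(String xml) throws Exception {
        return parse(xml).getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Both spellings of an empty element yield an element with no children.
        System.out.println(childCount("<noSpares></noSpares>") == childCount("<noSpares/>"));
        // The reference &#233; has already become the character U+00E9 after parsing.
        System.out.println(text("<simplePara>caf&#233;</simplePara>").equals("caf\u00e9"));
    }
}
```

If this is representative, the serializer has no way to know how the original file spelled the empty element or the character, which would explain why the untouched lines still change.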
My constraint is that I am not allowed, for any reason, to alter the structure or the content of the XML beyond my intended changes, which is exactly what happens in this example. The service that provides the input will not modify it either: it does not want to add entity declarations at the start of the XML file or to wrap the HTML entities in CDATA sections.
In Saxon, we have tried:
- changing the output encoding to US-ASCII;
- the replace and translate functions, but since the affected text is not in the nodes we transform, this does not work;
- disable-output-escaping, but as above, the affected text is not produced by our XSL templates.
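For reference, the US-ASCII attempt can be sketched roughly as follows. This uses the standard JAXP API with whichever `TransformerFactory` is on the classpath and an identity transform instead of our real stylesheet, so it is an approximation of our setup and not our actual code. The class name `AsciiOutputDemo` and the sample input are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AsciiOutputDemo {

    // Runs an identity transform and serializes the result as US-ASCII.
    static String identity(String xml) throws Exception {
        // newTransformer() without a stylesheet gives an identity transformer.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString("US-ASCII");
    }

    public static void main(String[] args) throws Exception {
        // Input contains the already-resolved character U+00E9.
        System.out.println(identity("<simplePara>caf\u00e9</simplePara>"));
    }
}
```

With this setting, characters outside ASCII have to be written as character references, but in our tests they do not come back as the named entities from the input, so the output still differs from the source.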
I have also looked into BaseX, but the problem is the same, and I do not know that library well enough to tell whether the behavior I need can be achieved.
Any help would be appreciated!