UPDATE: Removed unnecessary DOM loading.
Use the XML transformer. It knows how to XML escape characters that are not supported by the given encoding.
Example
Transformer transformer = TransformerFactory.newInstance().newTransformer();
// Convert XML file to UTF-8 encoding
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.transform(new StreamSource(new File("test.xml")),
new StreamResult(new File("test-utf8.xml")));
// Convert XML file to ISO-8859-1 encoding
transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
transformer.transform(new StreamSource(new File("test.xml")),
new StreamResult(new File("test-8859-1.xml")));
test.xml (input, UTF-8)
<?xml version="1.0" encoding="UTF-8"?>
<test>
<english>Hello World</english>
<portuguese>Olá Mundo</portuguese>
<czech>Ahoj světe</czech>
<russian>Привет мир</russian>
<chinese>你好,世界</chinese>
<emoji> </emoji>
</test>
Translated by https://translate.google.com (except emoji)
test-utf8.xml (output, UTF-8)
<?xml version="1.0" encoding="UTF-8"?><test>
<english>Hello World</english>
<portuguese>Olá Mundo</portuguese>
<czech>Ahoj světe</czech>
<russian>Привет мир</russian>
<chinese>你好,世界</chinese>
<emoji>👋 🌎</emoji>
</test>
test-8859-1.xml (output, ISO-8859-1)
<?xml version="1.0" encoding="ISO-8859-1"?><test>
<english>Hello World</english>
<portuguese>Olá Mundo</portuguese>
<czech>Ahoj světe</czech>
<russian>Привет мир</russian>
<chinese>你好,世界</chinese>
<emoji>👋 🌎</emoji>
</test>
If you replace the test.xml
with the test-8859-1.xml
file (copy/paste/rename), you still get the same outputs, since the parser both auto-detects the encoding and unescapes all the escaped characters.