
Adding text that contains the Unicode character "\u000b" causes an error when you open the saved docx.

I am using docx4j-JAXB-Internal 8.1.6. The same issue occurs with "\u000c", so I assume it happens with all control characters. At the moment I am simply removing them with:

        String removeControlCharacters = this.content.replaceAll("\\p{Cc}", "");

and then adding the result to a Text element, which works fine.
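One caveat with that pattern: `\p{Cc}` also matches tab, line feed, and carriage return, which are legal in XML 1.0, so it strips more than necessary. A narrower filter could keep only the code points allowed by the XML 1.0 `Char` production. The following is a minimal sketch of that idea, not something docx4j provides; the class name `XmlTextSanitizer` and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical helper (not part of docx4j): keeps only code points that
// the XML 1.0 "Char" production allows, and drops everything else.
final class XmlTextSanitizer {

    private XmlTextSanitizer() {}

    // XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with all disallowed code points removed.
    // Tab, line feed, and carriage return are kept, unlike with \p{Cc}.
    static String stripInvalidXmlChars(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(XmlTextSanitizer::isValidXmlChar)
             .forEach(out::appendCodePoint);
        return out.toString();
    }
}
```

With this in place, the call above would become `XmlTextSanitizer.stripInvalidXmlChars(this.content)`. Whether tabs and line breaks then render as intended inside a Text element is a separate question that this sketch does not address.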

I am wondering if anyone else has had this issue and what the best way to resolve it is.

bcgilmartin
  • U+000B and U+000C are two of several characters that can't appear in XML, even if escaped. See: https://stackoverflow.com/questions/730133/what-are-invalid-characters-in-xml ; see also: https://www.w3.org/TR/xml/#dt-charref. – Peter O. Dec 15 '20 at 22:33
  • open the file with notepad and set the encoding to UTF without BOM – aran Dec 15 '20 at 22:39

0 Answers