
We want to send emails in which the customer name comes from a third-party system. Those names now and then contain non-English characters. For example:

We build the email request body as XML using Java. Can someone please explain what type of characters these are and what encoding is required to handle them? If we use the `StringEscapeUtils.escapeXml(string)` method, the request XML becomes unparsable.

Sajaru
    What do you mean? The output (with commons-text-1.8) parses correctly by `xmllint` when inserted into an XML. – choroba Jul 22 '20 at 15:28

1 Answer


Those are some of the 143,859 characters defined in Unicode.

The 𝓙, for example, is a "mathematical bold script capital J": 0x1D4D9 in hex, 120025 in decimal, and `&#x1D4D9;` as an XML numeric character reference.
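As a rough sketch of how this looks from Java: the snippet below inspects that code point and rewrites non-ASCII characters as numeric character references, so that the XML text stays plain ASCII. The class name `CodePointDemo` and the helper `toNumericReferences` are hypothetical names made up for this illustration, not part of any library, and the helper deliberately does not escape markup characters such as `&` or `<`.

```java
import java.util.Locale;

// Hypothetical demo class; the names here are illustrative only.
final class CodePointDemo {

    // Replaces every non-ASCII code point with an XML numeric character
    // reference. Markup characters (&, <, >, quotes) are NOT escaped here;
    // that would still have to be done separately.
    static String toNumericReferences(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // Iterate by code point, not by char: characters above U+FFFF
        // occupy two chars (a surrogate pair) in a Java String.
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase(Locale.ROOT))
                   .append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Build the character from its code point instead of pasting it in.
        String j = new String(Character.toChars(0x1D4D9));
        int cp = j.codePointAt(0);

        System.out.println(j.length());                    // 2 (UTF-16 chars)
        System.out.println(cp);                            // 120025
        System.out.println(Character.getName(cp));         // MATHEMATICAL BOLD SCRIPT CAPITAL J
        System.out.println(toNumericReferences("J" + j));  // J&#x1D4D9;
    }
}
```

An alternative that avoids references altogether is to declare `encoding="UTF-8"` in the XML declaration and make sure the bytes are actually written as UTF-8; which of the two fits better depends on what the receiving system accepts.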

For details on handling Unicode in Java, see

Basil Bourque
kjhughes
    And more generally: [*The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)*](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/) by Joel Spolsky. – Basil Bourque Jul 22 '20 at 20:08