Whenever we pass a message in Kannada, a particular character, <U+200C>, gets added to the message, which is eventually sent via SMS. We are using a UTF-8 encoder and decoder. When testing with a raw string, the assertion passes: rawString and decodedString have the same value.

Actual Value :

ಕ್<U+200C>ಬ

Expected Value :

ಕ್‌ಬ
  • "The zero-width non-joiner (ZWNJ) is a non-printing character used in the computerization of writing systems that make use of ligatures. When placed between two characters that would otherwise be connected into a ligature, a ZWNJ causes them to be printed in their final and initial forms, respectively. This is also an effect of a space character, but a ZWNJ is used when it is desirable to keep the words closer together or to connect a word with its morpheme." https://unicode-explorer.com/c/200C – g00se Mar 08 '22 at 10:09
  • s = s.replaceAll("\u200c", ""); if all else fails – g00se Mar 08 '22 at 10:34
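
The suggestion in the comment above could be sketched roughly as follows, assuming the code is Java (the `replaceAll` call suggests so). The class name `ZwnjDemo` and the helpers `stripZwnj` and `escapeNonAscii` are hypothetical, made up for illustration; they are not from the original post. The sketch also shows that a UTF-8 encode/decode round trip keeps U+200C intact, which would be consistent with rawString and decodedString comparing equal.

```java
import java.nio.charset.StandardCharsets;

public class ZwnjDemo {
    // U+200C ZERO WIDTH NON-JOINER
    static final String ZWNJ = "\u200C";

    // Hypothetical helper: removes every ZWNJ from the string.
    // String.replace does a literal replacement, so no regex is needed.
    static String stripZwnj(String s) {
        return s.replace(ZWNJ, "");
    }

    // Hypothetical helper: prints non-ASCII code points as <U+XXXX>
    // so that invisible characters become visible in logs.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                sb.appendCodePoint(cp);
            } else {
                sb.append(String.format("<U+%04X>", cp));
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // KA (U+0C95) + VIRAMA (U+0CCD) + ZWNJ (U+200C) + BA (U+0CAC)
        String raw = "\u0C95\u0CCD\u200C\u0CAC";

        // A UTF-8 round trip does not add or drop the ZWNJ.
        byte[] bytes = raw.getBytes(StandardCharsets.UTF_8);
        String decoded = new String(bytes, StandardCharsets.UTF_8);

        System.out.println(raw.equals(decoded));
        System.out.println(escapeNonAscii(decoded));
        System.out.println(escapeNonAscii(stripZwnj(decoded)));
    }
}
```

Note that in Kannada a ZWNJ placed after the virama prevents the two consonants from forming a conjunct, so stripping it may change how the text is rendered on the receiving device. Removing it is probably only appropriate if the SMS channel cannot carry the character at all.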
