Suppose I encode a Java character array (char[]) instance as bytes:
- using two bytes for each character
- using big-endian order (storing the most significant 8 bits in the first byte and the least significant 8 bits in the second byte of each pair)
Would this always create a valid UTF-16BE encoding? If not, which code points would result in an invalid encoding?
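For concreteness, here is a minimal sketch of the encoding I have in mind (the class name and the comparison against String.getBytes(StandardCharsets.UTF_16BE) are only for illustration, not part of the question itself):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharArrayEncoding {

    // Manual encoding: two bytes per char, most significant byte first (big endian).
    static byte[] encode(char[] chars) {
        byte[] bytes = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            bytes[2 * i]     = (byte) (chars[i] >>> 8); // high 8 bits -> first byte
            bytes[2 * i + 1] = (byte) chars[i];         // low 8 bits  -> second byte
        }
        return bytes;
    }

    public static void main(String[] args) {
        // Includes a surrogate pair (U+1F600) encoded as two chars.
        char[] chars = "héllo \uD83D\uDE00".toCharArray();

        byte[] manual = encode(chars);
        byte[] viaCharset = new String(chars).getBytes(StandardCharsets.UTF_16BE);

        // Prints true for this well-formed input; the question is whether
        // the two can ever differ, or whether the manual bytes can be invalid.
        System.out.println(Arrays.equals(manual, viaCharset));
    }
}
```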
This question is very much related to this question about the Java char type and this question about the internal representation of Java strings.