I am developing a Java application that receives a value of type char[] from an external C++ DLL.
In some cases the input is expected to contain non-ASCII characters.
In such a case, everything works when I construct a String by passing it only a byte[] that was converted from the hex-string representation of the input value.
However, I run into a problem when I construct a String from a char[] that is filled in a for-loop by casting each byte to char, one by one.
In the example below, a char[] variable is obtained from the aforementioned DLL. The input is a string with the value "çap", which arrives as the hex string C3A76170.
// the StringUtil.toByteArray function converts hex-string to a byte array
byte[] byteArray = StringUtil.toByteArray("C3A76170");
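For readers who want to reproduce this without my StringUtil class, here is a minimal sketch of what such a hex-to-bytes helper might look like. The class name HexSketch and its implementation are hypothetical stand-ins, not the actual StringUtil code, and the sketch assumes the input contains only hex digits.

```java
// Hypothetical stand-in for StringUtil.toByteArray; the real helper may differ.
class HexSketch {
    static byte[] toByteArray(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Parse two hex digits at a time; the int result (0..255) is narrowed to a signed byte.
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] byteArray = toByteArray("C3A76170");
        System.out.println(java.util.Arrays.toString(byteArray)); // [-61, -89, 97, 112]
    }
}
```

Note that the first two bytes print as negative numbers, because byte is a signed type in Java.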
The example below yields the expected result:
String s1 = new String(byteArray);
// print
System.out.println(s1);
çap
The example below does not yield the expected result:
char[] chars = new char[byteArray.length];
for (int i = 0; i < chars.length; i++) {
    chars[i] = (char) byteArray[i];
}
String s2 = new String(chars);
// print
System.out.println(s2);
ᅢᄃap
In the second example, the output is "ᅢᄃap" (where the character "ç" is apparently misinterpreted as different characters).
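To take the console encoding out of the picture, here is a self-contained sketch of both cases that prints the numeric UTF-16 code units instead of the glyphs. The class and method names (CharCastRepro, codeUnits, viaBytes, viaCharCast) are made up for this illustration, and the sketch assumes the bytes are UTF-8, which it decodes explicitly rather than relying on the platform default charset as my original code does.

```java
import java.nio.charset.StandardCharsets;

class CharCastRepro {
    // Formats each UTF-16 code unit of s as four hex digits.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%04X", (int) c));
        }
        return sb.toString();
    }

    // Case 1: decode the bytes as text (assumption: the bytes are UTF-8).
    static String viaBytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Case 2: cast each byte to char individually, as in the loop above.
    static String viaCharCast(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < chars.length; i++) {
            chars[i] = (char) bytes[i];
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        byte[] byteArray = {(byte) 0xC3, (byte) 0xA7, 0x61, 0x70};
        System.out.println(codeUnits(viaBytes(byteArray)));    // 00E7 0061 0070
        System.out.println(codeUnits(viaCharCast(byteArray))); // FFC3 FFA7 0061 0070
    }
}
```

So the first string contains three chars, while the second contains four, and its first two are U+FFC3 and U+FFA7 rather than anything resembling "ç" (U+00E7).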
What causes this discrepancy between the two outputs, and what is the reasoning behind this behavior?