I'm debugging a third-party gateway system which translates binary messages into an XML webservice. When it receives messages containing the special characters 0x80, 0x81, 0x82 and 0x83, they are not sent as XML correctly.
I've narrowed down the problem to where they convert byte[] to String and produced a simple example of what's going wrong. The special values all get translated to the same "unknown" character.
import java.util.Arrays;

public class Test {
    public static void main(String[] args) {
        test(0x80); test(0x81); test(0x82); test(0x83);
    }

    public static void test(int value) {
        // Decodes and re-encodes using the platform default charset
        String message = new String(new byte[]{(byte) value});
        System.out.println(value + " => " + message + " => " + Arrays.toString(message.getBytes()));
    }
}
Output
128 => � => [-17, -65, -67]
129 => � => [-17, -65, -67]
130 => � => [-17, -65, -67]
131 => � => [-17, -65, -67]
I'm wondering how this should be fixed. I've tried changing their code to use an explicit character set:
new String(bytes, Charset.forName("UTF-8"))
However, this results in the same problem. The values 0x80-0x83 don't seem to exist as valid XML entities.
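A minimal sketch of why the explicit UTF-8 charset changes nothing, assuming the platform default was already UTF-8 (the [-17, -65, -67] output, i.e. EF BF BD, suggests so). A lone byte in the range 0x80-0xBF is a UTF-8 continuation byte with no lead byte, so the String constructor silently substitutes U+FFFD. A strict CharsetDecoder reports the malformed input instead. The class and method names here are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecodeDemo {
    // Hypothetical helper: true if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // Thrown instead of substituting the replacement character
            return false;
        }
    }

    public static void main(String[] args) {
        for (int value = 0x80; value <= 0x83; value++) {
            byte[] raw = {(byte) value};
            // The String constructor replaces malformed input with U+FFFD
            String lenient = new String(raw, StandardCharsets.UTF_8);
            System.out.printf("0x%02X valid UTF-8? %b, lenient decode gives U+%04X%n",
                    value, isValidUtf8(raw), (int) lenient.charAt(0));
        }
    }
}
```

If this reasoning holds, the original byte value is already lost by the time the String exists, so nothing downstream of the conversion can recover it.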
I've found that the char[] constructor kind of works, but it produces the following output, which I'm not sure is correct:
new String(new char[]{(char) value}, 0, 1);
Output
128 => weird box character 0080 => [-62, -128]
129 => weird box character 0081 => [-62, -127]
130 => weird box character 0082 => [-62, -126]
131 => weird box character 0083 => [-62, -125]
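For comparison, a sketch of what the char cast appears to be doing. It assumes the incoming bytes are meant as plain single-byte values, which I can't confirm for this gateway's protocol. Decoding with ISO-8859-1 maps each byte to the code point with the same number, so it should give the same string as the cast. U+0080 to U+0083 are C1 control characters, which would explain the box glyphs, and [-62, -128] is simply the two-byte UTF-8 encoding (C2 80) of U+0080. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1Demo {
    // Hypothetical helper: each byte becomes the code point with the same value.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        for (int value = 0x80; value <= 0x83; value++) {
            String viaCharset = decodeLatin1(new byte[]{(byte) value});
            String viaCast = new String(new char[]{(char) value}, 0, 1);
            // Compare the two approaches and show the UTF-8 bytes of the result
            System.out.println(value + " => same string? " + viaCharset.equals(viaCast)
                    + " => " + Arrays.toString(viaCharset.getBytes(StandardCharsets.UTF_8)));
        }
    }
}
```

Whether U+0080 to U+0083 are what the receiving webservice actually expects is a separate question that this sketch doesn't answer.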