
I'm trying to use forge.util.decodeUtf8 in JavaScript, but I don't get the same result as in Java. Can you help me?

var a = forge.util.hexToBytes("037A4078C3AD65C38863226AC3BD64C2B5392C6CCB8646617075342B473079C3954FC2A553C3BE6D");
aDecoded = forge.util.decodeUtf8(a);
console.log(forge.util.bytesToHex(aDecoded));
>> 037a4078ed65c863226afd64b5392c6c2c646617075342b473079d54fa553fe6d

This is the result in Java:

byte[] a = hexToBytes("037A4078C3AD65C38863226AC3BD64C2B5392C6CCB8646617075342B473079C3954FC2A553C3BE6D");

String aDecoded = new String(a, Charset.forName("UTF-8"));
byte[] r = test.getBytes();

System.out.println(bytesToHex(r));
>> 037A4078ED65E863226AFD64B5392C6C8866617075342B673079F56FA573FE6D

The difference between the two results is here:

037a4078ed65c863226afd64b5392c6c**2c64**6617075342b473079d54fa553fe6d
037A4078ED65E863226AFD64B5392C6C**8866**6617075342B673079F56FA573FE6D

I don't understand why I get different results.

Mark Rotteveel
Adrien Leloir
  • In the Java statement `byte[] r = test.getBytes();` you are using the variable `test` which has not been declared in your Java code sample. – skomisa Feb 03 '20 at 06:48
  • You should not use `test.getBytes()`, always explicitly specify the character set, eg `test.getBytes(StandardCharsets.UTF_8)`, otherwise it might use a different character set than you expect. – Mark Rotteveel Feb 03 '20 at 10:08

1 Answer


The posted byte sequence is valid UTF-8 and can be decoded into a string, as can easily be verified using a UTF-8 table. Note, however, that in general an arbitrary byte sequence cannot be decoded as UTF-8.
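To check the validity claim without a table, here is a minimal sketch that assumes a runtime with the standard TextDecoder (a browser, or a Node.js build with ICU). The helper names hexToUint8Array and isValidUtf8 are made up for this illustration and are not part of Forge.

```javascript
// Strict decoder: with fatal set, decode() throws a TypeError on invalid UTF-8
// instead of silently inserting replacement characters.
const strictDecoder = new TextDecoder('utf-8', { fatal: true });

// Hypothetical helper: converts a hex string into a Uint8Array.
function hexToUint8Array(hex) {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(2 * i, 2 * i + 2), 16);
  }
  return bytes;
}

// Hypothetical helper: true if the bytes form valid UTF-8, false otherwise.
function isValidUtf8(bytes) {
  try {
    strictDecoder.decode(bytes);
    return true;
  } catch (e) {
    return false;
  }
}

// The byte sequence from the question.
const posted = hexToUint8Array(
  '037A4078C3AD65C38863226AC3BD64C2B5392C6CCB8646617075342B473079C3954FC2A553C3BE6D'
);
// Lead byte C3 followed by 28, which is not a continuation byte.
const invalid = hexToUint8Array('C328');

console.log(isValidUtf8(posted));  // true
console.log(isValidUtf8(invalid)); // false
```

The second input shows why the caveat matters: a two-byte lead such as C3 must be followed by a continuation byte in the range 80 to BF, so C3 28 cannot be decoded.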

The Forge code contains a bug that causes the wrong result. hexToBytes converts the hexadecimal string into a binary-encoded string of bytes, and decodeUtf8 decodes that into a UTF-8 string. For the reverse direction, the UTF-8 string must first be encoded back into a binary-encoded string of bytes with encodeUtf8 and only then converted into a hexadecimal string with bytesToHex. The posted Forge code omits the encodeUtf8 step. With

console.log(forge.util.bytesToHex(forge.util.encodeUtf8(aDecoded))); 

the correct result is displayed.
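To see the effect of the missing step without installing the library, here is a rough sketch using plain-JavaScript stand-ins. The functions below are assumptions written for this illustration: they mimic how Forge, as far as I understand it, works on "binary strings" (one character per byte), and they are not the library's own code.

```javascript
// Hypothetical stand-in for hexToBytes: hex string -> binary string.
function hexToBinaryString(hex) {
  let out = '';
  for (let i = 0; i < hex.length; i += 2) {
    out += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16));
  }
  return out;
}

// Hypothetical stand-in for bytesToHex: prints every character code in hex,
// padded to at least two digits. Codes above 0xFF yield three or more digits.
function binaryStringToHex(str) {
  let out = '';
  for (let i = 0; i < str.length; i++) {
    out += str.charCodeAt(i).toString(16).padStart(2, '0');
  }
  return out;
}

// Stand-ins for decodeUtf8 / encodeUtf8 between binary and ordinary strings.
function decodeUtf8(binary) {
  return decodeURIComponent(escape(binary));
}
function encodeUtf8(text) {
  return unescape(encodeURIComponent(text));
}

const hex =
  '037A4078C3AD65C38863226AC3BD64C2B5392C6CCB8646617075342B473079C3954FC2A553C3BE6D';
const decoded = decodeUtf8(hexToBinaryString(hex));

// Without encodeUtf8: code points are printed, not UTF-8 bytes.
const wrong = binaryStringToHex(decoded);
// With encodeUtf8: the original bytes come back.
const roundTrip = binaryStringToHex(encodeUtf8(decoded));

console.log(wrong);     // 037a4078ed65c863226afd64b5392c6c2c646617075342b473079d54fa553fe6d
console.log(roundTrip); // same as the input hex, in lowercase
```

The first line reproduces the output from the question. The bytes CB 86 decode to the single character U+02C6, whose code is printed as the three digits 2c6; that is why the wrong output has an odd number of hex digits.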

There are two minor issues in the Java code. First, the variable test isn't defined and must be replaced by the variable aDecoded. Second, the UTF-8 charset should be specified explicitly when calling getBytes, for example getBytes(StandardCharsets.UTF_8); otherwise the platform default charset is used, which is a possible source of error. Apart from that, the Java code seems to be correct. However, since the two methods hexToBytes and bytesToHex haven't been posted, it cannot be ruled out that the error lies there. If, for example, common implementations of hexToBytes and bytesToHex are used, the correct result is displayed.

Topaco