Here's the problem. When you do this:
new String(byteArray, "UTF-8")
you are saying to the runtime system this:
The byte array contains character data that has been encoded as UTF-8. Convert it into a sequence of Unicode code points¹ and give them to me as a Java String.
But the bytes in the byte array are clearly NOT a well-formed UTF-8 sequence, because you are getting stuff that looks like garbage.
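If you want to confirm that diagnosis, you can decode with a CharsetDecoder configured to report malformed input rather than quietly substituting replacement characters. A minimal sketch (the class and method names here are just for illustration):

import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    /** Returns true if the bytes form a well-formed UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8
                    .newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;   // a malformed byte sequence somewhere in the array
        }
    }
}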
So what is going on? Well, I think that there are two possibilities:
The bytes in the array could actually be characters in a different character encoding. It is clearly not ASCII data, because pure 7-bit ASCII is also well-formed UTF-8. But the bytes could be encoded in some other character encoding. (If we actually had the byte values, we might be able to make an educated guess as to which encoding was used; a rough way to do that is sketched below.)
The bytes in the array could actually be garbled. You say that they were obtained by decrypting AES encrypted data. But if you somehow got the decryption incorrect (e.g. you used the wrong key), then you would end up with garbled stuff.
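One low-tech way to make that educated guess is to decode the same bytes with a handful of plausible encodings and see which, if any, produces readable text. A rough sketch; the candidate list is purely an assumption about where the data might have come from, and windows-1252 is not guaranteed on every JRE (the others are standard charsets):

import java.nio.charset.Charset;

public class EncodingGuess {
    /** Prints the same bytes decoded with a few common encodings so you can eyeball them. */
    static void printCandidates(byte[] bytes) {
        // Adjust this list to match where the data plausibly originated.
        String[] candidates = { "UTF-8", "ISO-8859-1", "windows-1252", "UTF-16LE", "UTF-16BE" };
        for (String name : candidates) {
            System.out.println(name + " -> " + new String(bytes, Charset.forName(name)));
        }
    }
}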
Finally, the closest equivalent in Java to std::string str(byteArray,byteArray+len)
is this:
new String(byteArray, "LATIN-1")
This is because "ISO-8859-1" is Java's name for Latin-1, and each byte in a Latin-1 sequence is equal in value to the corresponding Unicode code point.
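A small sketch to illustrate why that works: decoding with ISO-8859-1 can never fail, and encoding the resulting String back yields the original bytes, byte for byte. The sample bytes here are arbitrary:

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    public static void main(String[] args) {
        byte[] original = { (byte) 0x48, (byte) 0xE9, (byte) 0x00, (byte) 0xFF };  // arbitrary bytes
        // Each byte 0x00..0xFF maps to the Unicode code point with the same value ...
        String asString = new String(original, StandardCharsets.ISO_8859_1);
        // ... so encoding back to ISO-8859-1 reproduces the original bytes exactly.
        byte[] roundTripped = asString.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(original, roundTripped));  // true
    }
}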
However, it is unclear whether that would actually work in your case. Certainly, it won't work if the bytes were garbled due to incorrect encryption or decryption, or due to corruption of the encrypted data in transmission.
¹ - actually, UTF-16 code units ... but that's another story.